Beam Dependency Check Report (2018-07-31)

2018-07-31 Thread Apache Jenkins Server
High Priority Dependency Updates Of Beam Python SDK:
Dependency Name | Current Version | Latest Version | Release Date Of the Current Used Version | Release Date Of The Latest Release
google-cloud-bigquery | 0.25.0 | 1.4.0 | 2017-06-26

Re: SQS source

2018-07-31 Thread Ismaël Mejía
Hi, we can try to speed up the review, but the 2.6.0 branch was already cut and was stabilizing for the last two weeks, so I am not sure it will make it. Next release should be cut shortly hopefully in 3-4 weeks to follow the 6 week release plan. Hope this can work for you. On Tue, Jul 31, 2018

Re: SQS source

2018-07-31 Thread John Rudolf Lewis
Understood. Thank you. On Tue, Jul 31, 2018 at 5:13 AM, Ismaël Mejía wrote: > Hi, we can try to speed up the review, but the 2.6.0 branch was > already cut and was stabilizing for the last two weeks, so I am not > sure it will make it. Next release should be cut shortly hopefully in > 3-4 weeks

Re: SQS source

2018-07-31 Thread Reuven Lax
Ismael, do you have time for this review? If you're too busy, I can try to help review it. John, unfortunately, as Ismael said, even if we speed up the review the 2.6.0 branch has already been cut, and we try and only cherry pick important bugfixes. Hopefully the next release will be soon, and

Re: pipeline with parquet and sql

2018-07-31 Thread Łukasz Gajowy
In terms of schema and ParquetIO source/sink, there was an answer in some previous thread: Currently (without introducing any change in ParquetIO) there is no way to not pass the avro schema. It will probably be replaced with Beam's schema in the future [1]

Re: pipeline with parquet and sql

2018-07-31 Thread Łukasz Gajowy
Sorry, I sent an unfinished message. In terms of schema and ParquetIO source/sink, there was an answer in some previous thread [1]. Currently (without introducing any change in ParquetIO) there is no way to not pass the avro schema. It will probably be replaced with Beam's schema in the future

Re: Cleanup resources on pipeline cancelation

2018-07-31 Thread Romain Manni-Bucau
Hi Andrew, IIRC sources should clean up their resources per method since they don't have a better lifecycle. Readers can create anything longer-lived and release it at close time. On Wed, Aug 1, 2018 at 00:31, Andrew Pilloud wrote: > Some of our IOs create external resources that need to be cleaned
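The reader lifecycle described in this reply can be illustrated with a minimal sketch. It is written in Python with hypothetical classes (FakeSubscription, SketchReader, run_reader) that are not part of Beam's API; it only shows the pattern of acquiring a longer-lived resource when the reader starts and releasing it when the reader is closed.

```python
class FakeSubscription:
    """Hypothetical stand-in for an external resource, e.g. a subscription."""

    def __init__(self, name):
        self.name = name
        self.open = True

    def delete(self):
        # Release the external resource.
        self.open = False


class SketchReader:
    """Hypothetical reader: acquires in start(), releases in close()."""

    def __init__(self, name):
        self._name = name
        self.subscription = None

    def start(self):
        # The resource lives as long as the reader does.
        self.subscription = FakeSubscription(self._name)

    def close(self):
        # Called once when the reader is done, even after a failure.
        if self.subscription is not None:
            self.subscription.delete()


def run_reader(reader):
    """Drive the reader and guarantee close() runs."""
    try:
        reader.start()
        # ... read records here ...
    finally:
        reader.close()
```

Whether close() is actually invoked when a pipeline is cancelled depends on the runner, which is the question the thread raises; the sketch does not answer it.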

Re: SQS source

2018-07-31 Thread Tim Robertson
I took a pass at reviewing (non committer). I haven't worked on unbounded IO so wasn't familiar enough with the timestamp and checkpointing but otherwise it LGTM in general - thanks John and for applying the minor suggestions. OT: Reuven, if you have time on your hands there is also the KuduIO

Re: SQS source

2018-07-31 Thread Reuven Lax
Looking at Tim's PR. On Tue, Jul 31, 2018 at 10:53 AM Ismaël Mejía wrote: > Reuven. I already started review and hope to finish later on today or > tomorrow at latest. If you can, it would be good to take a look at Tim's PR > that has been opened for longer time. > > On Tue, Jul 31, 2018, 6:36

Re: SQS source

2018-07-31 Thread Ismaël Mejía
Reuven. I already started review and hope to finish later on today or tomorrow at latest. If you can, it would be good to take a look at Tim's PR that has been opened for longer time. On Tue, Jul 31, 2018, 6:36 PM Tim Robertson wrote: > I took a pass at reviewing (non committer). I haven't

Removing documentation for old Beam versions

2018-07-31 Thread Udi Meiri
Hi all, I'm writing a PR for apache/beam-site and beam_PreCommit_Website_Stage is timing out after 100 minutes, because it's trying to delete 22k files and then copy 22k files (warning: large file). It seems that we

Build failed in Jenkins: beam_Release_Gradle_NightlySnapshot #125

2018-07-31 Thread Apache Jenkins Server
See Changes: [relax] Convert BeamSQL to use Schemas. [relax] Deprecate getRowCoder. [relax] Add setSchema to remaining Table objects. [relax] Delete a bunch of code that is no longer used.

Re: Proposal: keeping post-commit tests green

2018-07-31 Thread Mikhail Gryzykhin
Hi everyone, I've summarized things discussed in this design doc into a Beam site page: https://beam.apache.org/contribute/postcommits-policies/ Regards, --Mikhail On Thu, Jun 14, 2018 at 9:13 AM Mikhail Gryzykhin wrote: > It is one-time action.

Re: Cron schedule is failing on the Beam jenkins jobs

2018-07-31 Thread Yifan Zou
Thanks Udi. I discussed with Mikhail offline and found a problem in the setAutoJob parameters. PR#6112 will fix the problem. Thanks Mikhail for taking quick actions. On Tue, Jul 31, 2018 at 2:25 PM Udi Meiri wrote: > +Mikhail Gryzykhin is working

Cleanup resources on pipeline cancelation

2018-07-31 Thread Andrew Pilloud
Some of our IOs create external resources that need to be cleaned up when a pipeline is terminated. It looks like the org.apache.beam.sdk.io.UnboundedSource interface is called on creation, but there is no call for cleanup. For example, PubsubIO creates a Pubsub subscription in

[VOTE] Apache Beam, version 2.6.0, release candidate #1

2018-07-31 Thread Pablo Estrada
Hello everyone! I have been able to prepare a release candidate for Beam 2.6.0. :D Please review and vote on the release candidate #1 for the version 2.6.0, as follows: [ ] +1, Approve the release [ ] -1, Do not approve the release (please provide specific comments) The complete staged set of

Re: Parallelizing test runs

2018-07-31 Thread Reuven Lax
There was also a proposal to lump multiple tests into a single Dataflow job instead of spinning up a separate Dataflow job for each test. On Tue, Jul 31, 2018 at 4:26 PM Mikhail Gryzykhin wrote: > I synced with Rafael. Below is summary of discussion. > > This quota is

Build failed in Jenkins: beam_Release_Gradle_NightlySnapshot #126

2018-07-31 Thread Apache Jenkins Server
See -- [...truncated 17.16 MB...] Task ':beam-sdks-python-container:prepare' is not up-to-date because: Task has not declared any outputs despite executing actions.

Re: Parallelizing test runs

2018-07-31 Thread Mikhail Gryzykhin
I synced with Rafael. Below is a summary of the discussion. This quota is CreateRequestsPerMinutePerUser and it has a default limit of 60 requests per user. I've created Jira [BEAM-5053](https://issues.apache.org/jira/browse/BEAM-5053) for this. I see the following options we can utilize: 1. Add retry logic.
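As an illustration of the first option, here is a minimal sketch of retry logic with exponential backoff, in Python. The names QuotaExceededError, call_with_retry and create_job are hypothetical and chosen for this example only; the thread does not say how the retry would be implemented or which error the service returns when the quota is hit.

```python
import random
import time


class QuotaExceededError(Exception):
    """Hypothetical error raised when the create-request quota is hit."""


def call_with_retry(create_job, max_attempts=5, base_delay=1.0,
                    sleep=time.sleep):
    """Call create_job(), retrying on quota errors with exponential backoff.

    The sleep function is injectable so the wait can be skipped in tests.
    """
    for attempt in range(max_attempts):
        try:
            return create_job()
        except QuotaExceededError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            # Double the delay on each attempt and add jitter so that
            # parallel test runs do not all retry at the same moment.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
```

Since the quota is counted per minute, a base_delay of a few seconds would likely be needed in practice for the retries to span a quota window; that value is a guess, not something stated in the thread.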