Re: [PROPOSAL] Preparing 2.5.0 release next week

2018-06-04 Thread Jean-Baptiste Onofré
New failure on the build: FAILURE: Build failed with an exception. * What went wrong: Could not resolve all files for configuration ':beam-sdks-java-io-hadoop-file-system:testCompileClasspath'. > Could not find zookeeper-tests.jar (org.apache.zookeeper:zookeeper:3.4.6). Searched in the

Re: [PROPOSAL] Preparing 2.5.0 release next week

2018-06-04 Thread Jean-Baptiste Onofré
Hi, yes, it's a release blocker: the build is not fully stable. I have been trying to build the release for a week and it fails with different errors. I have a new build in progress. I hope it will be good. I'll keep you posted. Regards JB On 05/06/2018 01:38, Scott Wegner wrote: > Hey JB, you mentioned

Jenkins build is back to normal : beam_SeedJob #1882

2018-06-04 Thread Apache Jenkins Server
See

Build failed in Jenkins: beam_SeedJob #1881

2018-06-04 Thread Apache Jenkins Server
See -- GitHub pull request #5406 of commit fe004783708d765c14953797643582cda8818cec, no merge conflicts. Setting status of fe004783708d765c14953797643582cda8818cec to PENDING with url

Build failed in Jenkins: beam_SeedJob #1880

2018-06-04 Thread Apache Jenkins Server
See -- GitHub pull request #5406 of commit 4a23930d1d1d8e5461adb86528817f78fe8be377, no merge conflicts. Setting status of 4a23930d1d1d8e5461adb86528817f78fe8be377 to PENDING with url

Re: [PROPOSAL] Preparing 2.5.0 release next week

2018-06-04 Thread Scott Wegner
Hey JB, you mentioned some build issues on Slack [1]. Is this blocking the release? Let me know if there's anything I can help with. [1] https://the-asf.slack.com/archives/C9H0YNP3P/p1528133545000136 On Sun, Jun 3, 2018 at 10:58 PM Jean-Baptiste Onofré wrote: > Hi guys, > > just to let you

Jenkins build is unstable: beam_SeedJob #1879

2018-06-04 Thread Apache Jenkins Server
See

Build failed in Jenkins: beam_SeedJob #1878

2018-06-04 Thread Apache Jenkins Server
See -- GitHub pull request #5406 of commit d2a7ca03b5b1ff67bc0b7eb0f5e758a9f75076c0, no merge conflicts. Setting status of d2a7ca03b5b1ff67bc0b7eb0f5e758a9f75076c0 to PENDING with url

Re: Go SDK Example

2018-06-04 Thread James Wilson
Hi Kenn and Henning, Thank you both for helping me to get started. I’ll definitely reach out on dev as I work on this. Best, James > On Jun 4, 2018, at 12:03 PM, Henning Rohde wrote: > > Welcome James! > > Awesome that you're interested in contributing to Apache Beam! If you're >

Re: Portability and Timers

2018-06-04 Thread Lukasz Cwik
Fixed the permissions, feel free to comment on the doc. The specs on the ParDoPayload will stay, analogous to the SideInputPayload. The PCollection will not be modified and will continue to contain the windowing strategy and coder. On Mon, Jun 4, 2018 at 3:41 PM Kenneth Knowles wrote: > I like

Re: Beam breaks when it isn't loaded via the Thread Context Class Loader

2018-06-04 Thread Lukasz Cwik
I totally agree, but there are so many Java APIs (including ours) that messed this up so everyone lives with the same hack. On Mon, Jun 4, 2018 at 3:41 PM Andrew Pilloud wrote: > It seems like a terribly fragile way to pass arguments but my tests pass > when I wrap the JDBC path into Beam

Re: Beam breaks when it isn't loaded via the Thread Context Class Loader

2018-06-04 Thread Andrew Pilloud
It seems like a terribly fragile way to pass arguments but my tests pass when I wrap the JDBC path into Beam pipeline execution with that pattern. Thanks! Andrew On Mon, Jun 4, 2018 at 3:20 PM Lukasz Cwik wrote: > It is a common mistake for APIs to not include a way to specify which > class
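The pattern Andrew refers to is not visible in this excerpt. A minimal sketch of one common form of the workaround, assuming it means temporarily swapping the thread context class loader around a call and restoring it afterwards, could look like the following. The class name `ContextClassLoaders` and the method `callWith` are hypothetical helpers, not Beam APIs.

```java
import java.util.function.Supplier;

// Hypothetical helper (not part of Beam): runs an action with a specific
// thread context class loader and restores the previous one afterwards.
final class ContextClassLoaders {
  private ContextClassLoaders() {}

  static <T> T callWith(ClassLoader loader, Supplier<T> action) {
    Thread current = Thread.currentThread();
    ClassLoader previous = current.getContextClassLoader();
    current.setContextClassLoader(loader);
    try {
      // Code that looks up classes via the context class loader now sees `loader`.
      return action.get();
    } finally {
      // Always restore, even if the action throws.
      current.setContextClassLoader(previous);
    }
  }
}
```

A caller embedded in an isolated class loader would typically pass the loader that loaded its own classes, for example `ContextClassLoaders.callWith(MyDriver.class.getClassLoader(), () -> runQuery())`, where `MyDriver` and `runQuery` are placeholders.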

Re: Portability and Timers

2018-06-04 Thread Kenneth Knowles
I like it. Having the extra portability layer really opens up these possibilities that wouldn't make a usable API for a user, but are really helpful for modeling. I've only got View permissions to the doc, so commenting here. You mention that they are modeled as a PCollection, but it seems that

Re: Beam breaks when it isn't loaded via the Thread Context Class Loader

2018-06-04 Thread Lukasz Cwik
It is a common mistake for APIs to not include a way to specify which class loader to use when doing something like deserializing an instance of a class via the ObjectInputStream. This common issue also affects Apache Beam (SerializableCoder, PipelineOptionsFactory, ...) and the way that typical
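To illustrate the point about ObjectInputStream, here is a minimal sketch of a stream that accepts an explicit class loader by overriding `resolveClass`. It is not how Beam's SerializableCoder is implemented; the names `LoaderAwareStreams` and `LoaderAwareObjectInputStream` are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.UncheckedIOException;

// Hypothetical helper, for illustration only.
final class LoaderAwareStreams {
  private LoaderAwareStreams() {}

  /** ObjectInputStream that resolves classes against an explicitly supplied loader. */
  static final class LoaderAwareObjectInputStream extends ObjectInputStream {
    private final ClassLoader loader;

    LoaderAwareObjectInputStream(InputStream in, ClassLoader loader) throws IOException {
      super(in);
      this.loader = loader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
        throws IOException, ClassNotFoundException {
      try {
        // Look the class up in the caller-chosen loader first.
        return Class.forName(desc.getName(), false, loader);
      } catch (ClassNotFoundException e) {
        // Fall back to the default resolution (this also covers primitive types).
        return super.resolveClass(desc);
      }
    }
  }

  static byte[] serialize(Object value) {
    try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  static Object deserialize(byte[] data, ClassLoader loader) {
    try (ObjectInputStream in =
        new LoaderAwareObjectInputStream(new ByteArrayInputStream(data), loader)) {
      return in.readObject();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

The design choice here is to let the caller say which loader to use rather than relying on whichever loader the JVM picks by default, which is the gap the message describes.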

Re: [VOTE] Code Review Response-time SLO

2018-06-04 Thread Huygaa Batsaikhan
Proposal 1: +1 Proposal 2: +1 Additional Comments: This is an example vote On Mon, Jun 4, 2018 at 3:15 PM Huygaa Batsaikhan wrote: > A few months ago, Reuven sent out an email >

[VOTE] Code Review Response-time SLO

2018-06-04 Thread Huygaa Batsaikhan
A few months ago, Reuven sent out an email about improvements to Beam's code review process. Because the email covered multiple issues, we did not really dig deep into each of them. One of the

Re: [SQL] Unsupported features

2018-06-04 Thread Kai Jiang
Ismaël, I was running this naive code snippet. Yes, an IT would be interesting. The next step I was thinking of is automating the process and integrating it with Nexmark. Do you have any ideas about this? Currently, I ingested

Portability and Timers

2018-06-04 Thread Lukasz Cwik
I have been working on a proposal for adding support for timers to the Apache Beam portability APIs. The synopsis is to model timers as PCollections. This allows us to treat timers as just another type of data that is transmitted/received by a Runner during execution and leverage all the work

Beam breaks when it isn't loaded via the Thread Context Class Loader

2018-06-04 Thread Andrew Pilloud
I'm having class loading issues that go away when I revert the changes in our use of Class.forName added in https://github.com/apache/beam/pull/4674. The problem I'm having is that the typical JDBC GUI (SqlWorkbench/J, SQuirreL SQL) creates an isolated class loader to load our library. Things work

Re: [VOTE] Code Review Process

2018-06-04 Thread Griselda Cuevas
+1 On Mon, 4 Jun 2018 at 12:30, Robert Burke wrote: > +1 > > On Mon, Jun 4, 2018, 9:01 AM Raghu Angadi wrote: > >> +1 >> >> On Fri, Jun 1, 2018 at 10:25 AM Thomas Groh wrote: >> >>> As we seem to largely have consensus in "Reducing Committer Load for >>> Code Reviews"[1], this is a vote to

Re: [VOTE] Code Review Process

2018-06-04 Thread Robert Burke
+1 On Mon, Jun 4, 2018, 9:01 AM Raghu Angadi wrote: > +1 > > On Fri, Jun 1, 2018 at 10:25 AM Thomas Groh wrote: > >> As we seem to largely have consensus in "Reducing Committer Load for Code >> Reviews"[1], this is a vote to change the Beam policy on Code Reviews to >> require that >> >> (1)

Re: Beam SQL Improvements

2018-06-04 Thread Romain Manni-Bucau
This can create other issues with IO if the runner is not designed for it (like the direct runner), so it is probably not something reliable for the generic part of Beam :(. On Mon, Jun 4, 2018, 20:10, Lukasz Cwik wrote: > Shouldn't the runner isolate each instance of the pipeline behind an > appropriate class

Re: Multimap PCollectionViews' values updated rather than appended

2018-06-04 Thread Lukasz Cwik
Carlos, can you provide a test/code snippet for the bug that shows the issue? On Mon, Jun 4, 2018 at 11:57 AM Lukasz Cwik wrote: > +dev@beam.apache.org > Note that this is likely a bug in the DirectRunner for accumulation mode, > filed: https://issues.apache.org/jira/browse/BEAM-4470 > >

Jenkins build is back to normal : beam_SeedJob #1870

2018-06-04 Thread Apache Jenkins Server
See

Re: Multimap PCollectionViews' values updated rather than appended

2018-06-04 Thread Lukasz Cwik
+dev@beam.apache.org Note that this is likely a bug in the DirectRunner for accumulation mode, filed: https://issues.apache.org/jira/browse/BEAM-4470 Discarding mode is meant to always be the latest firing, the issue though is that you need to emit the entire map every time. If you can do this,

Build failed in Jenkins: beam_SeedJob #1869

2018-06-04 Thread Apache Jenkins Server
See -- GitHub pull request #5406 of commit be1be8d255d3c4b7eff09df0cbf7135b034b4ce6, no merge conflicts. Setting status of be1be8d255d3c4b7eff09df0cbf7135b034b4ce6 to PENDING with url

Re: Beam SQL Improvements

2018-06-04 Thread Lukasz Cwik
Shouldn't the runner isolate each instance of the pipeline behind an appropriate class loader? On Sun, Jun 3, 2018 at 12:45 PM Reuven Lax wrote: > Just an update: Romain and I chatted on Slack, and I think I understand > his concern. The concern wasn't specifically about schemas, rather about >

Jenkins build is back to normal : beam_SeedJob #1867

2018-06-04 Thread Apache Jenkins Server
See

Build failed in Jenkins: beam_SeedJob #1866

2018-06-04 Thread Apache Jenkins Server
See -- GitHub pull request #5406 of commit f4f753f9fe1195cf499f68339da9394eef8deb34, no merge conflicts. Setting status of f4f753f9fe1195cf499f68339da9394eef8deb34 to PENDING with url

Re: [ANNOUNCEMENT] New committers, May 2018 edition!

2018-06-04 Thread Mikhail Gryzykhin
Congratulations! --Mikhail On Fri, Jun 1, 2018 at 11:34 AM Huygaa Batsaikhan wrote: > Congrats! > > On Fri, Jun 1, 2018 at 10:26 AM Thomas Groh wrote: > >> Congrats, you three! >> >> On Thu, May 31, 2018 at 7:09 PM Davor Bonaci wrote: >> >>> Please join me and the rest of Beam PMC in

Re: [Proposal] Automation For Beam Dependency Check

2018-06-04 Thread Kenneth Knowles
This kind of leaking analysis that `mvn dependency:analyze` does is I think what is also called IWYU (Include What You Use). I looked around and there are some gradle plugins to do the same thing. I couldn't tell which was the most robust. Kenn On Mon, Jun 4, 2018 at 9:46 AM Chamikara Jayalath

Re: [Proposal] Automation For Beam Dependency Check

2018-06-04 Thread Chamikara Jayalath
On Mon, Jun 4, 2018 at 6:10 AM Ismaël Mejía wrote: > Is there a way to add to that weekly report the new dependencies that > were introduced in the week before, or that have changed? > I think it makes sense to add a recent changes section so that community is up to date and can discuss if

Re: Proposal: keeping post-commit tests green

2018-06-04 Thread Mikhail Gryzykhin
Hello everyone, I have addressed comments on the proposal doc and updated it accordingly. I have also added section on metrics that we want to track for pre-commit tests and contents for dashboard. Please, take a second look at the document. Highlights: * Sections that I feel require more

Re: Go SDK Example

2018-06-04 Thread Henning Rohde
Welcome James! Awesome that you're interested in contributing to Apache Beam! If you're specifically interested in the Go SDK, the task you identified is a good one to start with. I assigned it to you. I also added a few similar tasks listed below as alternatives. Feel free to pick the one you

Re: [VOTE] Code Review Process

2018-06-04 Thread Raghu Angadi
+1 On Fri, Jun 1, 2018 at 10:25 AM Thomas Groh wrote: > As we seem to largely have consensus in "Reducing Committer Load for Code > Reviews"[1], this is a vote to change the Beam policy on Code Reviews to > require that > > (1) At least one committer is involved with the code review, as either

Re: Some extensions to the DoFn API

2018-06-04 Thread Jean-Baptiste Onofré
Thanks ! I will work on this one then ;) Regards JB On 04/06/2018 16:55, Reuven Lax wrote: > I'll file a JIRA to track the idea. > > On Mon, Jun 4, 2018 at 5:52 PM Jean-Baptiste Onofré > wrote: > > Exactly, that's why something like @xpath or @json-path could be

Re: Some extensions to the DoFn API

2018-06-04 Thread Reuven Lax
I'll file a JIRA to track the idea. On Mon, Jun 4, 2018 at 5:52 PM Jean-Baptiste Onofré wrote: > Exactly, that's why something like @xpath or @json-path could be > interesting. > > Regards > JB > > On 04/06/2018 16:48, Reuven Lax wrote: > > Interesting. And given that Beam Schemas are recursive

Re: Some extensions to the DoFn API

2018-06-04 Thread Jean-Baptiste Onofré
Exactly, that's why something like @xpath or @json-path could be interesting. Regards JB On 04/06/2018 16:48, Reuven Lax wrote: > Interesting. And given that Beam Schemas are recursive (a row can > contain nested rows), we might actually need something like xpath if we > want to make this fully

Re: Some extensions to the DoFn API

2018-06-04 Thread Reuven Lax
Interesting. And given that Beam Schemas are recursive (a row can contain nested rows), we might actually need something like xpath if we want to make this fully general. Reuven On Mon, Jun 4, 2018 at 5:45 PM Jean-Baptiste Onofré wrote: > Yup, it makes sense, it's what I had in mind. > > In

Re: Some extensions to the DoFn API

2018-06-04 Thread Jean-Baptiste Onofré
Yup, it makes sense, it's what I had in mind. In Apache Camel, in a Processor (similar to a DoFn), we can also pass directly languages to the arguments. We can imagine something like: @ProcessElement void process(@json-path("foo") String foo) @ProcessElement void process(@xpath("//foo") String

Re: Some extensions to the DoFn API

2018-06-04 Thread Reuven Lax
In the schema branch I have already added some annotations for Schema. However in the future I think we could go even further and allow users to pick individual fields out of the row schema. e.g. the user might have a Schema with 100 fields, but only want to process userId and geo location. I
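As a rough illustration of the field-selection idea, the sketch below uses plain Java reflection to inject only the named fields of a row into an annotated method. Everything in it is hypothetical: `@Process` and `@Field` stand in for whatever annotations Beam might adopt, a `Map` stands in for a schema row, and `invoke` is not Beam's actual dispatch mechanism.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;
import java.util.Map;

// Hypothetical sketch; none of these names are Beam APIs.
final class FieldInjectionSketch {
  private FieldInjectionSketch() {}

  /** Marks the method to call for each row (stand-in for @ProcessElement). */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  @interface Process {}

  /** Names the row field to inject into a parameter. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.PARAMETER)
  @interface Field {
    String value();
  }

  /** Example user function that only cares about two fields of a wider row. */
  static final class GeoFn {
    String last;

    @Process
    public void process(@Field("userId") String userId, @Field("location") String location) {
      last = userId + "@" + location;
    }
  }

  /** Finds the @Process method and passes it only the fields it asked for. */
  static void invoke(Object fn, Map<String, Object> row) {
    for (Method method : fn.getClass().getDeclaredMethods()) {
      if (!method.isAnnotationPresent(Process.class)) {
        continue;
      }
      Parameter[] params = method.getParameters();
      Object[] args = new Object[params.length];
      for (int i = 0; i < params.length; i++) {
        Field field = params[i].getAnnotation(Field.class);
        if (field == null) {
          throw new IllegalArgumentException("parameter " + i + " has no @Field annotation");
        }
        args[i] = row.get(field.value());
      }
      try {
        method.setAccessible(true);
        method.invoke(fn, args);
      } catch (ReflectiveOperationException e) {
        throw new IllegalStateException(e);
      }
      return;
    }
    throw new IllegalArgumentException("no @Process method on " + fn.getClass());
  }
}
```

With a row holding `userId`, `location` and any number of other fields, `invoke(new GeoFn(), row)` passes only the two requested values; the remaining fields are never touched by the user code.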

Re: Some extensions to the DoFn API

2018-06-04 Thread Jean-Baptiste Onofré
Hi Reuven, That's a great improvement for users. I don't see an easy way to have annotations for side inputs/outputs. I think we can also plan some extension annotations for schemas, like @Element(schema = foo) in addition to the type. Thoughts? Regards JB On 04/06/2018 16:06, Reuven Lax wrote:

Some extensions to the DoFn API

2018-06-04 Thread Reuven Lax
Beam was created with an annotation-based processing API that allows the framework to automatically inject parameters into a DoFn's process method (and also allows the user to mark any method as the process method using @ProcessElement). However, these annotations were never completed. A specific

Re: [Proposal] Automation For Beam Dependency Check

2018-06-04 Thread Ismaël Mejía
Is there a way to add to that weekly report the new dependencies that were introduced in the week before, or that have changed? We are not addressing another important problem: Leaking of dependencies. I am not aware of the gradle equivalent of the maven dependency plugin that helps to determine

Re: [SQL] Unsupported features

2018-06-04 Thread Ismaël Mejía
This is super interesting, great work Kai! Just out of curiosity, how are you validating this? It would be really interesting to have this also as part of some kind of IT in the future. On Fri, Jun 1, 2018 at 7:43 PM Kai Jiang wrote: > Sounds a good idea! I will file the major problems later

Re: [VOTE] Code Review Process

2018-06-04 Thread Jean-Baptiste Onofré
+1 I think it's already pretty close to what we do, so, no brainer ;) Regards JB On 01/06/2018 19:25, Thomas Groh wrote: > As we seem to largely have consensus in "Reducing Committer Load for > Code Reviews"[1], this is a vote to change the Beam policy on Code > Reviews to require that > > (1)

Re: [VOTE] Use probot/stale to automatically manage stale pull requests

2018-06-04 Thread Jean-Baptiste Onofré
+1 Regards JB On 01/06/2018 18:21, Kenneth Knowles wrote: > Hi all, > > Following the discussion, please vote on the move to activate > probot/stale [3] to notify authors of stale PRs per current policy and > then close them after a 7 day grace period. > > For more details, see: > >  - our

Re: [VOTE] Code Review Process

2018-06-04 Thread Reuven Lax
+1 On Mon, Jun 4, 2018 at 11:40 AM Łukasz Gajowy wrote: > +1 > > 2018-06-04 9:12 GMT+02:00 Etienne Chauchot : > >> +1 >> As I was already applying this. >> >> On Saturday, June 2, 2018 at 11:24 +0300, Reuven Lax wrote: >> >> +1 >> >> I believe only some committers were aware of the old policy,

Re: [VOTE] Code Review Process

2018-06-04 Thread Łukasz Gajowy
+1 2018-06-04 9:12 GMT+02:00 Etienne Chauchot : > +1 > As I was already applying this. > > On Saturday, June 2, 2018 at 11:24 +0300, Reuven Lax wrote: > > +1 > > I believe only some committers were aware of the old policy, and others > were effectively doing this anyway. > > On Sat, Jun 2, 2018

Re: [VOTE] Use probot/stale to automatically manage stale pull requests

2018-06-04 Thread Alexey Romanenko
+1 > On 4 Jun 2018, at 10:03, Reuven Lax wrote: > > +1 > > On Mon, Jun 4, 2018, 10:11 AM Etienne Chauchot > wrote: > +1 > Etienne > On Friday, June 1, 2018 at 17:58 -0700, Udi Meiri wrote: >> +1 >> >> On Fri, Jun 1, 2018 at 4:27 PM Lukasz Cwik >

Re: [VOTE] Use probot/stale to automatically manage stale pull requests

2018-06-04 Thread Reuven Lax
+1 On Mon, Jun 4, 2018, 10:11 AM Etienne Chauchot wrote: > +1 > Etienne > On Friday, June 1, 2018 at 17:58 -0700, Udi Meiri wrote: > > +1 > > On Fri, Jun 1, 2018 at 4:27 PM Lukasz Cwik wrote: > > +1 > > On Fri, Jun 1, 2018 at 2:53 PM Thomas Weise wrote: > > +1 > > On Fri, Jun 1, 2018 at

Re: [VOTE] Code Review Process

2018-06-04 Thread Etienne Chauchot
+1 As I was already applying this. On Saturday, June 2, 2018 at 11:24 +0300, Reuven Lax wrote: > +1 > > I believe only some committers were aware of the old policy, and others were > effectively doing this anyway. > > On Sat, Jun 2, 2018 at 2:51 AM Scott Wegner wrote: > > +1 > > > > On Fri, Jun

Re: [VOTE] Use probot/stale to automatically manage stale pull requests

2018-06-04 Thread Etienne Chauchot
+1 Etienne On Friday, June 1, 2018 at 17:58 -0700, Udi Meiri wrote: > +1 > > On Fri, Jun 1, 2018 at 4:27 PM Lukasz Cwik wrote: > > +1 > > > > On Fri, Jun 1, 2018 at 2:53 PM Thomas Weise wrote: > > > +1 > > > > > > On Fri, Jun 1, 2018 at 2:17 PM, Robert Bradshaw > > > wrote: > > > > +1 > >