Re: [DISCUSS] Move away from Apache Maven as build tool

2017-10-30 Thread Jean-Baptiste Onofré
-0 for the following reasons: - Maven is an Apache project and we can have support/improvement - I don't see how another build tool would speed up the build by itself - the Apache default release process is based on Maven. On the other hand, Gradle could be interesting. Anyway it's something

Re: [VOTE] Release 2.2.0, release candidate #1

2017-10-30 Thread Jean-Baptiste Onofré
Great. Thanks for the update. Regards JB On Oct 31, 2017, 01:04, at 01:04, Eugene Kirpichov wrote: >Only 2 issues remaining; Valentyn will ping this thread when they are >resolved. > >On Mon, Oct 30, 2017 at 1:08 PM Eugene Kirpichov >wrote:

Re: [VOTE] Release 2.2.0, release candidate #1

2017-10-30 Thread Eugene Kirpichov
Only 2 issues remaining; Valentyn will ping this thread when they are resolved. On Mon, Oct 30, 2017 at 1:08 PM Eugene Kirpichov wrote: > What are the next steps here: are we waiting for the resolution of these > remaining 5 issues? Are they all truly release blockers or

Re: The trouble with DisplayData tests

2017-10-30 Thread Eugene Kirpichov
On Fri, Oct 27, 2017 at 12:40 AM Kenneth Knowles wrote: > I see where you are coming from. It is truly a marginal feature of Beam at > the moment, but really *really* useful in debugging, when a runner takes > advantage of it. More inline - FWIW it may seem like I'm

Re: [VOTE] Release 2.2.0, release candidate #1

2017-10-30 Thread Eugene Kirpichov
What are the next steps here: are we waiting for the resolution of these remaining 5 issues? Are they all truly release blockers or can some be postponed to 2.3? On Fri, Oct 27, 2017 at 9:11 AM Eugene Kirpichov wrote: > FYI: list of remaining issues targeting 2.2.0 >

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-10-30 Thread Chamikara Jayalath
+1 for exploring other build tools to improve the build time and for better cross language support. But, as others mentioned, we should continue to support Maven based builds for some time till things are fully migrated. For example, we have updated PerfKitBenchmarker to execute Beam jobs through

Re: Is upgrading to Kafka Client 0.10.2.0+ in the roadmap?

2017-10-30 Thread Raghu Angadi
> > Thanks a lot for the information. I am using Beam-2.0. > https://github.com/apache/beam/blob/release-2.0.0/sdks/java/io/kafka/pom.xml#L33 I think we should move the kafka-clients dependency in KafkaIO to provided scope to avoid potential confusion like this. On Mon, Oct 30, 2017 at 11:10 AM,

Re: Is upgrading to Kafka Client 0.10.2.0+ in the roadmap?

2017-10-30 Thread Mingmin Xu
Thanks for the feedback, glad to know that it works now. Mingmin On Mon, Oct 30, 2017 at 11:10 AM, Shen Li wrote: > Dear All, > > Thanks a lot for the information. I am using Beam-2.0. > https://github.com/apache/beam/blob/release-2.0.0/sdks/java/io/kafka/pom.xml#L33 >

Re: Is upgrading to Kafka Client 0.10.2.0+ in the roadmap?

2017-10-30 Thread Shen Li
Dear All, Thanks a lot for the information. I am using Beam-2.0. https://github.com/apache/beam/blob/release-2.0.0/sdks/java/io/kafka/pom.xml#L33 I have just verified that adding Kafka-Client 0.11 in the application pom.xml works fine for me. I can now avoid the JAAS configuration file by using

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-10-30 Thread Reuven Lax
One more comment: regardless of which build tool we use for development, we still need to publish Maven artifacts to support Maven users of Beam. On Mon, Oct 30, 2017 at 10:46 AM, Ted Yu wrote: > I agree with Ben's comment. > > Recently I have been using gradle in another

Re: Is upgrading to Kafka Client 0.10.2.0+ in the roadmap?

2017-10-30 Thread Mingmin Xu
Hi Shen, Can you share which Beam version you are using? Per the master code, the default version for Kafka is `0.11.0.1`. I cannot recall the usage for old versions; my application (2.2.0-SNAPSHOT) works with a customized Kafka version based on 0.10.00-SASL. What you need to do is 1). exclude

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-10-30 Thread Reuven Lax
I wonder if we could prototype both Bazel and Gradle to do a better comparison (and also to compare the results with our current Maven build). On Mon, Oct 30, 2017 at 10:32 AM, Ben Chambers wrote: > I think both Gradle and Bazel are worth exploring. Gradle is

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-10-30 Thread Ben Chambers
I think both Gradle and Bazel are worth exploring. Gradle is definitely more common in the wild, but Bazel may be a better fit for the large mixture of languages being developed in one codebase within Beam. It might be helpful for us to list what functionality we want from such a tool, and then

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-10-30 Thread Henning Rohde
+1 to the initiative. It would be great to have better support for Go and Docker container images. The current Go Maven integration in particular is clunky [1], but I'll have to look into the details of the alternatives to see if they are better. Henning [1]

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-10-30 Thread Kenneth Knowles
I also support exploring a move away from Apache Maven for orchestrating our build. For a single-module project, I still think it can be a good build tool, and we could still use it for this sort of thing, but I think we are reaching a multi-module scale where it does not work well. Almost all of

Re: Is upgrading to Kafka Client 0.10.2.0+ in the roadmap?

2017-10-30 Thread Raghu Angadi
> https://issues.apache.org/jira/browse/BEAM-307 This should be closed. On Mon, Oct 30, 2017 at 9:00 AM, Lukasz Cwik wrote: > There has been some discussion about getting Kafka 0.10.x working on > BEAM-307[1]. > > As an immediate way to unblock yourself, modify your

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-10-30 Thread Reuven Lax
If we're looking at other build options, I think we should also consider Bazel. Some advantages I see of Bazel over Gradle: Gradle is reportedly much slower than Bazel. I haven't measured this myself, but there are many user reports of slow builds, memory leaks, etc. Bazel's build language is

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-10-30 Thread Eugene Kirpichov
This seems like a very good idea. With the effectively complete lack of incremental builds in Maven, it's frustrating to routinely spend several minutes re-verifying a PR after fixing a checkstyle warning in an extension module. Another non-Apache alternative could be Bazel, which is even faster

Re: Is upgrading to Kafka Client 0.10.2.0+ in the roadmap?

2017-10-30 Thread Raghu Angadi
Shen, KafkaIO works with all the versions since 0.9. Just include the kafka-clients version you like in your Maven dependencies along with the Beam dependencies. Glad to hear Kafka 0.10.2 made it simpler to provide this config. On Mon, Oct 30, 2017 at 8:14 AM, Shen Li wrote: >
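For readers following this thread: Kafka clients from 0.10.2 onward accept the JAAS entry inline through the `sasl.jaas.config` client property, which avoids pointing `-Djava.security.auth.login.config` at a file on every worker. The snippet below is only a rough sketch of what such a property set could look like. The helper names, the SASL/PLAIN mechanism, and the credentials are assumptions made for illustration; the thread does not say which mechanism or settings were actually used, nor how the properties were handed to KafkaIO.

```python
def jaas_config(login_module, options, control_flag="required"):
    """Render a JAAS entry in the one-line form expected by `sasl.jaas.config`."""
    rendered = " ".join(f'{key}="{value}"' for key, value in options.items())
    return f"{login_module} {control_flag} {rendered};"


def secure_consumer_properties(username, password):
    """Build Kafka consumer properties carrying the JAAS entry inline.

    Hypothetical helper: SASL/PLAIN over TLS is an assumed setup, not one
    confirmed anywhere in the thread.
    """
    return {
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        "sasl.jaas.config": jaas_config(
            "org.apache.kafka.common.security.plain.PlainLoginModule",
            {"username": username, "password": password},
        ),
    }
```

Because the entry travels with the consumer properties rather than as a JVM system property, it would reach the workers together with the rest of the pipeline configuration, assuming the properties are passed through to the Kafka consumer unchanged.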

Is upgrading to Kafka Client 0.10.2.0+ in the roadmap?

2017-10-30 Thread Shen Li
Hi, To use KafkaIO in secure mode, I need to set -Djava.security.auth.login.config to point to a JAAS configuration file. It works fine for local execution. But how can I configure the "java.security.auth.login.config" property in the Beam app when the pipeline is submitted to a

Re: read source - MongoDbIO.read()

2017-10-30 Thread Jean-Baptiste Onofré
ReadAll is implemented in the JDBC and Redis IOs. I will add it to other IOs. Regards JB On Oct 30, 2017, 13:39, at 13:39, Chaim Turkel wrote: >can you send me a link to the code? > >On Mon, Oct 30, 2017 at 2:21 PM, Jean-Baptiste Onofré >wrote: >> That's the

Re: read source - MongoDbIO.read()

2017-10-30 Thread Chaim Turkel
can you send me a link to the code? On Mon, Oct 30, 2017 at 2:21 PM, Jean-Baptiste Onofré wrote: > That's the evolution I'm proposing and I already implemented in some IO: > readAll pattern. Let me check for mongo. > > On Oct 30, 2017, 12:00, at 12:00, Chaim Turkel

Re: read source - MongoDbIO.read()

2017-10-30 Thread Chaim Turkel
thanks, and bigquery On Mon, Oct 30, 2017 at 2:21 PM, Jean-Baptiste Onofré wrote: > That's the evolution I'm proposing and I already implemented in some IO: > readAll pattern. Let me check for mongo. > > On Oct 30, 2017, 12:00, at 12:00, Chaim Turkel wrote:

Re: read source - MongoDbIO.read()

2017-10-30 Thread Jean-Baptiste Onofré
That's the evolution I'm proposing and I already implemented in some IO: readAll pattern. Let me check for mongo. On Oct 30, 2017, 12:00, at 12:00, Chaim Turkel wrote: >I am syncing multiple tables from mongo to bigquery. >So i first check how many records there are, and then

Re: read source - MongoDbIO.read()

2017-10-30 Thread Chaim Turkel
I am syncing multiple tables from mongo to bigquery. So I first check how many records there are, and then if there are records I need to sync them; otherwise I need to update the status table to say that there was nothing to sync. Also, in the case that I do sync, I need to update the status table with
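The control flow described in this message can be sketched roughly as follows. This is plain Python rather than Beam code, and every name in it (`StatusTable`, `sync_table`, the status strings, the `synced` field) is hypothetical. In particular, what gets written to the status table after a successful sync is cut off in the excerpt, so recording the copied row count is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class StatusTable:
    """In-memory stand-in for the status table mentioned in the thread."""
    rows: dict = field(default_factory=dict)

    def update(self, table, status, synced=0):
        self.rows[table] = {"status": status, "synced": synced}


def sync_table(table, count_records, copy_records, status):
    """Count first, then either copy the records or note that nothing was pending."""
    pending = count_records(table)
    if pending == 0:
        status.update(table, "nothing_to_sync")
        return 0
    copied = copy_records(table)
    # Assumed payload: the excerpt does not say what is stored after a sync.
    status.update(table, "synced", synced=copied)
    return copied
```

Calling `sync_table` once per source table keeps the check and the branch together, which is the part the poster wants to move out of the launcher and into the pipeline itself.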

Re: read source - MongoDbIO.read()

2017-10-30 Thread Jean-Baptiste Onofré
Can you describe your use case? We could imagine being able to define a custom FN in the read. But I'm afraid it would be too specific. On Oct 30, 2017, 10:46, at 10:46, Chaim Turkel wrote: >any reason for this, there should be a way to run it from any point > >On Mon, Oct 30,

Jenkins build is back to normal : beam_SeedJob_Standalone #49

2017-10-30 Thread Apache Jenkins Server
See

Re: read source - MongoDbIO.read()

2017-10-30 Thread Chaim Turkel
any reason for this, there should be a way to run it from any point On Mon, Oct 30, 2017 at 11:24 AM, Jean-Baptiste Onofré wrote: > Hi > > No the pipeline starts with the read. You can always create your own custom > read. > > Regards > JB > > On Oct 30, 2017, 09:33, at

Re: read source - MongoDbIO.read()

2017-10-30 Thread Jean-Baptiste Onofré
Hi, No, the pipeline starts with the read. You can always create your own custom read. Regards JB On Oct 30, 2017, 09:33, at 09:33, Chaim Turkel wrote: >Hi, > Is there a way to have some code run before the read? >I would like to check before how many records exist and

pipeline distribution

2017-10-30 Thread Chaim Turkel
Hi, I have a pipeline that has more than 20 collections. It seems that Dataflow cannot deploy this pipeline. I see that from the code I can create more than one pipeline. Does anyone know what the limit is? Also, if I split it, is there a recommended way as to how (the collections have different

read source - MongoDbIO.read()

2017-10-30 Thread Chaim Turkel
Hi, Is there a way to have some code run before the read? I would like to check beforehand how many records exist and, based on this, have two different pipelines. Currently this code is in the runner, but since I have 20 tables this takes a long time. I would like to move the check into the pipeline