Re: [RESULT][VOTE] Migrate to gitbox

2017-11-28 Thread Jean-Baptiste Onofré
FYI, while waiting for the discussion to move forward, I disabled the notifications on the dev@ mailing list (to avoid the spam ;)). Regards JB On 11/24/2017 04:58 PM, Kenneth Knowles wrote: +1 for new mailing list (reviews@) On Fri, Nov 24, 2017 at 5:20 AM, James wrote: +1 for

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-11-28 Thread Romain Manni-Bucau
Guys, I just realized that a lot of modules have threadCount=4 or so in the surefire/failsafe config. That makes it impossible to adapt the parallelism to the machine, so the parallelism ends up ill-suited and useless. Can it be a variable at least? -T1C (or -T2C) should allow to be smoother and
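
To illustrate the difference between a hardcoded threadCount=4 and a per-core setting such as -T1C or -T2C, here is a minimal sketch of how such a value could be resolved against the machine it runs on. The function name maven_thread_count is hypothetical, and truncating fractional results to an integer is an assumption of this sketch, not something stated in the thread.

```python
import os
from typing import Optional


def maven_thread_count(spec: str, cores: Optional[int] = None) -> int:
    """Resolve a Maven-style -T value ("4", "1C", "2C") to a thread count.

    A plain number is a fixed thread count; a trailing "C" means
    "this many threads per available core".
    """
    if cores is None:
        # Fall back to 1 if the core count cannot be determined.
        cores = os.cpu_count() or 1
    spec = spec.strip()
    if spec.upper().endswith("C"):
        # Per-core multiplier: scales with the machine.
        threads = float(spec[:-1]) * cores
    else:
        # Fixed count: identical on every machine.
        threads = float(spec)
    # Assumption: truncate fractions and never go below one thread.
    return max(1, int(threads))
```

With a fixed value, a 2-core laptop and a 32-core build server both get 4 threads; with "1C" they get 2 and 32 respectively, which is the adaptability being asked for.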

[GitHub] holdenk commented on issue #4183: [BEAM-3143] Type Inference Python 3 Compatibility

2017-11-28 Thread GitBox
holdenk commented on issue #4183: [BEAM-3143] Type Inference Python 3 Compatibility URL: https://github.com/apache/beam/pull/4183#issuecomment-347465959 Is this based on https://github.com/apache/beam/pull/4079 ? This is an

Re: [DISCUSS] Thinking about Beam 3.x roadmap and release schedule

2017-11-28 Thread Ben Chambers
Strong +1 to both increasing the frequency of minor releases and also putting together a road map for the next major release or two. I think it would be great to communicate to the community the direction Beam is taking in the future -- what things will users be able to do with 3.0 or 4.0 that

Re: [DISCUSS] Thinking about Beam 3.x roadmap and release schedule

2017-11-28 Thread Jean-Baptiste Onofré
+1 for monthly release if we can sustain this pace ;) Fully agree to improve the test, automation, documentation of the release process. On 11/28/2017 06:25 PM, Kenneth Knowles wrote: Yea, let's work hard on improving the ease and pace of releases. I am not really happy to have only quarterly

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-11-28 Thread Kenneth Knowles
Yea, I think voting is the next step. Luke - I think you are obviously the right person to set up the email of what exactly we are voting on, since you've driven this improvement. On Tue, Nov 28, 2017 at 12:08 AM, Robert Bradshaw wrote: > It's great to see all the

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-11-28 Thread Scott Wegner
To add one more data point measuring general adoption of gradle vs. maven, we can look at Stackoverflow trends comparing the two tags [1]. This shows the percentage of new SO questions in a given month by tag. 'gradle' represents ~0.25% of questions, while maven is ~0.45%. So, maven is more

[DISCUSS] Updating contribution guide for gitbox

2017-11-28 Thread Kenneth Knowles
Hi all, James brought up a great question in Slack, which was how should we use the merge button, illustrated [1] I want to broaden the discussion to talk about all the new capabilities: 1. Whether & how to use the "reviewer" field 2. Whether & how to use the "assignee" field 3. Whether & how

Re: [VOTE] Use Gradle for Apache Beam developmental processes

2017-11-28 Thread Valentyn Tymofieiev
+1 I support the process change On Tue, Nov 28, 2017 at 9:56 AM, Kenneth Knowles wrote: > +1 (binding) > > On Tue, Nov 28, 2017 at 9:55 AM, Lukasz Cwik wrote: > >> This is a procedural vote for migrating to use Gradle for all our >> development related

Re: [DISCUSS] Thinking about Beam 3.x roadmap and release schedule

2017-11-28 Thread Reuven Lax
On Tue, Nov 28, 2017 at 8:55 AM, Jean-Baptiste Onofré wrote: > Hi guys, > > Even if there's no rush, I think it would be great for the community to > have a better view on our roadmap and where we are going in terms of > schedule. > > I would like to discuss the following: > -

Re: [DISCUSS] Thinking about Beam 3.x roadmap and release schedule

2017-11-28 Thread Reuven Lax
On Tue, Nov 28, 2017 at 9:14 AM, Jean-Baptiste Onofré wrote: > Hi Reuven, > > Yes, I remember that we agreed on a release per month. However, we didn't > do it before. I think the most important thing is not the period, it's more a > stable pace. I think it's more interesting for

Re: [DISCUSS] Updating contribution guide for gitbox

2017-11-28 Thread Thomas Groh
I am strongly in favor of (1); I have no strong feelings about (2); I agree on (3), but generically am not hugely concerned, so long as back-references to the original PR are maintained, which is where most of the context lives. It is nice to have the change broken up into as many individually

[VOTE] Use Gradle for Apache Beam developmental processes

2017-11-28 Thread Lukasz Cwik
This is a procedural vote for migrating to use Gradle for all our development related processes (building, testing, and releasing). A majority vote will signal that: * Gradle build files will be supported and maintained alongside any remaining Maven files. * Once Gradle is able to replace Maven in

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-11-28 Thread Lukasz Cwik
Romain, Gradle has a Nexus plugin[1] which can sign and publish artifacts. Gradle also has excellent support to run Ant tasks since Ant can perform the entire release process for an ASF project. 1: https://github.com/bmuschko/gradle-nexus-plugin On Tue, Nov 28, 2017 at 9:45 AM, Scott Wegner

MergeBot bug when regenerating website?

2017-11-28 Thread Etienne Chauchot
Hi guys, I've just noticed a probable bug in MergeBot in the website static content regeneration. Mergebot seems to regenerate the website incorrectly when a page has moved. For example see mergebot commit 446586c68c1d244d240fe18ee48e69aba4462949 The page documentation/sdk/nexmark/index.html (old

Re: [DISCUSS] Updating contribution guide for gitbox

2017-11-28 Thread Jean-Baptiste Onofré
Hi, In other Apache projects using gitbox, I have been using the following workflow: 1. use the review button to assign someone 2. once changes are approved, I use the merge button (supporting squash and merge) It's very convenient and works fine. So, +1 to (b) Regards JB On 11/28/2017 06:45 PM,

Re: [VOTE] Use Gradle for Apache Beam developmental processes

2017-11-28 Thread Kenneth Knowles
+1 (binding) On Tue, Nov 28, 2017 at 9:55 AM, Lukasz Cwik wrote: > This is a procedural vote for migrating to use Gradle for all our > development related processes (building, testing, and releasing). A > majority vote will signal that: > * Gradle build files will be supported

Re: [DISCUSS] Updating contribution guide for gitbox

2017-11-28 Thread Kenneth Knowles
On Tue, Nov 28, 2017 at 9:51 AM, Ben Chambers wrote: > One risk to "squash and merge" is that it may lead to commits that don't > have clean descriptions -- for instance, commits like "Fixing review > comments" will show up. If we use (a) these would also show up as

Re: [VOTE] Use Gradle for Apache Beam developmental processes

2017-11-28 Thread Thomas Groh
+1 On Tue, Nov 28, 2017 at 10:04 AM, Valentyn Tymofieiev wrote: > +1 I support the process change > > > On Tue, Nov 28, 2017 at 9:56 AM, Kenneth Knowles wrote: > >> +1 (binding) >> >> On Tue, Nov 28, 2017 at 9:55 AM, Lukasz Cwik wrote:

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-11-28 Thread Robert Bradshaw
I also did an apache github query select count(*) as apache_projects, sum(uses_maven=true) as uses_maven, sum(uses_gradle=true) as uses_gradle from ( select repo_name, max(path contains 'pom.xml') as uses_maven, max(path contains 'gradle') as uses_gradle from

Re: [DISCUSS] Thinking about Beam 3.x roadmap and release schedule

2017-11-28 Thread Lukasz Cwik
I would suggest that for 3.x we target portability so that more runners can execute an Apache Beam python pipeline. We should start targeting JIRAs which we know are backwards incompatible as well since we know there are rough corners around some APIs. On Tue, Nov 28, 2017 at 9:48 AM, Reuven

Re: [VOTE] Use Gradle for Apache Beam developmental processes

2017-11-28 Thread Jason Kuster
+1 From the perspective of Beam's infrastructure, I've found that Gradle provides us a good amount more flexibility to do the kinds of builds we want. Additionally, the shorter run times (while not the only factor here) will allow us to stretch our finite executor resources further, leading to

[DISCUSS] Thinking about Beam 3.x roadmap and release schedule

2017-11-28 Thread Jean-Baptiste Onofré
Hi guys, Even if there's no rush, I think it would be great for the community to have a better view on our roadmap and where we are going in terms of schedule. I would like to discuss the following: - a best effort to maintain a good release pace or at least provide a rough schedule. For

Re: [DISCUSS] Thinking about Beam 3.x roadmap and release schedule

2017-11-28 Thread Jean-Baptiste Onofré
Hi Reuven, Yes, I remember that we agreed on a release per month. However, we didn't do it before. I think the most important thing is not the period, it's more a stable pace. I think it's more interesting for our community to have "always" a release every two months, rather than an attempt at a

Re: [DISCUSS] Thinking about Beam 3.x roadmap and release schedule

2017-11-28 Thread Kenneth Knowles
Yea, let's work hard on improving the ease and pace of releases. I am not really happy to have only quarterly releases. Automation of release process where possible, better test coverage, a higher resistance to cherry-picks. Kenn On Tue, Nov 28, 2017 at 9:14 AM, Jean-Baptiste Onofré

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-11-28 Thread Romain Manni-Bucau
Did you try a release (you can create a temporary staging repo on ASF nexus if it helps) before starting a vote? Because if you migrate and the project is no longer able to release, it can be a nasty blocker - which never happens when needed ;). Release has a few more plugins I didn't find in gradle (can

Re: [DISCUSS] Updating contribution guide for gitbox

2017-11-28 Thread Ben Chambers
One risk to "squash and merge" is that it may lead to commits that don't have clean descriptions -- for instance, commits like "Fixing review comments" will show up. If we use (a) these would also show up as separate commits. It seems like there are two cases of multiple commits in a PR: 1.

Re: [VOTE] Use Gradle for Apache Beam developmental processes

2017-11-28 Thread Wesley Tanaka
+1 On 11/28/2017 07:55 AM, Lukasz Cwik wrote: This is a procedural vote for migrating to use Gradle for all our development related processes (building, testing, and releasing). A majority vote will signal that: * Gradle build files will be supported and maintained alongside any remaining

Re: [DISCUSS] Updating contribution guide for gitbox

2017-11-28 Thread Lukasz Cwik
Is it possible for mergebot to auto squash any fixup! and perform the merge commit as described in (a), if so then I would vote for mergebot. Without mergebot, I vote: (a) 0 I like squashing fixup! (b) -1 (c) +1 Most of our PRs are for focused singular changes which is why I would rather squash
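
The idea of auto-squashing fixup! commits before merging can be sketched as follows. This is a simplified model that works on commit subject lines only and matches on exact subjects; it is not git's actual matching logic and not anything mergebot is known to implement. The function name autosquash and the sample subjects are hypothetical.

```python
def autosquash(subjects):
    """Fold 'fixup! <subject>' commits into the earlier commit they target.

    Takes commit subjects in history order and returns a list of
    (subject, folded_count) pairs for the commits that remain.
    """
    prefix = "fixup! "
    kept = []  # each entry is [subject, number_of_fixups_folded_in]
    for subject in subjects:
        if subject.startswith(prefix):
            target = subject[len(prefix):]
            # Look for the most recent earlier commit with that subject.
            for entry in reversed(kept):
                if entry[0] == target:
                    entry[1] += 1
                    break
            else:
                # No matching commit: keep the fixup commit untouched.
                kept.append([subject, 0])
        else:
            kept.append([subject, 0])
    return [(subject, count) for subject, count in kept]


if __name__ == "__main__":
    history = [
        "Add foo transform",
        "fixup! Add foo transform",
        "Add tests for foo",
        "fixup! Add foo transform",
    ]
    for subject, count in autosquash(history):
        print(f"{subject} (+{count} fixups)")
```

Under this model, meaningful commits arranged by the author survive as separate commits, while review-round fixups disappear into the commits they amend.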

Re: [discuss] java profile

2017-11-28 Thread Lukasz Cwik
It's been well shown that a build system that uses input/output set change detection can correctly implement incremental builds. Build systems are not tied to knowing the internal details of how Java compiles things. Knowing that there are some inputs, a process, and some outputs is enough to know
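
A minimal sketch of input/output change detection, assuming file contents are the only inputs that matter: the task is treated as a black box, and it is skipped when the fingerprint of its inputs is unchanged and all of its outputs still exist. The names fingerprint and run_task and the in-memory cache dict are hypothetical; this is not how Gradle or any particular build tool is implemented.

```python
import hashlib
from pathlib import Path


def fingerprint(paths):
    """Hash the names and contents of the input files into one digest."""
    digest = hashlib.sha256()
    for path in sorted(str(p) for p in paths):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        digest.update(Path(path).read_bytes())
    return digest.hexdigest()


def run_task(inputs, outputs, cache, action):
    """Run action() only if inputs changed or an output is missing.

    cache is a dict that persists between calls (a real tool would
    store it on disk). Returns True if the action ran, else False.
    """
    key = fingerprint(inputs)
    outputs_present = all(Path(o).exists() for o in outputs)
    if cache.get("key") == key and outputs_present:
        return False  # up to date: nothing to do
    action()
    cache["key"] = key
    return True
```

Note that nothing here knows what the action does internally, which is the point being made: declared inputs, a process, and declared outputs are sufficient.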

Re: SerializableCoder Structured Value

2017-11-28 Thread Lukasz Cwik
I think that at least we should be clear in the documentation for SerializableCoder and also make sure that the DirectRunner validates the consistentWithEquals property. Optionally one of: 1) Make a version of SerializableCoder that can be constructed where it says it is consistentWithEquals and

Re: [DISCUSS] Updating contribution guide for gitbox

2017-11-28 Thread Raghu Angadi
-1 for (a): no need to see all the private branch commits from contributor. It often makes me more conscious of local commits. +1 for (b): with committer replacing the squashed commit messages with '[BEAM-jira or PRID]: Brief cut-n-paste (or longer if the contributor provided one)'. -1 for (c):

Re: [discuss] java profile

2017-11-28 Thread Kenneth Knowles
I seem to remember a tool called `make` that was pretty good at this. On Tue, Nov 28, 2017 at 10:47 AM, Lukasz Cwik wrote: > It's been well shown that a build system that uses input/output set change > detection can correctly implement incremental builds. Build systems are not

Re: [DISCUSS] Updating contribution guide for gitbox

2017-11-28 Thread Kenneth Knowles
On Tue, Nov 28, 2017 at 11:16 AM, Raghu Angadi wrote: > -1 for (a): no need to see all the private branch commits from > contributor. It often makes me more conscious of local commits. > I want to note that on my PRs these are not private commits. Each one is a meaningful

Re: [discuss] java profile

2017-11-28 Thread Romain Manni-Bucau
Lukasz: only for an isolated "system" which is a module - assuming you still want to be able to build a submodule without building and revalidating the whole tree which is important in dev IMO. This means you shouldn't handle inputs outside "current" module and therefore easily miss some (typically

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-11-28 Thread Romain Manni-Bucau
@Lukasz: was it tested in current setup? I know groovy does it but never checked it myself. If not the vote must be conditional on that IMHO. On Nov 28, 2017 19:19, "Robert Bradshaw" wrote: > I also did an apache github query > > select count(*) as apache_projects,

Re: [VOTE] Use Gradle for Apache Beam developmental processes

2017-11-28 Thread Romain Manni-Bucau
-1 (non binding) gradle discourages contributions which is a big pitfall for an asf project and maven/gradle comparison is unfair due to the threading setup of maven (hardcoded thread count and no parallel builder usage). On Nov 28, 2017 19:38, "Jason Kuster" wrote

Re: [VOTE] Use Gradle for Apache Beam developmental processes

2017-11-28 Thread Reuven Lax
+1 (binding) One caveat to the second part of this vote. I think we need to elaborate a clear list of criteria that Gradle must clear before any processes are migrated off of Maven. On Tue, Nov 28, 2017 at 12:51 PM, Romain Manni-Bucau wrote: > -1 (non binding) gradle

Re: [DISCUSS] Updating contribution guide for gitbox

2017-11-28 Thread Kenneth Knowles
Let's assume that when I say (a) the author has arranged commits to be meaningful. That's what I meant to say in each of my descriptions of the option. If they are noise, it doesn't apply. On Tue, Nov 28, 2017 at 8:04 PM, James wrote: > Thanks Kenn for bringing up this

Apache Beam Workshop in Guadalajara Mexico

2017-11-28 Thread Griselda Cuevas
Hi Everyone, I wanted to share with you that on December 2nd, Wizeline Academy [1] will host an Apache Beam workshop in Guadalajara Mexico. The objective of this workshop is to identify adoption barriers and improvement opportunities for the project through the observation and documentation of

Re: [DISCUSS] Updating contribution guide for gitbox

2017-11-28 Thread James
Thanks Kenn for bringing up this expanded discussion, my vote is: (a) -1 this preserves noise log like 'fix review comments' (b) +0 this keeps the commit log clean, but without a rebase (c) -1 similar to option a), it preserves noise log like 'fix review comments' My ideal option is the current

Re: [DISCUSS] Thinking about Beam 3.x roadmap and release schedule

2017-11-28 Thread Romain Manni-Bucau
Ps: forgot another wish: make Beam SQL usable. Today you need to add a fn before and after because of that type breakage, which is not consistent with the pipeline API. It would be nice to support pojo (extracted from the select fields or created from "views" like in jackson) but not having to wrap the sql

Re: [DISCUSS] Thinking about Beam 3.x roadmap and release schedule

2017-11-28 Thread Romain Manni-Bucau
My user wishes - whatever version, it is just a number after all ;): - make coder usage simpler and consistent (PCollection TypeDescriptor and Coder are duplicated in terms of API) - have a beam api (split from the sdk and internals and impl) - have SDF supported by runners - have a SDFRunner

Re: [DISCUSS] Thinking about Beam 3.x roadmap and release schedule

2017-11-28 Thread Robert Bradshaw
On Tue, Nov 28, 2017 at 9:48 AM, Reuven Lax wrote: > > On Tue, Nov 28, 2017 at 9:14 AM, Jean-Baptiste Onofré > wrote: >> >> Hi Reuven, >> >> Yes, I remember that we agreed on a release per month. However, we didn't >> do it before. I think the most important

Re: SerializableCoder Structured Value

2017-11-28 Thread Eugene Kirpichov
Kenn - I agree that consistentWithEquals() is redundant w.r.t. structuralValue(), and should be deprecated. I think our mutation detectors are already using structuralValue(), so the work here would be to simply mark the method deprecated, remove all remaining overrides in the SDK, and document

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-11-28 Thread Robert Bradshaw
It's great to see all the discussion going on here. I think it's important to point out that merging a parallel set of gradle build scripts is a separate (and much less disruptive) step than, say, switching over the default (or even recommended) build/release process to use them, let alone

[GitHub] xumingming commented on issue #4168: [BEAM-3238][SQL] Add BeamRecordSqlTypeBuilder

2017-11-28 Thread GitBox
xumingming commented on issue #4168: [BEAM-3238][SQL] Add BeamRecordSqlTypeBuilder URL: https://github.com/apache/beam/pull/4168#issuecomment-347448647 retest this please. This is an automated message from the Apache Git

Re: gradle dirty files blocking maven build

2017-11-28 Thread Romain Manni-Bucau
Hi guys, it happened again this morning with another folder in python sdk: $ find . -name etcd ./sdks/python/container/vendor/github.com/xordataexchange/crypt/backend/etcd ./sdks/python/container/vendor/github.com/coreos/etcd ./sdks/python/container/vendor/github.com/coreos/etcd/cmd/etcd

Performance tests - Spark and Flink current state of knowledge.

2017-11-28 Thread Łukasz Gajowy
Hello! Part of the job while writing the performance test infrastructure is to be able to run them on Spark and Flink. They seem to be problematic though. We provided a short description and a Proof of Concept showing our current state of knowledge and the only way we were able to actually run

Re: [VOTE] Use Gradle for Apache Beam developmental processes

2017-11-28 Thread Manu Zhang
+1 (binding) On Wed, Nov 29, 2017 at 5:08 AM Reuven Lax wrote: > +1 (binding) > > One caveat to the second part of this vote. I think we need to elaborate a > clear list of criteria that Gradle must clear before any processes are > migrated off of Maven. > > On Tue, Nov 28,

Re: Azure(ADLS) compatibility on Beam with Spark runner

2017-11-28 Thread Udi Meiri
Hi JB, I'm working on adding HDFS support to the Python runner. We're planning on using libhdfs3, which doesn't seem to support anything other than HDFS. On Mon, Nov 27, 2017 at 12:44 PM Lukasz Cwik wrote: > Out of curiosity, does using the DirectRunner with ADL work