Re: [VOTE] Release 2.2.0, release candidate #3

2017-11-10 Thread Jean-Baptiste Onofré

-1 (binding)

I agree with Eugene: data loss is severe.

As Eugene seems confident that he can fix it quickly, I think it's worth cutting an RC4.

However, I would introduce a deadline. I would like to propose a release
cycle of one release every 6 weeks (whatever it contains; what really matters
is keeping a regular pace of releases), so a release should be cut in a couple
of days. So maybe we can give ourselves 2 business days to fix that and propose
an RC4. Basically, if this issue is not fixed by Tuesday night, we move forward
anyway.
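
The deadline arithmetic can be checked with a short sketch. It assumes the count starts from the date of this message (Friday, 2017-11-10) and that only Saturdays and Sundays are non-business days (holidays are ignored); the helper name `add_business_days` is hypothetical.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Return the date `days` business days after `start` (weekends skipped)."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        # weekday() is 0 for Monday ... 6 for Sunday; 5 and 6 are the weekend.
        if current.weekday() < 5:
            remaining -= 1
    return current


# Assumption: counting starts on the day the message was sent.
deadline = add_business_days(date(2017, 11, 10), 2)
print(deadline.isoformat(), deadline.weekday())  # 2017-11-14 1 (1 == Tuesday)
```

Under those assumptions the two business days end on Tuesday, which matches the "Tuesday night" cut-off above.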


Regards
JB

On 11/10/2017 07:42 PM, Eugene Kirpichov wrote:

Unfortunately I think I found a data loss bug - it was there since 2.0.0
but I think it's serious enough that delaying a fix until the next release
would be irresponsible.
See https://issues.apache.org/jira/browse/BEAM-3169

On Thu, Nov 9, 2017 at 3:57 PM Robert Bradshaw wrote:


Our release notes look like nothing more than a query for the closed
jira issues. Do we have a top-level summary to highlight the big
ticket items in the release? And in particular somewhere to mention
that this is likely the last release to support Java 7 that'll get
widely read?

On Thu, Nov 9, 2017 at 3:39 PM, Reuven Lax wrote:

Thanks,

This RC is currently failing on a number of validation steps, so we need to
cut at least one more RC. Fingers crossed that it will be the last one.

Reuven

On Thu, Nov 9, 2017 at 3:36 PM, Konstantinos Katsiapis <
katsia...@google.com.invalid> wrote:


Just a remark: Release of Tensorflow Transform 0.4.0 depends on release of
Apache Beam 2.2.0 so upvoting for a release (the sooner the better).

On Thu, Nov 9, 2017 at 3:33 PM, Reuven Lax wrote:

Are we waiting for any more validation of this candidate? If people are
still running tests I'll hold off on RC4 (to reduce the chance of an RC5),
otherwise I'll cut RC4 once Valentyn's PR is merged.

Reuven

On Thu, Nov 9, 2017 at 2:26 PM, Valentyn Tymofieiev <
valen...@google.com.invalid> wrote:


https://github.com/apache/beam/pull/4109 is out to address both findings I
reported earlier.

On Thu, Nov 9, 2017 at 8:54 AM, Etienne Chauchot <echauc...@gmail.com> wrote:

Just as a remark, I compared (on my laptop though) queries execution times
on my previous run of 2.2.0-RC3 with release 2.1.0 and I did not see any
performance regression.

Best

Etienne


On Nov 9, 2017 at 03:13, Valentyn Tymofieiev wrote:

I looked at the Python side of the Dataflow & Direct runners on Linux. There are
two findings:

1. One of the mobile gaming examples did not pass for the Dataflow runner,
addressed in: https://github.com/apache/beam/pull/4102.

2. Python streaming did not work for the Dataflow runner; one PR is out,
https://github.com/apache/beam/pull/4106, but follow-up PRs may be required
as we continue to investigate. If we had a PostCommit test suite running
against a release branch, this could have been caught earlier. Filed
https://issues.apache.org/jira/browse/BEAM-3163.

On Wed, Nov 8, 2017 at 2:39 PM, Reuven Lax


[6] https://github.com/apache/beam-site/pull/337

--
Gus Katsiapis | Software Engineer | katsia...@google.com | 650-918-7487


--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: [VOTE] Release 2.2.0, release candidate #3

2017-11-10 Thread Ted Yu
Considering that the holiday is around the corner, it would be nice to release 
2.2.0 sooner. 
Cheers

-------- Original message --------
From: Chamikara Jayalath
Date: 11/10/17 12:22 PM (GMT-08:00)
To: dev@beam.apache.org
Subject: Re: [VOTE] Release 2.2.0, release candidate #3

We found another issue that should probably be fixed in 2.2.0 release:
https://issues.apache.org/jira/browse/BEAM-3172

A fix is out for review and will be merged soon.

Thanks,
Cham

On Fri, Nov 10, 2017 at 10:43 AM Eugene Kirpichov wrote:

> Unfortunately I think I found a data loss bug - it was there since 2.0.0
> but I think it's serious enough that delaying a fix until the next release
> would be irresponsible.
> See https://issues.apache.org/jira/browse/BEAM-3169
>
> On Thu, Nov 9, 2017 at 3:57 PM Robert Bradshaw wrote:
>
> > Our release notes look like nothing more than a query for the closed
> > jira issues. Do we have a top-level summary to highlight the big
> > ticket items in the release? And in particular somewhere to mention
> > that this is likely the last release to support Java 7 that'll get
> > widely read?
> >
> > On Thu, Nov 9, 2017 at 3:39 PM, Reuven Lax 
> > wrote:
> > > Thanks,
> > >
> > > This RC is currently failing on a number of validation steps, so we
> need
> > to
> > > cut at least one more RC. Fingers crossed that it will be the last one.
> > >
> > > Reuven
> > >
> > > On Thu, Nov 9, 2017 at 3:36 PM, Konstantinos Katsiapis <
> > > katsia...@google.com.invalid> wrote:
> > >
> > >> Just a remark: Release of Tensorflow Transform
> > >>  0.4.0 depends on release of
> > >> Apache Beam 2.2.0 so upvoting for a release (the sooner the better).
> > >>
> > >> On Thu, Nov 9, 2017 at 3:33 PM, Reuven Lax 
> > >> wrote:
> > >>
> > >> > Are we waiting for any more validation of this candidate? If people
> > are
> > >> > still running tests I'll hold off on RC4 (to reduce the chance of an
> > >> RC5),
> > >> > otherwise I'll cut RC4 once Valentyn's PR is merged.
> > >> >
> > >> > Reuven
> > >> >
> > >> > On Thu, Nov 9, 2017 at 2:26 PM, Valentyn Tymofieiev <
> > >> > valen...@google.com.invalid> wrote:
> > >> >
> > >> > > https://github.com/apache/beam/pull/4109 is out to address both
> > >> > findings I
> > >> > > reported earlier.
> > >> > >
> > >> > > On Thu, Nov 9, 2017 at 8:54 AM, Etienne Chauchot <
> > echauc...@gmail.com>
> > >> > > wrote:
> > >> > >
> > >> > > > Just as a remark, I compared (on my laptop though) queries
> > execution
> > >> > > times
> > >> > > > on my previous run of 2.2.0-RC3 with release 2.1.0 and I did not
> > see
> > >> > any
> > >> > > > performance regression.
> > >> > > >
> > >> > > > Best
> > >> > > >
> > >> > > > Etienne
> > >> > > >
> > >> > > >
> > >> > > > On Nov 9, 2017 at 03:13, Valentyn Tymofieiev wrote:
> > >> > > >
> > >> > > >> I looked at Python side of Dataflow & Direct runners on Linux.
> > There
> > >> > are
> > >> > > >> two findings:
> > >> > > >>
> > >> > > >> 1. One of the mobile gaming examples did not pass for Dataflow
> > >> runner,
> > >> > > >> addressed in: https://github.com/apache/beam/pull/4102
> > >> > > >>
> > >> > > >> .
> > >> > > >>
> > >> > > >> 2. Python streaming did not work for Dataflow runner, one PR is
> > out
> > >> > > >> https://github.com/apache/beam/pull/4106, but follow up PRs
> may
> > be
> > >> > > >> required
> > >> > > >> as we continue to investigate. If we had a PostCommit tests
> suite
> > >> > > running
> > >> > > >> against a release branch, this could have been caught earlier.
> > Filed
> > >> > > >> https://issues.apache.org/jira/browse/BEAM-3163.
> > >> > > >>
> > >> > > >> On Wed, Nov 8, 2017 at 2:39 PM, Reuven Lax wrote:
> > >> > > >>
> > >> > > >> Hi everyone,
> > >> > > >>>
> > >> > > >>> Please review and vote on the release candidate #3 for the
> > version
> > >> > > 2.2.0,
> > >> > > >>> as follows:
> > >> > > >>>    [ ] +1, Approve the release
> > >> > > >>>    [ ] -1, Do not approve the release (please provide specific
> > >> > > comments)
> > >> > > >>>
> > >> > > >>>
> > >> > > >>> The complete staging area is available for your review, which
> > >> > includes:
> > >> > > >>>    * JIRA release notes [1],
> > >> > > >>>    * the official Apache source release to be deployed to
> > >> > > >>> dist.apache.org
> > >> > > >>> [2],
> > >> > > >>> which is signed with the key with fingerprint B98B7708 [3],
> > >> > > >>>    * all artifacts to be deployed to the Maven Central
> > Repository
> > >> > [4],
> > >> > > >>>    * source code tag "v2.2.0-RC3" [5],
> > >> > > >>>    * website pull request listing the release 

Re: [VOTE] Release 2.2.0, release candidate #3

2017-11-10 Thread Romain Manni-Bucau
Both issues are particular cases. Can 2.2.0 go out and a 2.2.1 be done
quickly after? It would be much appreciated to have the 2.2.0 fixes, so as not
to depend on snapshots anymore because of blockers found in the core of
previous releases.


On Nov 10, 2017 at 21:23, "Chamikara Jayalath" wrote:

> We found another issue that should probably be fixed in 2.2.0 release:
> https://issues.apache.org/jira/browse/BEAM-3172
>
> A fix is out for review and will be merged soon.
>
> Thanks,
> Cham
>
> On Fri, Nov 10, 2017 at 10:43 AM Eugene Kirpichov wrote:
>
> > Unfortunately I think I found a data loss bug - it was there since 2.0.0
> > but I think it's serious enough that delaying a fix until the next
> release
> > would be irresponsible.
> > See https://issues.apache.org/jira/browse/BEAM-3169
> >
> > On Thu, Nov 9, 2017 at 3:57 PM Robert Bradshaw wrote:
> >
> > > Our release notes look like nothing more than a query for the closed
> > > jira issues. Do we have a top-level summary to highlight the big
> > > ticket items in the release? And in particular somewhere to mention
> > > that this is likely the last release to support Java 7 that'll get
> > > widely read?
> > >
> > > On Thu, Nov 9, 2017 at 3:39 PM, Reuven Lax 
> > > wrote:
> > > > Thanks,
> > > >
> > > > This RC is currently failing on a number of validation steps, so we
> > need
> > > to
> > > > cut at least one more RC. Fingers crossed that it will be the last
> one.
> > > >
> > > > Reuven
> > > >
> > > > On Thu, Nov 9, 2017 at 3:36 PM, Konstantinos Katsiapis <
> > > > katsia...@google.com.invalid> wrote:
> > > >
> > > >> Just a remark: Release of Tensorflow Transform
> > > >>  0.4.0 depends on release
> of
> > > >> Apache Beam 2.2.0 so upvoting for a release (the sooner the better).
> > > >>
> > > > >> On Thu, Nov 9, 2017 at 3:33 PM, Reuven Lax wrote:
> > > >>
> > > >> > Are we waiting for any more validation of this candidate? If
> people
> > > are
> > > >> > still running tests I'll hold off on RC4 (to reduce the chance of
> an
> > > >> RC5),
> > > >> > otherwise I'll cut RC4 once Valentyn's PR is merged.
> > > >> >
> > > >> > Reuven
> > > >> >
> > > >> > On Thu, Nov 9, 2017 at 2:26 PM, Valentyn Tymofieiev <
> > > >> > valen...@google.com.invalid> wrote:
> > > >> >
> > > >> > > https://github.com/apache/beam/pull/4109 is out to address both
> > > >> > findings I
> > > >> > > reported earlier.
> > > >> > >
> > > >> > > On Thu, Nov 9, 2017 at 8:54 AM, Etienne Chauchot <
> > > echauc...@gmail.com>
> > > >> > > wrote:
> > > >> > >
> > > >> > > > Just as a remark, I compared (on my laptop though) queries
> > > execution
> > > >> > > times
> > > >> > > > on my previous run of 2.2.0-RC3 with release 2.1.0 and I did
> not
> > > see
> > > >> > any
> > > >> > > > performance regression.
> > > >> > > >
> > > >> > > > Best
> > > >> > > >
> > > >> > > > Etienne
> > > >> > > >
> > > >> > > >
> > > > >> > > > On Nov 9, 2017 at 03:13, Valentyn Tymofieiev wrote:
> > > >> > > >
> > > >> > > >> I looked at Python side of Dataflow & Direct runners on
> Linux.
> > > There
> > > >> > are
> > > >> > > >> two findings:
> > > >> > > >>
> > > >> > > >> 1. One of the mobile gaming examples did not pass for
> Dataflow
> > > >> runner,
> > > >> > > >> addressed in: https://github.com/apache/beam/pull/4102
> > > >> > > >>  > > >> > > >> che%2Fbeam%2Fpull%2F4102=D=1=AFQjCNF3OS6Oo-MeNET
> > > >> > > >> CCmOxJj5Gm2uH6g>
> > > >> > > >>
> > > >> > > >> .
> > > >> > > >>
> > > >> > > >> 2. Python streaming did not work for Dataflow runner, one PR
> is
> > > out
> > > >> > > >> https://github.com/apache/beam/pull/4106, but follow up PRs
> > may
> > > be
> > > >> > > >> required
> > > >> > > >> as we continue to investigate. If we had a PostCommit tests
> > suite
> > > >> > > running
> > > >> > > >> against a release branch, this could have been caught
> earlier.
> > > Filed
> > > >> > > >> https://issues.apache.org/jira/browse/BEAM-3163.
> > > >> > > >>
> > > > >> > > >> On Wed, Nov 8, 2017 at 2:39 PM, Reuven Lax wrote:
> > > >> > > >>
> > > >> > > >> Hi everyone,
> > > >> > > >>>
> > > >> > > >>> Please review and vote on the release candidate #3 for the
> > > version
> > > >> > > 2.2.0,
> > > >> > > >>> as follows:
> > > >> > > >>>[ ] +1, Approve the release
> > > >> > > >>>[ ] -1, Do not approve the release (please provide
> specific
> > > >> > > comments)
> > > >> > > >>>
> > > >> > > >>>
> > > >> > > >>> The complete staging area is available for your review,
> which
> > > >> > includes:
> > > >> > > >>>* JIRA release notes [1],
> > > >> > > >>>* the official Apache source release to be deployed to
> > > >> > > >>> dist.apache.org
> > > >> > > >>> 

Re: [VOTE] Release 2.2.0, release candidate #3

2017-11-10 Thread Chamikara Jayalath
We found another issue that should probably be fixed in 2.2.0 release:
https://issues.apache.org/jira/browse/BEAM-3172

A fix is out for review and will be merged soon.

Thanks,
Cham

On Fri, Nov 10, 2017 at 10:43 AM Eugene Kirpichov wrote:

> Unfortunately I think I found a data loss bug - it was there since 2.0.0
> but I think it's serious enough that delaying a fix until the next release
> would be irresponsible.
> See https://issues.apache.org/jira/browse/BEAM-3169
>
> On Thu, Nov 9, 2017 at 3:57 PM Robert Bradshaw wrote:
>
> > Our release notes look like nothing more than a query for the closed
> > jira issues. Do we have a top-level summary to highlight the big
> > ticket items in the release? And in particular somewhere to mention
> > that this is likely the last release to support Java 7 that'll get
> > widely read?
> >
> > On Thu, Nov 9, 2017 at 3:39 PM, Reuven Lax 
> > wrote:
> > > Thanks,
> > >
> > > This RC is currently failing on a number of validation steps, so we
> need
> > to
> > > cut at least one more RC. Fingers crossed that it will be the last one.
> > >
> > > Reuven
> > >
> > > On Thu, Nov 9, 2017 at 3:36 PM, Konstantinos Katsiapis <
> > > katsia...@google.com.invalid> wrote:
> > >
> > >> Just a remark: Release of Tensorflow Transform
> > >>  0.4.0 depends on release of
> > >> Apache Beam 2.2.0 so upvoting for a release (the sooner the better).
> > >>
> > >> On Thu, Nov 9, 2017 at 3:33 PM, Reuven Lax 
> > >> wrote:
> > >>
> > >> > Are we waiting for any more validation of this candidate? If people
> > are
> > >> > still running tests I'll hold off on RC4 (to reduce the chance of an
> > >> RC5),
> > >> > otherwise I'll cut RC4 once Valentyn's PR is merged.
> > >> >
> > >> > Reuven
> > >> >
> > >> > On Thu, Nov 9, 2017 at 2:26 PM, Valentyn Tymofieiev <
> > >> > valen...@google.com.invalid> wrote:
> > >> >
> > >> > > https://github.com/apache/beam/pull/4109 is out to address both
> > >> > findings I
> > >> > > reported earlier.
> > >> > >
> > >> > > On Thu, Nov 9, 2017 at 8:54 AM, Etienne Chauchot <
> > echauc...@gmail.com>
> > >> > > wrote:
> > >> > >
> > >> > > > Just as a remark, I compared (on my laptop though) queries
> > execution
> > >> > > times
> > >> > > > on my previous run of 2.2.0-RC3 with release 2.1.0 and I did not
> > see
> > >> > any
> > >> > > > performance regression.
> > >> > > >
> > >> > > > Best
> > >> > > >
> > >> > > > Etienne
> > >> > > >
> > >> > > >
> > >> > > > On Nov 9, 2017 at 03:13, Valentyn Tymofieiev wrote:
> > >> > > >
> > >> > > >> I looked at Python side of Dataflow & Direct runners on Linux.
> > There
> > >> > are
> > >> > > >> two findings:
> > >> > > >>
> > >> > > >> 1. One of the mobile gaming examples did not pass for Dataflow
> > >> runner,
> > >> > > >> addressed in: https://github.com/apache/beam/pull/4102
> > >> > > >>
> > >> > > >> .
> > >> > > >>
> > >> > > >> 2. Python streaming did not work for Dataflow runner, one PR is
> > out
> > >> > > >> https://github.com/apache/beam/pull/4106, but follow up PRs
> may
> > be
> > >> > > >> required
> > >> > > >> as we continue to investigate. If we had a PostCommit tests
> suite
> > >> > > running
> > >> > > >> against a release branch, this could have been caught earlier.
> > Filed
> > >> > > >> https://issues.apache.org/jira/browse/BEAM-3163.
> > >> > > >>
> > >> > > >> On Wed, Nov 8, 2017 at 2:39 PM, Reuven Lax wrote:
> > >> > > >>
> > >> > > >> Hi everyone,
> > >> > > >>>
> > >> > > >>> Please review and vote on the release candidate #3 for the
> > version
> > >> > > 2.2.0,
> > >> > > >>> as follows:
> > >> > > >>>[ ] +1, Approve the release
> > >> > > >>>[ ] -1, Do not approve the release (please provide specific
> > >> > > comments)
> > >> > > >>>
> > >> > > >>>
> > >> > > >>> The complete staging area is available for your review, which
> > >> > includes:
> > >> > > >>>* JIRA release notes [1],
> > >> > > >>>* the official Apache source release to be deployed to
> > >> > > >>> dist.apache.org
> > >> > > >>> [2],
> > >> > > >>> which is signed with the key with fingerprint B98B7708 [3],
> > >> > > >>>* all artifacts to be deployed to the Maven Central
> > Repository
> > >> > [4],
> > >> > > >>>* source code tag "v2.2.0-RC3" [5],
> > >> > > >>>* website pull request listing the release and publishing
> the
> > >> API
> > >> > > >>> reference manual [6].
> > >> > > >>>* Java artifacts were built with Maven 3.5.0 and
> > OpenJDK/Oracle
> > >> > JDK
> > >> > > >>> 1.8.0_144.
> > >> > > >>>* Python artifacts are deployed along with the source
> > release to
> > >> > the
> > >> > > >>> 

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-11-10 Thread Lukasz Cwik
The reason to get it on master is because that is where all the PRs are. An
upstream branch without any development means no data.
Also, our Jenkins setup via job-dsl doesn't honor using the Jenkins
configuration on the branch because the seed job always runs against master.

On Thu, Nov 9, 2017 at 9:59 PM, Romain Manni-Bucau 
wrote:

> What about pushing it to an "upstream" branch and testing it for 1 week in
> parallel with the Maven reference build? If Gradle is always 50% faster on
> Jenkins then it could become the master setup without much discussion, I guess.
> We can even have 2 Jenkins jobs: one with the daemon etc. and one without.
>
> I also noticed yesterday that the Gradle build is killing my machine (all 8
> cores at 100%) during the first minutes, whereas the Maven build lets me do
> something else. Most of the time that makes Gradle not that fast is spent on
> Python. I will try to send figures later today.
>
> On Nov 10, 2017 at 00:10, "Lukasz Cwik" wrote:
>
> > I wouldn't mind merging this change in so I could setup those Gradle
> > Jenkins precommits.
> >
> > As per our contribution guidelines, any committer willing to sign off on
> > the PR?
> >
> > On Thu, Nov 9, 2017 at 2:12 PM, Romain Manni-Bucau <
> rmannibu...@gmail.com>
> > wrote:
> >
> > > On Nov 9, 2017 at 21:31, "Kenneth Knowles" wrote:
> > >
> > > > Keep in mind that a clean build is unusual during development (it is
> > > > common for mvn use and that is a bug) and also not necessary for
> > > > precommits if the build tool is correct enough that caching is safe. So
> > > > while this number matters, it is not the most important.
> > >
> > >
> > > Not sure, in dev you bypass the build tool most of the time anyway -
> > > thanks to IDE or other shortcuts - but not on PR and CI. Keep in mind that
> > > not doing a clean and killing the gradle daemon makes the build not
> > > reproducible and therefore useful :(. Starting to build from a subpart of
> > > the reactor - with the mentioned mvn plugin for instance - can be nice on
> > > some CI like travis if the caching is well configured but still not a
> > > guarantee the build is "green".
> > >
> > > My trade off is to ensure an easy build and relevant result over the time
> > > criteria. Do you share it as well or prefer time over other criteria -
> > > which leads to other conclusions and options indeed and can make us not
> > > understanding each other?
> > >
> > >
> > > On Thu, Nov 9, 2017 at 11:30 AM, Romain Manni-Bucau <
> > rmannibu...@gmail.com
> > > >
> > > wrote:
> > >
> > > > > I will try next week, yes, but the 2 runs I did were 28 min vs 32 min
> > > > > from memory - after having downloaded all deps once.
> > > > >
> > > > > On Nov 9, 2017 at 19:45, "Lukasz Cwik" wrote:
> > > >
> > > > > > If Gradle was slow, do you mind running the build with --profile
> > > > > > and sharing that and also sharing the Maven build log?
> > > > > >
> > > > > > On Thu, Nov 9, 2017 at 10:43 AM, Lukasz Cwik wrote:
> > > > > >
> > > > > > > Romain, I don't understand your last comment, were you trying to
> > > > > > > say that you had the same Gradle build times like I did and it was
> > > > > > > an improvement over Maven or that you did not and you experienced
> > > > > > > build times that were equivalent to Maven?
> > > > > >
> > > > > > > On Thu, Nov 9, 2017 at 9:51 AM, Romain Manni-Bucau <rmannibu...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > >> 2017-11-09 18:38 GMT+01:00 Kenneth Knowles:
> > > > > > >> > On Thu, Nov 9, 2017 at 9:11 AM, Romain Manni-Bucau <rmannibu...@gmail.com>
> > > > > > >> > wrote:
> > > > > > >> >
> > > > > > >> >> (this is another topic so we can maybe open another thread)
> > > > > > >> >> issue is not much about python but more about the fact the
> > > > > > >> >> build is not self contained. it is a maven build and maven
> > > > > > >> >> should be sufficient without having to install python +
> > > > > > >> >> dependencies.
> > > > > > >> >
> > > > > > >> >
> > > > > > >> > Let's leave out the topic of whether our build should install
> > > > > > >> > things like JDKs, Python, Golang, Docker, protoc, findbugs, RAT,
> > > > > > >> > etc. That issue is somewhat independent of build tool, and the
> > > > > > >> > new build isn't worse than the old one as far as it goes.
> > > > > > >>
> > > > > > >>
> > > > > > >> Yep, globally the same time with clean and killing the daemon.
> > > > > > >>
> > > > > > >> >
> > > > > > >> > Kenn
> > > > > > >> >
> > > > > > >> >
> > > > > > >> >
> > > > > > >> >> I don't see any technical
> > > > > > >> >> blockers to do it (except time ;)) but it is always a bit
> > > > > > >> >> annoying to git clone then not be able to build.
> > > > > >> >>
> > > > > >> >> Romain Manni-Bucau
> > > > 

Re: [Proposal] Add performance tests for commonly used file-based I/O PTransforms

2017-11-10 Thread Jean-Baptiste Onofré

Thanks for the update. I will take a look.

Regards
JB

On 11/10/2017 11:43 AM, Kamil Szewczyk wrote:

We updated Step #2 in our proposal.
Comments and suggestions are highly appreciated.

Thanks

2017-10-31 15:42 GMT+01:00 Łukasz Gajowy :


We edited the "Roadmap" section a little bit to reflect our state of
knowledge. As before, all comments are welcome.

Thank you in advance!

2017-10-27 5:10 GMT+02:00 Kenneth Knowles :


I am really excited about this development. Glad to have such a detailed
document! Thanks for taking the time to write it up thoughtfully.

On Wed, Oct 25, 2017 at 10:00 AM, Chamikara Jayalath <
chamik...@google.com.invalid> wrote:


Thanks Łukasz and the team for the proposal. I think fixing this JIRA will
allow us to keep track of the performance of widely used
source/sink/runner/file-system combinations of Beam SDK. As Łukasz
mentioned, all comments are welcome.

Thanks,
Cham

On Wed, Oct 25, 2017 at 8:08 AM Łukasz Gajowy 

Re: [Proposal] Add performance tests for commonly used file-based I/O PTransforms

2017-11-10 Thread Kamil Szewczyk
We updated Step #2 in our proposal.
Comments and suggestions are highly appreciated.

Thanks

2017-10-31 15:42 GMT+01:00 Łukasz Gajowy :

> We edited the "Roadmap" section a little bit to reflect our state of
> knowledge. As before, all comments are welcome.
>
> Thank you in advance!
>
> 2017-10-27 5:10 GMT+02:00 Kenneth Knowles :
>
> > I am really excited about this development. Glad to have such a detailed
> > document! Thanks for taking the time to write it up thoughtfully.
> >
> > On Wed, Oct 25, 2017 at 10:00 AM, Chamikara Jayalath <
> > chamik...@google.com.invalid> wrote:
> >
> > > Thanks Łukasz and the team for the proposal. I think fixing this JIRA
> > > will allow us to keep track of the performance of widely used
> > > source/sink/runner/file-system combinations of Beam SDK. As Łukasz
> > > mentioned, all comments are welcome.
> > >
> > > Thanks,
> > > Cham
> > >
> > > > On Wed, Oct 25, 2017 at 8:08 AM Łukasz Gajowy wrote:
> > > >
> > > > > Hello Beam Community!
> > > > >
> > > > >
> > > > > During the last year many Beam developers have put much effort into
> > > > > developing and discussing means of testing Beam transforms. We would
> > > > > like to benefit from that and implement performance tests for
> > > > > file-based I/O Transforms.
> > > > >
> > > > >
> > > > > This proposal is strictly related to the BEAM-3060 issue. Here's the
> > > > > link to the doc:
> > > > >
> > > > > https://docs.google.com/document/d/1dA-5s6OHiP_cz-NRAbwapoKF5MEC1wKps4A5tFbIPKE/edit
> > > >
> > > >
> > > > All comments are deeply appreciated.
> > > >
> > > >
> > > > Thanks!
> > > >
> > > > ŁG
> > > >
> > >
> >
>


Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-10 Thread Jean-Baptiste Onofré

I think so ;)

Regards
JB

On 11/10/2017 09:29 AM, Reuven Lax wrote:

Sounds good. I doubt we will have much opposition from users, in which case
Beam 2.3.0 can deprecate Spark 1.x

On Thu, Nov 9, 2017 at 11:54 PM, Jean-Baptiste Onofré 
wrote:


Hi all,

thanks a lot for all your feedback.

The trend is to upgrade to Spark 2.x and drop Spark 1.x support.

However, some of you (especially Reuven and Robert) commented that users
have to be pinged as well. It makes perfect sense, and it was my intention.

I propose the following action plan:
- on the technical front, I currently have two private branches ready:
one with Spark 1.x & Spark 2.x support (with a common module and three
artifacts), another one with an upgrade to Spark 2.x (dropping 1.x). I will
merge the latter into the PR.
- I will forward the vote e-mail to the user mailing list; hopefully we
will have user feedback.

Thanks again,
Regards
JB


On 11/08/2017 08:27 AM, Jean-Baptiste Onofré wrote:


Hi all,

as you might know, we are working on Spark 2.x support in the Spark
runner.

I'm working on a PR about that:

https://github.com/apache/beam/pull/3808

Today, we have something working with both Spark 1.x and 2.x from a code
standpoint, but I have to deal with dependencies. It's the first step of
the update as I'm still using RDD, the second step would be to support
dataframe (but for that, I would need PCollection elements with schemas,
that's another topic on which Eugene, Reuven and I are discussing).

However, as all major distributions now ship Spark 2.x, I don't think
it's required anymore to support Spark 1.x.

If we agree, I will update and cleanup the PR to only support and focus
on Spark 2.x.

So, that's why I'm calling for a vote:

[ ] +1 to drop Spark 1.x support and upgrade to Spark 2.x only
[ ] 0 (I don't care ;))
[ ] -1, I would like to still support Spark 1.x, and so having support
of both Spark 1.x and 2.x (please provide specific comment)

This vote is open for 48 hours (I have the commits ready, just waiting
the end of the vote to push on the PR).
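
Counting the replies to a three-option vote like this one can be sketched as follows. The `(voter, ballot)` pair format, the function name `tally` and the sample ballots are all hypothetical, and the sketch only counts ballots; it does not encode any rule for deciding whether the vote passes.

```python
from collections import Counter

# The three ballot options offered in the vote e-mail.
VALID_BALLOTS = ("+1", "0", "-1")


def tally(ballots):
    """Count ballots per option; `ballots` is a list of (voter, ballot) pairs."""
    counts = Counter()
    for voter, ballot in ballots:
        if ballot not in VALID_BALLOTS:
            raise ValueError(f"{voter} cast an unrecognised ballot: {ballot!r}")
        counts[ballot] += 1
    # Report every option, including those nobody chose.
    return {option: counts[option] for option in VALID_BALLOTS}


# Made-up ballots for illustration only; these are not the real votes.
sample = [("voter-a", "+1"), ("voter-b", "+1"), ("voter-c", "0")]
print(tally(sample))  # {'+1': 2, '0': 1, '-1': 0}
```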

Thanks !
Regards
JB



--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com







Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-10 Thread Reuven Lax
Sounds good. I doubt we will have much opposition from users, in which case
Beam 2.3.0 can deprecate Spark 1.x

On Thu, Nov 9, 2017 at 11:54 PM, Jean-Baptiste Onofré 
wrote:

> Hi all,
>
> thanks a lot for all your feedback.
>
> The trend is to upgrade to Spark 2.x and drop Spark 1.x support.
>
> However, some of you (especially Reuven and Robert) commented that users
> have to be pinged as well. It makes perfect sense, and it was my intention.
>
> I propose the following action plan:
> - on the technical front, I currently have two private branches ready:
> one with Spark 1.x & Spark 2.x support (with a common module and three
> artifacts), another one with an upgrade to Spark 2.x (dropping 1.x). I will
> merge the latter into the PR.
> - I will forward the vote e-mail to the user mailing list; hopefully we
> will have user feedback.
>
> Thanks again,
> Regards
> JB
>
>
> On 11/08/2017 08:27 AM, Jean-Baptiste Onofré wrote:
>
>> Hi all,
>>
>> as you might know, we are working on Spark 2.x support in the Spark
>> runner.
>>
>> I'm working on a PR about that:
>>
>> https://github.com/apache/beam/pull/3808
>>
>> Today, we have something working with both Spark 1.x and 2.x from a code
>> standpoint, but I have to deal with dependencies. It's the first step of
>> the update as I'm still using RDD, the second step would be to support
>> dataframe (but for that, I would need PCollection elements with schemas,
>> that's another topic on which Eugene, Reuven and I are discussing).
>>
>> However, as all major distributions now ship Spark 2.x, I don't think
>> it's required anymore to support Spark 1.x.
>>
>> If we agree, I will update and cleanup the PR to only support and focus
>> on Spark 2.x.
>>
>> So, that's why I'm calling for a vote:
>>
>> [ ] +1 to drop Spark 1.x support and upgrade to Spark 2.x only
>> [ ] 0 (I don't care ;))
>> [ ] -1, I would like to still support Spark 1.x, and so having support
>> of both Spark 1.x and 2.x (please provide specific comment)
>>
>> This vote is open for 48 hours (I have the commits ready, just waiting
>> the end of the vote to push on the PR).
>>
>> Thanks !
>> Regards
>> JB
>>
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>


Jenkins build is back to stable : beam_Release_NightlySnapshot #589

2017-11-10 Thread Apache Jenkins Server
See