Build failed in Jenkins: beam_Release_NightlySnapshot #242

2016-11-22 Thread Apache Jenkins Server
See 

Changes:

[klk] Add JUnit category for stateful ParDo tests

[klk] Reject stateful DoFn in ApexRunner

[klk] Add JUnit category for stateful ParDo tests

[klk] Reject stateful DoFn in SparkRunner

[davor] Beam archetypes: enable snapshot repositories.

[lcwik] [BEAM-59] Drops public constructors and uses Factory methods in

[lcwik] [BEAM-59] Create IOChannelFactoryRegistrar interface and its gcs/file

[lcwik] [BEAM-59] Use ServiceLoader to register IOChannelFactories in

[tgroh] Update StarterPipeline

[klk] Reject stateful DoFn in FlinkRunner

[tgroh] Simplify the API for managing MetricsEnvironment

[klk] Output Keyed Bundles in GroupAlsoByWindowEvaluator

[klk] Add TransformHierarchyTest

--
[...truncated 8129 lines...]
Generating 

Generating 

Building index for all the packages and classes...
Generating 

Generating 

Generating 

Building index for all classes...
Generating 

Generating 

Generating 

Generating 

Generating 

[INFO] Building jar: 

[INFO] 
[INFO] --- maven-source-plugin:2.4:jar-no-fork (attach-sources) @ beam-runners-apex ---
[INFO] Building jar: 

[INFO] 
[INFO] --- maven-source-plugin:2.4:test-jar-no-fork (attach-test-sources) @ beam-runners-apex ---
[INFO] Building jar: 

[INFO] 
[INFO] --- maven-jar-plugin:2.5:test-jar (default-test-jar) @ beam-runners-apex ---
[INFO] Building jar: 

[INFO] 
[INFO] --- maven-surefire-plugin:2.19.1:test (runnable-on-service-tests) @ beam-runners-apex ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-dependency-plugin:2.10:analyze-only (default) @ beam-runners-apex ---
[WARNING] Used undeclared dependencies found:
[WARNING]    commons-io:commons-io:jar:2.4:compile
[WARNING]    com.datatorrent:netlet:jar:1.3.0:compile
[WARNING]    org.apache.hadoop:hadoop-common:jar:2.6.0:compile
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Beam :: Parent .. SUCCESS [ 17.003 s]
[INFO] Apache Beam :: SDKs :: Java :: Build Tools . SUCCESS [  7.625 s]
[INFO] Apache Beam :: SDKs  SUCCESS [  8.848 s]
[INFO] Apache Beam :: SDKs :: Java  SUCCESS [  4.922 s]
[INFO] Apache Beam :: SDKs :: Java :: Core  SUCCESS [03:38 min]
[INFO] Apache Beam :: Runners . SUCCESS [  3.516 s]
[INFO] Apache Beam :: Runners :: Core Java  SUCCESS [ 52.642 s]
[INFO] Apache Beam :: Runners :: Direct Java .. SUCCESS [02:04 min]
[INFO] Apache Beam :: Runners :: Google Cloud Dataflow  SUCCESS [ 28.671 s]
[INFO] Apache Beam :: SDKs :: Java :: IO .. SUCCESS [  3.098 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Google Cloud Platform SUCCESS [04:17 min]
[INFO] Apache Beam :: SDKs :: Java :: IO :: HDFS .. SUCCESS [ 25.103 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: JMS ... SUCCESS [ 13.724 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Kafka . SUCCESS [ 18.410 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Kinesis ... SUCCESS [ 23.258 s]

Re: Hosting data stores for IO Transform testing

2016-11-22 Thread Jean-Baptiste Onofré

Hi Ismaël,

FYI, we also test the IOs on small Spark and Flink clusters (not yet
Apex); that's where I'm using Mesos/Marathon.

It's not a large cluster, but the integration tests are performed (by
hand) on clusters.

We already discussed with Stephan and Jason using Marathon JSON and
Mesos Docker images bootstrapped by Jenkins for the itests.


Regards
JB

On 11/22/2016 04:58 PM, Ismaël Mejía wrote:

Hello,

@Stephen Thanks for your proposal; it is really interesting, and I would
really like to help with this. I have never played with Kubernetes, but this
seems like a really nice chance to do something useful with it.

We (at Talend) are testing most of the IOs using simple container images
and, in some particular cases, ‘clusters’ of containers using docker-compose
(a little bit like Amit’s (2) proposal). It would be really nice to have
this at the Beam level, in particular to try to test more complex
semantics. I don’t know how programmable Kubernetes is for achieving this,
for example:

Say we have a cluster of Cassandra or Kafka nodes. I would like to have
programmatic tests that simulate failure (e.g. kill a node) or simulate a
really slow node, to ensure that the IO behaves as expected in the Beam
pipeline for the given runner.
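
The kill-a-node scenario can be sketched in a few lines. This is only an
illustration against a purely in-memory stand-in: FakeCluster, FakeNode,
NodeDown and write_with_failover are hypothetical names, not part of Beam,
Kubernetes, Cassandra or Kafka. A real test would kill an actual container
and run an actual pipeline against it.

```python
class NodeDown(Exception):
    """Raised when a write is sent to a node that has been killed."""


class FakeNode:
    """Hypothetical in-memory node that stores the records written to it."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.records = []

    def write(self, record):
        if not self.alive:
            raise NodeDown(self.name)
        self.records.append(record)


class FakeCluster:
    """Hypothetical stand-in for a small Cassandra/Kafka-style cluster."""

    def __init__(self, size):
        self.nodes = [FakeNode(f"node-{i}") for i in range(size)]

    def kill(self, index):
        # Simulate a node failure; later writes to this node raise NodeDown.
        self.nodes[index].alive = False


def write_with_failover(cluster, record):
    """Try each node in order and return the name of the node that accepted."""
    for node in cluster.nodes:
        try:
            node.write(record)
            return node.name
        except NodeDown:
            continue
    raise NodeDown("all nodes are down")


# Failure scenario: the first node is killed before the write happens.
cluster = FakeCluster(size=3)
cluster.kill(0)
accepted_by = write_with_failover(cluster, "event-1")
```

The point of the sketch is the shape of the test (inject a fault, then
assert on where the data ended up), not the failover policy itself, which
would differ per IO.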

Another related idea is to improve IO consistency. Today the different IOs
have small differences in their failure behavior. I would really like to be
able to predict with more precision what will happen in case of errors,
e.g. what is the correct behavior if I am writing to a Kafka node and there
is a network partition: does the Kafka sink retry or not? And what about
JdbcIO: will it work the same way, e.g. assuming checkpointing? Or do we
guarantee exactly-once writes somehow? Today I am not sure what happens (or
whether the expected behavior depends on the runner), but maybe I just
don’t know and we already have tests that ensure this.

Of course both are really hard problems, but I think with your proposal we
can try to tackle them, as well as the performance ones. And apart from the
data stores, I think it would also be really nice to be able to test the
runners in a distributed manner.

So what is the next step? How do you imagine such integration tests? Who
can provide the test machines so we can mount the cluster?

Maybe my ideas are a bit too far out for an initial setup, but it would be
really nice to start working on this.

Ismael


On Tue, Nov 22, 2016 at 11:00 AM, Amit Sela  wrote:


Hi Stephen,

I was wondering about how we plan to use the data stores across executions.

Clearly, it's best to set up a new instance (container) for every test,
running a "standalone" store (say HBase/Cassandra for example), and once
the test is done, tear down the instance. It should also be agnostic to the
runtime environment (e.g., Docker on Kubernetes).
I'm wondering, though, what the overhead is of managing such a deployment,
which could become heavy and complicated as more IOs are supported and more
test cases are introduced.

Another way to go would be to have small clusters of different data stores
and run against new "namespaces" (while lazily evicting old ones), but I
think this is less likely as maintaining a distributed instance (even a
small one) for each data store sounds even more complex.

A third approach would be to simply have an "embedded" in-memory
instance of a data store as part of a test that runs against it (such as an
embedded Kafka, though not a data store).
This is probably the simplest solution in terms of orchestration, but it
depends on having a proper "embedded" implementation for an IO.
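
To make the second option (a shared cluster with fresh "namespaces" and
lazy eviction) a bit more concrete, here is a minimal sketch. It assumes
that a namespace is just a unique name per test run and that "lazy" means
old namespaces are only dropped when a new one is requested.
NamespaceAllocator and max_live are hypothetical names, and the code does
not talk to any real data store.

```python
import itertools
from collections import OrderedDict


class NamespaceAllocator:
    """Hands out a fresh namespace per test run and evicts old ones lazily."""

    def __init__(self, max_live):
        self.max_live = max_live            # hypothetical cap on live namespaces
        self._live = OrderedDict()          # namespace -> test name, oldest first
        self._counter = itertools.count(1)  # makes every namespace unique

    def allocate(self, test_name):
        """Return (new_namespace, namespaces_evicted_by_this_call)."""
        namespace = f"{test_name}_{next(self._counter)}"
        self._live[namespace] = test_name
        evicted = []
        # Eviction only happens here, when a new namespace pushes us over the cap.
        while len(self._live) > self.max_live:
            oldest, _ = self._live.popitem(last=False)
            evicted.append(oldest)
        return namespace, evicted

    def live(self):
        return list(self._live)


allocator = NamespaceAllocator(max_live=2)
first, _ = allocator.allocate("read_it")
second, _ = allocator.allocate("write_it")
third, evicted = allocator.allocate("read_it")
```

In a real setup the evicted names would be handed to whatever drops the
corresponding keyspace, table prefix or topic in the shared instance.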

Does this make sense to you? Have you considered it?

Thanks,
Amit

On Tue, Nov 22, 2016 at 8:20 AM Jean-Baptiste Onofré 
wrote:


Hi Stephen,

As we already discussed a bit together, it sounds great! I especially like
it as both an integration test platform and a way to get good coverage for
IOs.

I'm very late on this but, as I said, I will share my Marathon JSON and
Mesos Docker images with you.

By the way, I started to experiment a bit with Kubernetes and Swarm, but
it's not complete yet. I will share what I have in the same GitHub repo.

Thanks !
Regards
JB

On 11/16/2016 11:36 PM, Stephen Sisk wrote:

Hi everyone!

Currently we have a good set of unit tests for our IO Transforms - those
tend to run against in-memory versions of the data stores. However, we'd
like to further increase our test coverage to include running them against
real instances of the data stores that the IO Transforms work against (e.g.
cassandra, mongodb, kafka, etc…), which means we'll need to have real
instances of various data stores.

Additionally, if we want to do performance regression detection, it's
important to have instances of the services that behave realistically,
which isn't true of in-memory or dev versions of the services.


Proposed solution
-----------------
If we accept this proposal, we would create an infrastructure for running

Re: [DISCUSS] Graduation to a top-level project

2016-11-22 Thread Stephen Sisk
+1 I'm excited to see the engagement of the community here.

S

On Tue, Nov 22, 2016 at 1:22 PM Ismaël Mejía wrote:

> +1
> Excellent and congratulations everyone for the great work!
>
> On Tue, Nov 22, 2016 at 10:21 PM, Sergio Fernández wrote:
>
> > As an external person who has been following the podling very closely
> > from the very beginning, I definitely think you are ready for
> > graduation :-)
> >
> > On Nov 22, 2016 19:19, "Davor Bonaci" wrote:
> >
> > > Hi everyone,
> > > With all the progress we’ve had recently in Apache Beam, I think it is
> > > time we start the discussion about graduation as a new top-level
> > > project at the Apache Software Foundation.
> > >
> > > Graduation means we are a self-sustaining and self-governing
> > > community, and ready to be a full participant in the Apache Software
> > > Foundation. It does not imply that our community growth is complete or
> > > that a particular level of technical maturity has been reached, rather
> > > that we are on a solid trajectory in those areas. After graduation, we
> > > will still periodically report to, and be overseen by, the ASF Board to
> > > ensure continued growth of a healthy community.
> > >
> > > Graduation is an important milestone for the project. It is also key
> > > to further grow the user community: many users (incorrectly) see
> > > incubation as a sign of instability and are much less likely to
> > > consider us for production use.
> > >
> > > A way to think about graduation readiness is through the Apache
> > > Maturity Model [1]. I think we clearly satisfy all the requirements
> > > [2]. It is probably worth emphasizing the recent community growth:
> > > over each of the past three months, no single organization
> > > contributing to Beam has had more than ~50% of the unique contributors
> > > per month [2, see assumptions]. That’s a great statistic that shows
> > > how much we’ve grown our diversity!
> > >
> > > Process-wise, graduation consists of drafting a board resolution,
> > > which needs to identify the full Project Management Committee, and
> > > getting it approved by the community, the Incubator, and the Board.
> > > Within the Beam community, most of these discussions and votes have to
> > > be on the private@ mailing list, but, as usual, we’ll try to keep dev@
> > > updated as much as possible.
> > >
> > > With that in mind, let’s use this discussion on dev@ for two things:
> > > * Collect additional data points on our progress that we may want to
> > > present to the Incubator as a part of the proposal to accept our
> > > graduation.
> > > * Determine whether the community supports graduation. Please reply
> > > +1/-1 with any additional comments, as appropriate. I’d encourage
> > > everyone to participate -- regardless of whether you are an occasional
> > > visitor or have a specific role in the project -- we’d love to hear
> > > your perspective.
> > >
> > > Data points so far:
> > > * Project’s maturity self-assessment [2].
> > > * 1500 pull requests in incubation, which makes us one of the most
> > > active projects across all of ASF on this metric.
> > > * 3 releases, each driven by a different release manager.
> > > * 120+ individual contributors.
> > > * 3 new committers added, 2 of which aren’t from the largest
> > > organization.
> > > * 1027 issues created, 515 resolved.
> > > * 442 dev@ emails in October alone, sent by 51 individuals.
> > > * 50 user@ emails in the last 30 days, sent by 22 individuals.
> > >
> > > Thanks!
> > >
> > > Davor
> > >
> > > [1] http://community.apache.org/apache-way/apache-project-maturity-model.html
> > > [2] http://beam.incubator.apache.org/contribute/maturity-model/


Re: Hosting data stores for IO Transform testing

2016-11-22 Thread Stephen Sisk
Hi,

I'm excited we're getting lots of discussion going. There are many threads
of conversation here, so we may choose to split some of them off into a
different email thread. I'm also betting I missed some of the questions in
this thread, so apologies ahead of time for that. Apologies also for the
amount of text; I provided some quick summaries at the top of each section.

Amit - thanks for your thoughts. I've responded in detail below.
Ismael - thanks for offering to help. There's plenty of work here to go
around. I'll try and think about how we can divide up some next steps
(probably in a separate thread.) The main next step I see is deciding
between kubernetes/mesos+marathon/docker swarm - I'm working on that, but
having lots of different thoughts on what the advantages/disadvantages of
those are would be helpful (I'm not entirely sure of the protocol for
collaborating on sub-projects like this.)

These issues are all related to what kind of tests we want to write. I
think a kubernetes/mesos/swarm cluster could support all the use cases
we've discussed here (and thus should not block moving forward with this),
but understanding what we want to test will help us understand how the
cluster will be used. I'm working on a proposed user guide for testing IO
Transforms, and I'm going to send out a link to that + a short summary to
the list shortly so folks can get a better sense of where I'm coming from.



Here's my thinking on the questions we've raised here -

Embedded versions of data stores for testing
--------------------------------------------
Summary: yes! But we still need real data stores to test against.

I am a gigantic fan of using embedded versions of the various data stores.
I think we should test everything we possibly can using them, and do the
majority of our correctness testing using embedded versions + the direct
runner. However, it's also important to have at least one test that
actually connects to an actual instance, so we can get coverage for things
like credentials, real connection strings, etc...

The key point is that embedded versions definitely can't cover the
performance tests, so we need to host instances if we want to test that.

I consider the integration tests/performance benchmarks to be costly things
that we do only for the IO transforms with large amounts of community
support/usage. A random IO transform used by a few users doesn't
necessarily need integration & perf tests, but for heavily used IO
transforms, there's a lot of community value in these tests. The
maintenance proposal below scales with the amount of community support for
a particular IO transform.



Reusing data stores ("use the data stores across executions.")
--------------------------------------------------------------
Summary: I favor a hybrid approach: some frequently used, very small
instances that we keep up all the time + larger multi-container data store
instances that we spin up for perf tests.

I don't think we need to have a strong answer to this question, but I think
we do need to know what range of capabilities we need, and use that to
inform our requirements on the hosting infrastructure. I think
kubernetes/mesos + docker can support all the scenarios I discuss below.

I had been thinking of a hybrid approach - reuse some instances and don't
reuse others. Some tests require isolation from other tests (e.g.
performance benchmarking), while others can easily re-use the same
database/data store instance over time, provided they are written in the
correct manner (e.g. a simple read or write correctness integration test).

To me, the question of whether to use one instance over time for a test vs
spin up an instance for each test comes down to a trade off between these
factors:
1. Flakiness of spin-up of an instance - if it's super flaky, we'll want to
keep more instances up and running rather than bring them up/down. (this
may also vary by the data store in question)
2. Frequency of testing - if we are running tests every 5 minutes, it may
be wasteful to bring machines up/down every time. If we run tests once a
day or week, it seems wasteful to keep the machines up the whole time.
3. Isolation requirements - If tests must be isolated, it means we either
have to bring up the instances for each test, or we have to have some sort
of signaling mechanism to indicate that a given instance is in use. I
strongly favor bringing up an instance per test.
4. Number/size of containers - if we need a large number of machines for a
particular test, keeping them running all the time will use more resources.
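
As a rough illustration of how the four factors above could be combined,
here is a small sketch. Everything in it is an assumption made for the
example: WorkloadProfile, choose_lifecycle and the two thresholds are
hypothetical, and the ordering of the checks is just one possible reading
of the trade-off (isolation and size win over flakiness and frequency).

```python
from dataclasses import dataclass


@dataclass
class WorkloadProfile:
    """One field per factor in the list above (all hypothetical names)."""
    flaky_spin_up: bool      # 1. spin-up of the instance is unreliable
    runs_per_day: float      # 2. how often the tests run
    needs_isolation: bool    # 3. tests must not share an instance
    container_count: int     # 4. number of containers the data store needs


def choose_lifecycle(profile, frequent_runs_per_day=24, max_resident_containers=3):
    """Return "per-test" or "long-lived" for a data store instance."""
    # Isolated tests get their own instance, as favored in factor 3.
    if profile.needs_isolation:
        return "per-test"
    # Large deployments are too expensive to keep running all the time.
    if profile.container_count > max_resident_containers:
        return "per-test"
    # Small instances stay up if spin-up is flaky or tests run very often.
    if profile.flaky_spin_up or profile.runs_per_day >= frequent_runs_per_day:
        return "long-lived"
    # Infrequent tests on a stable store: do not pay for idle machines.
    return "per-test"


post_commit = WorkloadProfile(
    flaky_spin_up=False, runs_per_day=48, needs_isolation=False, container_count=1)
perf_benchmark = WorkloadProfile(
    flaky_spin_up=False, runs_per_day=1, needs_isolation=True, container_count=10)

post_commit_choice = choose_lifecycle(post_commit)
perf_benchmark_choice = choose_lifecycle(perf_benchmark)
```

With these made-up numbers the frequent small post-commit test keeps a
long-lived instance and the isolated perf benchmark gets one per test.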


The major unknown to me is how flaky it'll be to spin these up. I'm
hopeful/assuming they'll be pretty stable to bring up, but I think the best
way to test that is to start doing it.

I suspect the sweet spot is the following: have a set of very small data
store instances that stay up to support small-data-size post-commit end to
end tests (post-commits run frequently and the data size means the
instances would not use many resources), combined with the ability to spin
up 

Re: [DISCUSS] Graduation to a top-level project

2016-11-22 Thread Ismaël Mejía
+1
Excellent and congratulations everyone for the great work !



Re: [DISCUSS] Graduation to a top-level project

2016-11-22 Thread Sergio Fernández
As an external person who has been following the podling very closely from
the very beginning, I definitely think you are ready for graduation :-)



Re: [DISCUSS] Graduation to a top-level project

2016-11-22 Thread Mark Liu
Huge +1

On Tue, Nov 22, 2016 at 1:01 PM, Hadar Hod wrote:

> +1 !!!
>
> On Tue, Nov 22, 2016 at 12:54 PM, Jesse Anderson wrote:
>
> > +1
> >
> > On Tue, Nov 22, 2016 at 12:35 PM Frances Perry wrote:
> >
> > > +1  You might even say I'm beaming with pride ;-)
> > >
> > > On Tue, Nov 22, 2016 at 11:58 AM, Kenneth Knowles wrote:
> > >
> > > > +1 !!!
> > > >
> > > > I especially love how the diversity of the community has contributed
> > > > to the conceptual growth and quality of Beam. I can't wait for more!
> > > >
> > > > On Tue, Nov 22, 2016 at 11:22 AM, Thomas Groh wrote:
> > > >
> > > > > +1
> > > > >
> > > > > It's been a thrilling experience thus far, and I'm excited for the
> > > > > future.
> > > > >
> > > > > On Tue, Nov 22, 2016 at 11:07 AM, Aljoscha Krettek
> > > > > <aljos...@apache.org> wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > I'm quite enthusiastic about the growth of the community and the
> > > > > > open discussions!
> > > > > >
> > > > > > On Tue, 22 Nov 2016 at 19:51 Jason Kuster
> > > > > > <jasonkus...@google.com.invalid> wrote:
> > > > > >
> > > > > > > An enthusiastic +1!
> > > > > > >
> > > > > > > In particular it's been really great to see the commitment and
> > > > > > > interest of the community in different kinds of testing. Between
> > > > > > > what we currently have on Jenkins and Travis and the in-progress
> > > > > > > work on IO integration tests and performance tests (plus, I'm
> > > > > > > sure, other things I'm not aware of) we're in a really good place.
> > > > > > >
> > > > > > > On Tue, Nov 22, 2016 at 10:49 AM, Amit Sela
> > > > > > > <amitsel...@gmail.com> wrote:
> > > > > > >
> > > > > > > > +1, super exciting!
> > > > > > > >
> > > > > > > > Thanks to JB, Davor and the whole team for creating this
> > > > > > > > community. I think we've achieved a lot in a short time.
> > > > > > > >
> > > > > > > > Amit.
> > > > > > > >
> > > > > > > > On Tue, Nov 22, 2016, 20:36 Tyler Akidau wrote:
> > > > > > > >
> > > > > > > > > +1, thanks to everyone who's invested time getting us to this
> > > > > > > > > point. :-)
> > > > > > > > >
> > > > > > > > > -Tyler
> > > > > > > > >
> > > > > > > > > On Tue, Nov 22, 2016 at 10:33 AM Jean-Baptiste Onofré
> > > > > > > > > <j...@nanthrax.net> wrote:
> > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > First of all, I would like to thank the whole team, and
> > > > > > > > > > especially Davor for the great work and commitment to Apache
> > > > > > > > > > and the community.
> > > > > > > > > >
> > > > > > > > > > Of course, a big +1 to move forward on graduation !
> > > > > > > > > >
> > > > > > > > > > Regards
> > > > > > > > > > JB
> > > > > > > > > >
> > > > > > > > > > On 11/22/2016 07:19 PM, Davor Bonaci wrote:
> > > > > > > > > > > Hi everyone,
> > > > > > > > > > > With all the progress we’ve had recently in Apache
> Beam,
> > I
> > > > > think
> > > > > > it
> > > > > > > > is
> > > > > > > > > > time
> > > > > > > > > > > we start the discussion about graduation as a new
> > top-level
> > > > > > project
> > > > > > > > at
> > > > > > > > > > the
> > > > > > > > > > > Apache Software Foundation.
> > > > > > > > > > >
> > > > > > > > > > > Graduation means we are a self-sustaining and
> > > self-governing
> > > > > > > > community,
> > > > > > > > > > and
> > > > > > > > > > > ready to be a full participant in the Apache Software
> > > > > Foundation.
> > > > > > > It
> > > > > > > > > does
> > > > > > > > > > > not imply that our community growth is complete or
> that a
> > > > > > > particular
> > > > > > > > > > level
> > > > > > > > > > > of technical maturity has been reached, rather that we
> > are
> > > > on a
> > > > > > > solid
> > > > > > > > > > > trajectory in those areas. After graduation, we will
> > still
> > > > > > > > periodically
> > > > > > > > > > > report to, and be overseen by, the ASF Board to ensure
> > > > > continued
> > > > > > > > growth
> > > > > > > > > > of
> > > > > > > > > > > a healthy community.
> > > > > > > > > > >
> > > > > > > > > > > Graduation is an important milestone for the project.
> It
> > is
> > > > > also
> > > > > > > key
> > > > > > > > to
> > > > > > > > > > > further grow the user community: many users
> (incorrectly)
> > > see
> > > > > > > > > incubation
> > > > > > > > > > as
> > > > > > > > > > > a sign of instability and are much less likely to
> > consider
> > > us
> > > > > > for a
> > > > > > > > > > > production use.
> > > > > > > > > > >
> > > > > > > > > > > A way to think about graduation readiness is 

Re: [DISCUSS] Graduation to a top-level project

2016-11-22 Thread Neelesh Salian
+1
Some of us met recently during ApacheCon week, and we had a great time
discussing the project and what can be done to improve things as we move
forward.
Great to see the progress and growth!



Re: [DISCUSS] Graduation to a top-level project

2016-11-22 Thread Hadar Hod
+1 !!!

On Tue, Nov 22, 2016 at 12:54 PM, Jesse Anderson 
wrote:

> +1

Re: [DISCUSS] Graduation to a top-level project

2016-11-22 Thread Luciano Resende
+1

On Tue, Nov 22, 2016 at 10:19 AM, Davor Bonaci  wrote:

> Hi everyone,
> With all the progress we’ve had recently in Apache Beam, I think it is time
> we start the discussion about graduation as a new top-level project at the
> Apache Software Foundation.
>
> Graduation means we are a self-sustaining and self-governing community, and
> ready to be a full participant in the Apache Software Foundation. It does
> not imply that our community growth is complete or that a particular level
> of technical maturity has been reached, rather that we are on a solid
> trajectory in those areas. After graduation, we will still periodically
> report to, and be overseen by, the ASF Board to ensure continued growth of
> a healthy community.
>
> Graduation is an important milestone for the project. It is also key to
> further growing the user community: many users (incorrectly) see incubation
> as a sign of instability and are much less likely to consider us for
> production use.
>
> A way to think about graduation readiness is through the Apache Maturity
> Model [1]. I think we clearly satisfy all the requirements [2]. It is
> probably worth emphasizing the recent community growth: over each of the
> past three months, no single organization contributing to Beam has had more
> than ~50% of the unique contributors per month [2, see assumptions]. That’s
> a great statistic that shows how much we’ve grown our diversity!
>
> Process-wise, graduation consists of drafting a board resolution, which
> needs to identify the full Project Management Committee, and getting it
> approved by the community, the Incubator, and the Board. Within the Beam
> community, most of these discussions and votes have to be on the private@
> mailing list, but, as usual, we’ll try to keep dev@ updated as much as
> possible.
>
> With that in mind, let’s use this discussion on dev@ for two things:
> * Collect additional data points on our progress that we may want to
> present to the Incubator as a part of the proposal to accept our
> graduation.
> * Determine whether the community supports graduation. Please reply +1/-1
> with any additional comments, as appropriate. I’d encourage everyone to
> participate -- regardless of whether you are an occasional visitor or have a
> specific role in the project -- we’d love to hear your perspective.
>
> Data points so far:
> * Project’s maturity self-assessment [2].
> * 1500 pull requests in incubation, which makes us one of the most active
> projects across all of ASF on this metric.
> * 3 releases, each driven by a different release manager.
> * 120+ individual contributors.
> * 3 new committers added, 2 of which aren’t from the largest organization.
> * 1027 issues created, 515 resolved.
> * 442 dev@ emails in October alone, sent by 51 individuals.
> * 50 user@ emails in the last 30 days, sent by 22 individuals.
>
> Thanks!
>
> Davor
>
> [1] http://community.apache.org/apache-way/apache-project-maturity-model.html
> [2] http://beam.incubator.apache.org/contribute/maturity-model/
>



-- 
Luciano Resende
http://twitter.com/lresende1975
http://lresende.blogspot.com/


Re: [DISCUSS] Graduation to a top-level project

2016-11-22 Thread Jesse Anderson
+1

On Tue, Nov 22, 2016 at 12:35 PM Frances Perry 
wrote:

> +1  You might even say I'm beaming with pride ;-)

Re: [DISCUSS] Graduation to a top-level project

2016-11-22 Thread Frances Perry
+1  You might even say I'm beaming with pride ;-)

On Tue, Nov 22, 2016 at 11:58 AM, Kenneth Knowles 
wrote:

> +1 !!!
>
> I especially love how the diversity of the community has contributed to the
> conceptual growth and quality of Beam. I can't wait for more!

Re: [DISCUSS] Graduation to a top-level project

2016-11-22 Thread Kenneth Knowles
+1 !!!

I especially love how the diversity of the community has contributed to the
conceptual growth and quality of Beam. I can't wait for more!

On Tue, Nov 22, 2016 at 11:22 AM, Thomas Groh wrote:

> +1
>
> It's been a thrilling experience thus far, and I'm excited for the future.
>
> On Tue, Nov 22, 2016 at 11:07 AM, Aljoscha Krettek wrote:
>
> > +1
> >
> > I'm quite enthusiastic about the growth of the community and the open
> > discussions!
> >
> > On Tue, 22 Nov 2016 at 19:51 Jason Kuster wrote:
> >
> > > An enthusiastic +1!
> > >
> > > In particular it's been really great to see the commitment and interest
> > > of the community in different kinds of testing. Between what we
> > > currently have on Jenkins and Travis and the in-progress work on IO
> > > integration tests and performance tests (plus, I'm sure, other things
> > > I'm not aware of) we're in a really good place.
> > >
> > > On Tue, Nov 22, 2016 at 10:49 AM, Amit Sela wrote:
> > >
> > > > +1, super exciting!
> > > >
> > > > Thanks to JB, Davor and the whole team for creating this community. I
> > > > think we've achieved a lot in a short time.
> > > >
> > > > Amit.
> > > >
> > > > On Tue, Nov 22, 2016, 20:36 Tyler Akidau wrote:
> > > >
> > > > > +1, thanks to everyone who's invested time getting us to this
> > > > > point. :-)
> > > > >
> > > > > -Tyler
> > > > >
> > > > > On Tue, Nov 22, 2016 at 10:33 AM Jean-Baptiste Onofré
> > > > > <j...@nanthrax.net> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > First of all, I would like to thank the whole team, and especially
> > > > > > Davor for the great work and commitment to Apache and the community.
> > > > > >
> > > > > > Of course, a big +1 to move forward on graduation !
> > > > > >
> > > > > > Regards
> > > > > > JB

Re: Hosting data stores for IO Transform testing

2016-11-22 Thread Jean-Baptiste Onofré

Hi Sourabh,

We raised the IO versioning point a couple of months ago on the mailing list.

Basically, we have two options:

1. A single module (for example sdks/java/io/kafka) with one branch per
version (kafka-0.8, kafka-0.10)

2. Several modules: sdks/java/io/kafka-0.8, sdks/java/io/kafka-0.10

My preference is option 2:
Pros:
- the IO can still be part of the main Beam release
- it's more visible for contribution
Cons:
- we might have code duplication
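
To make the code duplication concern in option 2 a bit more concrete, here is a minimal Java sketch of one way per-version modules could share logic: a small version-neutral interface in a common module, with a thin implementation per client version. All type names (RecordFetcher, SharedReader, Kafka08Fetcher, Kafka010Fetcher, CannedRecords) are hypothetical and are not existing Beam classes, and the fetchers return canned data instead of calling a real Kafka client.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical version-neutral contract; would live in a shared module.
interface RecordFetcher {
    String clientVersion();
    List<String> fetch(String topic, int maxRecords);
}

// Version-independent logic, written once against the interface.
final class SharedReader {
    private final RecordFetcher fetcher;

    SharedReader(RecordFetcher fetcher) {
        this.fetcher = fetcher;
    }

    // Prefixes every record with the client version that produced it.
    List<String> readTagged(String topic, int maxRecords) {
        List<String> out = new ArrayList<>();
        for (String record : fetcher.fetch(topic, maxRecords)) {
            out.add(fetcher.clientVersion() + ":" + record);
        }
        return out;
    }
}

// Would live in sdks/java/io/kafka-0.8 and wrap the 0.8 client (stubbed here).
final class Kafka08Fetcher implements RecordFetcher {
    @Override public String clientVersion() { return "0.8"; }

    @Override public List<String> fetch(String topic, int maxRecords) {
        return CannedRecords.make(topic, maxRecords);
    }
}

// Would live in sdks/java/io/kafka-0.10 and wrap the 0.10 client (stubbed here).
final class Kafka010Fetcher implements RecordFetcher {
    @Override public String clientVersion() { return "0.10"; }

    @Override public List<String> fetch(String topic, int maxRecords) {
        return CannedRecords.make(topic, maxRecords);
    }
}

// Stand-in for real client calls: produces "<topic>-<index>" strings.
final class CannedRecords {
    private CannedRecords() {}

    static List<String> make(String topic, int count) {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            records.add(topic + "-" + i);
        }
        return records;
    }
}
```

With a layout like this, only the small fetcher classes would be duplicated per version, while SharedReader is written once. Whether that split is practical depends on how much the real client APIs differ between versions.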

Regards
JB

On 11/22/2016 08:12 PM, Sourabh Bajaj wrote:

Hi,

One tangential question I had around the proposal was how we currently
deal with versioning in IO sources/sinks.

For example, Cassandra 1.2 and 2.1 have some differences between them, so the
checked-in sources and sinks probably support a particular version right
now. If so, follow-up questions would be how we handle updating,
deprecating, and documenting the supported versions.

I can move this to a new thread if this seems like a different discussion.
Also if this has already been answered please feel free to direct me to a
doc or past thread.

Thanks
Sourabh

On Tue, Nov 22, 2016 at 7:59 AM Ismaël Mejía  wrote:


Hello,

@Stephen Thanks for your proposal, it is really interesting, and I would
really like to help with this. I have never played with Kubernetes, but this
seems a really nice chance to do something useful with it.

We (at Talend) are testing most of the IOs using simple container images
and, in some particular cases, ‘clusters’ of containers using docker-compose
(a little bit like Amit’s (2) proposal). It would be really nice to have
this at the Beam level, in particular to try to test more complex
semantics. I don’t know how programmable Kubernetes is for achieving this,
for example:

Let’s think we have a cluster of Cassandra or Kafka nodes, I would like to
have programmatic tests to simulate failure (e.g. kill a node), or simulate
a really slow node, to ensure that the IO behaves as expected in the Beam
pipeline for the given runner.
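
As a rough illustration of the kind of programmatic failure test described in the paragraph above, here is a minimal, self-contained Java sketch. It uses a purely in-memory stand-in for a cluster; FakeNode, FakeCluster and FailoverWriter are hypothetical names invented for this example and do not correspond to Beam, Cassandra, Kafka or Kubernetes APIs. A real test would kill a container or pod instead of flipping a flag.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory node that can be "killed" by a test.
final class FakeNode {
    private boolean alive = true;
    private final List<String> stored = new ArrayList<>();

    void kill() { alive = false; }

    void write(String record) {
        if (!alive) {
            throw new IllegalStateException("node is down");
        }
        stored.add(record);
    }

    List<String> stored() { return stored; }
}

// Hypothetical fixed-size cluster of fake nodes.
final class FakeCluster {
    private final List<FakeNode> nodes = new ArrayList<>();

    FakeCluster(int size) {
        for (int i = 0; i < size; i++) {
            nodes.add(new FakeNode());
        }
    }

    FakeNode node(int index) { return nodes.get(index); }

    int size() { return nodes.size(); }
}

// Writer under test: tries nodes in order and fails over on error.
final class FailoverWriter {
    private final FakeCluster cluster;

    FailoverWriter(FakeCluster cluster) {
        this.cluster = cluster;
    }

    // Returns the index of the node that accepted the record.
    int write(String record) {
        for (int i = 0; i < cluster.size(); i++) {
            try {
                cluster.node(i).write(record);
                return i;
            } catch (IllegalStateException e) {
                // Node is down; try the next one.
            }
        }
        throw new IllegalStateException("no live node accepted the record");
    }
}
```

A test in this style would write a record, call kill() on the node that accepted it, write again, and then check which node holds the second record. Simulating a slow node could follow the same pattern with an injected delay instead of an exception.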

Another related idea is to improve IO consistency. Today the different IOs
have small differences in their failure behavior, and I would really like to
be able to predict more precisely what will happen in case of errors. For
example, what is the correct behavior if I am writing to a Kafka node and
there is a network partition: does the Kafka sink retry or not? And what
about JdbcIO: will it behave the same way, e.g. assuming checkpointing? Or do
we guarantee exactly-once writes somehow? Today I am not sure what happens
(or whether the expected behavior depends on the runner), but maybe it is
just that I don’t know and we already have tests to ensure this.

Of course both are really hard problems, but I think with your proposal we
can try to tackle them, as well as the performance ones. And apart from the
data stores, I think it would also be really nice to be able to test the
runners in a distributed manner.

So what is the next step? How do you imagine such integration tests? Who
can provide the test machines so we can set up the cluster?

Maybe my ideas are a bit too far away for an initial setup, but it will be
really nice to start working on this.

Ismael


On Tue, Nov 22, 2016 at 11:00 AM, Amit Sela  wrote:


Hi Stephen,

I was wondering about how we plan to use the data stores across executions.

Clearly, it's best to set up a new instance (container) for every test,
running a "standalone" store (say HBase/Cassandra for example), and once
the test is done, tear down the instance. It should also be agnostic to the
runtime environment (e.g., Docker on Kubernetes).
I'm wondering though what's the overhead of managing such a deployment
which could become heavy and complicated as more IOs are supported and more
test cases introduced.

Another way to go would be to have small clusters of different data stores
and run against new "namespaces" (while lazily evicting old ones), but I
think this is less likely as maintaining a distributed instance (even a
small one) for each data store sounds even more complex.

A third approach would be to simply have an "embedded" in-memory
instance of a data store as part of a test that runs against it (such as an
embedded Kafka, though not a data store).
This is probably the simplest solution in terms of orchestration, but it
depends on having a proper "embedded" implementation for an IO.
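
A minimal Java sketch of the "embedded" approach, combined with the per-test setup and teardown mentioned earlier, might look like the following. InMemoryStore and EmbeddedStoreExample are hypothetical names used only for illustration; a real IO would need an embedded implementation provided by (or written for) the actual data store.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical embedded store: lives in the test JVM, no external service.
final class InMemoryStore implements AutoCloseable {
    private final Map<String, String> rows = new ConcurrentHashMap<>();
    private boolean closed;

    void put(String key, String value) {
        ensureOpen();
        rows.put(key, value);
    }

    String get(String key) {
        ensureOpen();
        return rows.get(key);
    }

    int size() {
        ensureOpen();
        return rows.size();
    }

    // Teardown: drops all data so nothing leaks into the next test.
    @Override
    public void close() {
        closed = true;
        rows.clear();
    }

    private void ensureOpen() {
        if (closed) {
            throw new IllegalStateException("store has been torn down");
        }
    }
}

final class EmbeddedStoreExample {
    private EmbeddedStoreExample() {}

    // One "test": a fresh store is set up, used, and torn down automatically.
    static int writeAndCount(String... keys) {
        try (InMemoryStore store = new InMemoryStore()) {
            for (String key : keys) {
                store.put(key, "value-" + key);
            }
            return store.size();
        }
    }
}
```

The try-with-resources block gives each test its own instance and guarantees teardown even when the test body throws, which is the main orchestration benefit of the embedded option. It does not exercise network or cluster behavior, so it complements rather than replaces the container-based setups.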

Does this make sense to you ? have you considered it ?

Thanks,
Amit

On Tue, Nov 22, 2016 at 8:20 AM Jean-Baptiste Onofré 
wrote:


Hi Stephen,

as already discussed a bit together, it sounds great! I especially like
it as both an integration test platform and a way to get good coverage for
IOs.

I'm very late on this but, as said, I will share with you my Marathon
JSON and Mesos docker images.

By the way, I started to experiment a bit with Kubernetes and Swarm but it's
not yet complete. I will share what I have on the same github repo.

Thanks !
Regards
JB

On 11/16/2016 11:36 

Re: Hosting data stores for IO Transform testing

2016-11-22 Thread Sourabh Bajaj
Hi,

One tangential question I had around the proposal was how we currently
deal with versioning in IO sources/sinks.

For example, Cassandra 1.2 and 2.1 have some differences between them, so the
checked-in sources and sinks probably support a particular version right
now. If so, follow-up questions would be how we handle updating,
deprecating, and documenting the supported versions.

I can move this to a new thread if this seems like a different discussion.
Also if this has already been answered please feel free to direct me to a
doc or past thread.

Thanks
Sourabh

On Tue, Nov 22, 2016 at 7:59 AM Ismaël Mejía  wrote:

> Hello,
>
> @Stephen Thanks for your proposal, it is really interesting, and I would
> really like to help with this. I have never played with Kubernetes, but this
> seems a really nice chance to do something useful with it.
>
> We (at Talend) are testing most of the IOs using simple container images
> and, in some particular cases, ‘clusters’ of containers using docker-compose
> (a little bit like Amit’s (2) proposal). It would be really nice to have
> this at the Beam level, in particular to try to test more complex
> semantics. I don’t know how programmable Kubernetes is for achieving this,
> for example:
>
> Let’s think we have a cluster of Cassandra or Kafka nodes, I would like to
> have programmatic tests to simulate failure (e.g. kill a node), or simulate
> a really slow node, to ensure that the IO behaves as expected in the Beam
> pipeline for the given runner.
>
> Another related idea is to improve IO consistency. Today the different IOs
> have small differences in their failure behavior, and I would really like to
> be able to predict more precisely what will happen in case of errors. For
> example, what is the correct behavior if I am writing to a Kafka node and
> there is a network partition: does the Kafka sink retry or not? And what
> about JdbcIO: will it behave the same way, e.g. assuming checkpointing? Or do
> we guarantee exactly-once writes somehow? Today I am not sure what happens
> (or whether the expected behavior depends on the runner), but maybe it is
> just that I don’t know and we already have tests to ensure this.
>
> Of course both are really hard problems, but I think with your proposal we
> can try to tackle them, as well as the performance ones. And apart from the
> data stores, I think it would also be really nice to be able to test the
> runners in a distributed manner.
>
> So what is the next step? How do you imagine such integration tests? Who
> can provide the test machines so we can set up the cluster?
>
> Maybe my ideas are a bit too far-reaching for an initial setup, but it would
> be really nice to start working on this.
>
> Ismael
>
>
> On Tue, Nov 22, 2016 at 11:00 AM, Amit Sela  wrote:
>
> > Hi Stephen,
> >
> > I was wondering about how we plan to use the data stores across
> > executions.
> >
> > Clearly, it's best to set up a new instance (container) for every test,
> > running a "standalone" store (say HBase/Cassandra for example), and once
> > the test is done, tear down the instance. It should also be agnostic to
> > the runtime environment (e.g., Docker on Kubernetes).
> > I'm wondering though what's the overhead of managing such a deployment
> > which could become heavy and complicated as more IOs are supported and
> > more test cases introduced.
> >
> > Another way to go would be to have small clusters of different data
> > stores and run against new "namespaces" (while lazily evicting old ones),
> > but I think this is less likely as maintaining a distributed instance
> > (even a small one) for each data store sounds even more complex.
> >
> > A third approach would be to simply have an "embedded" in-memory
> > instance of a data store as part of a test that runs against it (such as
> > an embedded Kafka, though not a data store).
> > This is probably the simplest solution in terms of orchestration, but it
> > depends on having a proper "embedded" implementation for an IO.
> >
> > Does this make sense to you? Have you considered it?
> >
> > Thanks,
> > Amit
> >
> > On Tue, Nov 22, 2016 at 8:20 AM Jean-Baptiste Onofré 
> > wrote:
> >
> > > Hi Stephen,
> > >
> > > as already discussed a bit together, it sounds great! I especially
> > > like it as both an integration test platform and good coverage for IOs.
> > >
> > > I'm very late on this but, as said, I will share with you my Marathon
> > > JSON and Mesos docker images.
> > >
> > > By the way, I started to experiment a bit with Kubernetes and Swarm but
> > > it's not yet complete. I will share what I have on the same github repo.
> > >
> > > Thanks!
> > > Regards
> > > JB
> > >
> > > On 11/16/2016 11:36 PM, Stephen Sisk wrote:
> > > > Hi everyone!
> > > >
> > > > Currently we have a good set of unit tests for our IO Transforms -
> > those
> > > > tend to run against in-memory versions of the data stores. However,
> > we'd
> > > > 

Re: [DISCUSS] Graduation to a top-level project

2016-11-22 Thread Thomas Weise
+1

I would like to mention the welcoming, growing community and the focus on
solid processes and testing.


On Tue, Nov 22, 2016 at 11:07 AM, Aljoscha Krettek 
wrote:

> +1
>
> I'm quite enthusiastic about the growth of the community and the open
> discussions!
>
> On Tue, 22 Nov 2016 at 19:51 Jason Kuster 
> wrote:
>
> > An enthusiastic +1!
> >
> > In particular it's been really great to see the commitment and interest
> of
> > the community in different kinds of testing. Between what we currently
> have
> > on Jenkins and Travis and the in-progress work on IO integration tests
> and
> > performance tests (plus, I'm sure, other things I'm not aware of) we're
> in
> > a really good place.
> >
> > On Tue, Nov 22, 2016 at 10:49 AM, Amit Sela 
> wrote:
> >
> > > +1, super exciting!
> > >
> > > Thanks to JB, Davor and the whole team for creating this community. I
> > think
> > > we've achieved a lot in a short time.
> > >
> > > Amit.
> > >
> > > On Tue, Nov 22, 2016, 20:36 Tyler Akidau 
> > > wrote:
> > >
> > > > +1, thanks to everyone who's invested time getting us to this point.
> > :-)
> > > >
> > > > -Tyler
> > > >
> > > > On Tue, Nov 22, 2016 at 10:33 AM Jean-Baptiste Onofré <
> j...@nanthrax.net
> > >
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > First of all, I would like to thank the whole team, and especially
> > > Davor
> > > > > for the great work and commitment to Apache and the community.
> > > > >
> > > > > Of course, a big +1 to move forward on graduation !
> > > > >
> > > > > Regards
> > > > > JB
> > > > >
> > > > > On 11/22/2016 07:19 PM, Davor Bonaci wrote:
> > > > > > Hi everyone,
> > > > > > With all the progress we’ve had recently in Apache Beam, I think
> it
> > > is
> > > > > time
> > > > > > we start the discussion about graduation as a new top-level
> project
> > > at
> > > > > the
> > > > > > Apache Software Foundation.
> > > > > >
> > > > > > Graduation means we are a self-sustaining and self-governing
> > > community,
> > > > > and
> > > > > > ready to be a full participant in the Apache Software Foundation.
> > It
> > > > does
> > > > > > not imply that our community growth is complete or that a
> > particular
> > > > > level
> > > > > > of technical maturity has been reached, rather that we are on a
> > solid
> > > > > > trajectory in those areas. After graduation, we will still
> > > periodically
> > > > > > report to, and be overseen by, the ASF Board to ensure continued
> > > growth
> > > > > of
> > > > > > a healthy community.
> > > > > >
> > > > > > Graduation is an important milestone for the project. It is also
> > key
> > > to
> > > > > > further grow the user community: many users (incorrectly) see
> > > > incubation
> > > > > as
> > > > > > a sign of instability and are much less likely to consider us
> for a
> > > > > > production use.
> > > > > >
> > > > > > A way to think about graduation readiness is through the Apache
> > > > Maturity
> > > > > > Model [1]. I think we clearly satisfy all the requirements [2].
> It
> > is
> > > > > > probably worth emphasizing the recent community growth: over each
> > of
> > > > the
> > > > > > past three months, no single organization contributing to Beam
> has
> > > had
> > > > > more
> > > > > > than ~50% of the unique contributors per month [2, see
> > assumptions].
> > > > > That’s
> > > > > > a great statistic that shows how much we’ve grown our diversity!
> > > > > >
> > > > > > Process-wise, graduation consists of drafting a board resolution,
> > > which
> > > > > > needs to identify the full Project Management Committee, and
> > getting
> > > it
> > > > > > approved by the community, the Incubator, and the Board. Within
> the
> > > > Beam
> > > > > > community, most of these discussions and votes have to be on the
> > > > private@
> > > > > > mailing list, but, as usual, we’ll try to keep dev@ updated as
> > much
> > > as
> > > > > > possible.
> > > > > >
> > > > > > With that in mind, let’s use this discussion on dev@ for two
> > things:
> > > > > > * Collect additional data points on our progress that we may want
> > to
> > > > > > present to the Incubator as a part of the proposal to accept our
> > > > > graduation.
> > > > > > * Determine whether the community supports graduation. Please
> reply
> > > > +1/-1
> > > > > > with any additional comments, as appropriate. I’d encourage
> > everyone
> > > to
> > > > > > participate -- regardless whether you are an occasional visitor
> or
> > > > have a
> > > > > > specific role in the project -- we’d love to hear your
> perspective.
> > > > > >
> > > > > > Data points so far:
> > > > > > * Project’s maturity self-assessment [2].
> > > > > > * 1500 pull requests in incubation, which makes us one of the
> most
> > > > active
> > > > > > project across all of ASF on this metric.
> > > > > > * 3 releases, each driven by a different release manager.
> > > > > > * 120+ 

Re: [DISCUSS] Graduation to a top-level project

2016-11-22 Thread Pei He
+1, very exciting and looking forward.
--
Pei

On Tue, Nov 22, 2016 at 11:07 AM, Aljoscha Krettek 
wrote:

> +1
>
> I'm quite enthusiastic about the growth of the community and the open
> discussions!
>
> On Tue, 22 Nov 2016 at 19:51 Jason Kuster 
> wrote:
>
> > An enthusiastic +1!
> >
> > In particular it's been really great to see the commitment and interest
> of
> > the community in different kinds of testing. Between what we currently
> have
> > on Jenkins and Travis and the in-progress work on IO integration tests
> and
> > performance tests (plus, I'm sure, other things I'm not aware of) we're
> in
> > a really good place.
> >
> > On Tue, Nov 22, 2016 at 10:49 AM, Amit Sela 
> wrote:
> >
> > > +1, super exciting!
> > >
> > > Thanks to JB, Davor and the whole team for creating this community. I
> > think
> > > we've achieved a lot in a short time.
> > >
> > > Amit.
> > >
> > > On Tue, Nov 22, 2016, 20:36 Tyler Akidau 
> > > wrote:
> > >
> > > > +1, thanks to everyone who's invested time getting us to this point.
> > :-)
> > > >
> > > > -Tyler
> > > >
> > > > On Tue, Nov 22, 2016 at 10:33 AM Jean-Baptiste Onofré <
> j...@nanthrax.net
> > >
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > First of all, I would like to thank the whole team, and especially
> > > Davor
> > > > > for the great work and commitment to Apache and the community.
> > > > >
> > > > > Of course, a big +1 to move forward on graduation !
> > > > >
> > > > > Regards
> > > > > JB
> > > > >
> > > > > On 11/22/2016 07:19 PM, Davor Bonaci wrote:
> > > > > > Hi everyone,
> > > > > > With all the progress we’ve had recently in Apache Beam, I think
> it
> > > is
> > > > > time
> > > > > > we start the discussion about graduation as a new top-level
> project
> > > at
> > > > > the
> > > > > > Apache Software Foundation.
> > > > > >
> > > > > > Graduation means we are a self-sustaining and self-governing
> > > community,
> > > > > and
> > > > > > ready to be a full participant in the Apache Software Foundation.
> > It
> > > > does
> > > > > > not imply that our community growth is complete or that a
> > particular
> > > > > level
> > > > > > of technical maturity has been reached, rather that we are on a
> > solid
> > > > > > trajectory in those areas. After graduation, we will still
> > > periodically
> > > > > > report to, and be overseen by, the ASF Board to ensure continued
> > > growth
> > > > > of
> > > > > > a healthy community.
> > > > > >
> > > > > > Graduation is an important milestone for the project. It is also
> > key
> > > to
> > > > > > further grow the user community: many users (incorrectly) see
> > > > incubation
> > > > > as
> > > > > > a sign of instability and are much less likely to consider us
> for a
> > > > > > production use.
> > > > > >
> > > > > > A way to think about graduation readiness is through the Apache
> > > > Maturity
> > > > > > Model [1]. I think we clearly satisfy all the requirements [2].
> It
> > is
> > > > > > probably worth emphasizing the recent community growth: over each
> > of
> > > > the
> > > > > > past three months, no single organization contributing to Beam
> has
> > > had
> > > > > more
> > > > > > than ~50% of the unique contributors per month [2, see
> > assumptions].
> > > > > That’s
> > > > > > a great statistic that shows how much we’ve grown our diversity!
> > > > > >
> > > > > > Process-wise, graduation consists of drafting a board resolution,
> > > which
> > > > > > needs to identify the full Project Management Committee, and
> > getting
> > > it
> > > > > > approved by the community, the Incubator, and the Board. Within
> the
> > > > Beam
> > > > > > community, most of these discussions and votes have to be on the
> > > > private@
> > > > > > mailing list, but, as usual, we’ll try to keep dev@ updated as
> > much
> > > as
> > > > > > possible.
> > > > > >
> > > > > > With that in mind, let’s use this discussion on dev@ for two
> > things:
> > > > > > * Collect additional data points on our progress that we may want
> > to
> > > > > > present to the Incubator as a part of the proposal to accept our
> > > > > graduation.
> > > > > > * Determine whether the community supports graduation. Please
> reply
> > > > +1/-1
> > > > > > with any additional comments, as appropriate. I’d encourage
> > everyone
> > > to
> > > > > > participate -- regardless whether you are an occasional visitor
> or
> > > > have a
> > > > > > specific role in the project -- we’d love to hear your
> perspective.
> > > > > >
> > > > > > Data points so far:
> > > > > > * Project’s maturity self-assessment [2].
> > > > > > * 1500 pull requests in incubation, which makes us one of the
> most
> > > > active
> > > > > > project across all of ASF on this metric.
> > > > > > * 3 releases, each driven by a different release manager.
> > > > > > * 120+ individual contributors.
> > > > > > * 3 new committers 

Re: [DISCUSS] Graduation to a top-level project

2016-11-22 Thread Aljoscha Krettek
+1

I'm quite enthusiastic about the growth of the community and the open
discussions!

On Tue, 22 Nov 2016 at 19:51 Jason Kuster 
wrote:

> An enthusiastic +1!
>
> In particular it's been really great to see the commitment and interest of
> the community in different kinds of testing. Between what we currently have
> on Jenkins and Travis and the in-progress work on IO integration tests and
> performance tests (plus, I'm sure, other things I'm not aware of) we're in
> a really good place.
>
> On Tue, Nov 22, 2016 at 10:49 AM, Amit Sela  wrote:
>
> > +1, super exciting!
> >
> > Thanks to JB, Davor and the whole team for creating this community. I
> think
> > we've achieved a lot in a short time.
> >
> > Amit.
> >
> > On Tue, Nov 22, 2016, 20:36 Tyler Akidau 
> > wrote:
> >
> > > +1, thanks to everyone who's invested time getting us to this point.
> :-)
> > >
> > > -Tyler
> > >
> > > On Tue, Nov 22, 2016 at 10:33 AM Jean-Baptiste Onofré  >
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > First of all, I would like to thank the whole team, and especially
> > Davor
> > > > for the great work and commitment to Apache and the community.
> > > >
> > > > Of course, a big +1 to move forward on graduation !
> > > >
> > > > Regards
> > > > JB
> > > >
> > > > On 11/22/2016 07:19 PM, Davor Bonaci wrote:
> > > > > Hi everyone,
> > > > > With all the progress we’ve had recently in Apache Beam, I think it
> > is
> > > > time
> > > > > we start the discussion about graduation as a new top-level project
> > at
> > > > the
> > > > > Apache Software Foundation.
> > > > >
> > > > > Graduation means we are a self-sustaining and self-governing
> > community,
> > > > and
> > > > > ready to be a full participant in the Apache Software Foundation.
> It
> > > does
> > > > > not imply that our community growth is complete or that a
> particular
> > > > level
> > > > > of technical maturity has been reached, rather that we are on a
> solid
> > > > > trajectory in those areas. After graduation, we will still
> > periodically
> > > > > report to, and be overseen by, the ASF Board to ensure continued
> > growth
> > > > of
> > > > > a healthy community.
> > > > >
> > > > > Graduation is an important milestone for the project. It is also
> key
> > to
> > > > > further grow the user community: many users (incorrectly) see
> > > incubation
> > > > as
> > > > > a sign of instability and are much less likely to consider us for a
> > > > > production use.
> > > > >
> > > > > A way to think about graduation readiness is through the Apache
> > > Maturity
> > > > > Model [1]. I think we clearly satisfy all the requirements [2]. It
> is
> > > > > probably worth emphasizing the recent community growth: over each
> of
> > > the
> > > > > past three months, no single organization contributing to Beam has
> > had
> > > > more
> > > > > than ~50% of the unique contributors per month [2, see
> assumptions].
> > > > That’s
> > > > > a great statistic that shows how much we’ve grown our diversity!
> > > > >
> > > > > Process-wise, graduation consists of drafting a board resolution,
> > which
> > > > > needs to identify the full Project Management Committee, and
> getting
> > it
> > > > > approved by the community, the Incubator, and the Board. Within the
> > > Beam
> > > > > community, most of these discussions and votes have to be on the
> > > private@
> > > > > mailing list, but, as usual, we’ll try to keep dev@ updated as
> much
> > as
> > > > > possible.
> > > > >
> > > > > With that in mind, let’s use this discussion on dev@ for two
> things:
> > > > > * Collect additional data points on our progress that we may want
> to
> > > > > present to the Incubator as a part of the proposal to accept our
> > > > graduation.
> > > > > * Determine whether the community supports graduation. Please reply
> > > +1/-1
> > > > > with any additional comments, as appropriate. I’d encourage
> everyone
> > to
> > > > > participate -- regardless whether you are an occasional visitor or
> > > have a
> > > > > specific role in the project -- we’d love to hear your perspective.
> > > > >
> > > > > Data points so far:
> > > > > * Project’s maturity self-assessment [2].
> > > > > * 1500 pull requests in incubation, which makes us one of the most
> > > active
> > > > > project across all of ASF on this metric.
> > > > > * 3 releases, each driven by a different release manager.
> > > > > * 120+ individual contributors.
> > > > > * 3 new committers added, 2 of which aren’t from the largest
> > > > organization.
> > > > > * 1027 issues created, 515 resolved.
> > > > > * 442 dev@ emails in October alone, sent by 51 individuals.
> > > > > * 50 user@ emails in the last 30 days, sent by 22 individuals.
> > > > >
> > > > > Thanks!
> > > > >
> > > > > Davor
> > > > >
> > > > > > [1] http://community.apache.org/apache-way/apache-project-maturity-model.html
> > > > > [2] 

Re: [DISCUSS] Graduation to a top-level project

2016-11-22 Thread Amit Sela
+1, super exciting!

Thanks to JB, Davor and the whole team for creating this community. I think
we've achieved a lot in a short time.

Amit.

On Tue, Nov 22, 2016, 20:36 Tyler Akidau  wrote:

> +1, thanks to everyone who's invested time getting us to this point. :-)
>
> -Tyler
>
> On Tue, Nov 22, 2016 at 10:33 AM Jean-Baptiste Onofré 
> wrote:
>
> > Hi,
> >
> > First of all, I would like to thank the whole team, and especially Davor
> > for the great work and commitment to Apache and the community.
> >
> > Of course, a big +1 to move forward on graduation !
> >
> > Regards
> > JB
> >
> > On 11/22/2016 07:19 PM, Davor Bonaci wrote:
> > > Hi everyone,
> > > With all the progress we’ve had recently in Apache Beam, I think it is
> > time
> > > we start the discussion about graduation as a new top-level project at
> > the
> > > Apache Software Foundation.
> > >
> > > Graduation means we are a self-sustaining and self-governing community,
> > and
> > > ready to be a full participant in the Apache Software Foundation. It
> does
> > > not imply that our community growth is complete or that a particular
> > level
> > > of technical maturity has been reached, rather that we are on a solid
> > > trajectory in those areas. After graduation, we will still periodically
> > > report to, and be overseen by, the ASF Board to ensure continued growth
> > of
> > > a healthy community.
> > >
> > > Graduation is an important milestone for the project. It is also key to
> > > further grow the user community: many users (incorrectly) see
> incubation
> > as
> > > a sign of instability and are much less likely to consider us for a
> > > production use.
> > >
> > > A way to think about graduation readiness is through the Apache
> Maturity
> > > Model [1]. I think we clearly satisfy all the requirements [2]. It is
> > > probably worth emphasizing the recent community growth: over each of
> the
> > > past three months, no single organization contributing to Beam has had
> > more
> > > than ~50% of the unique contributors per month [2, see assumptions].
> > That’s
> > > a great statistic that shows how much we’ve grown our diversity!
> > >
> > > Process-wise, graduation consists of drafting a board resolution, which
> > > needs to identify the full Project Management Committee, and getting it
> > > approved by the community, the Incubator, and the Board. Within the
> Beam
> > > community, most of these discussions and votes have to be on the
> private@
> > > mailing list, but, as usual, we’ll try to keep dev@ updated as much as
> > > possible.
> > >
> > > With that in mind, let’s use this discussion on dev@ for two things:
> > > * Collect additional data points on our progress that we may want to
> > > present to the Incubator as a part of the proposal to accept our
> > graduation.
> > > * Determine whether the community supports graduation. Please reply
> +1/-1
> > > with any additional comments, as appropriate. I’d encourage everyone to
> > > participate -- regardless whether you are an occasional visitor or
> have a
> > > specific role in the project -- we’d love to hear your perspective.
> > >
> > > Data points so far:
> > > * Project’s maturity self-assessment [2].
> > > * 1500 pull requests in incubation, which makes us one of the most
> active
> > > project across all of ASF on this metric.
> > > * 3 releases, each driven by a different release manager.
> > > * 120+ individual contributors.
> > > * 3 new committers added, 2 of which aren’t from the largest
> > organization.
> > > * 1027 issues created, 515 resolved.
> > > * 442 dev@ emails in October alone, sent by 51 individuals.
> > > * 50 user@ emails in the last 30 days, sent by 22 individuals.
> > >
> > > Thanks!
> > >
> > > Davor
> > >
> > > > [1] http://community.apache.org/apache-way/apache-project-maturity-model.html
> > > [2] http://beam.incubator.apache.org/contribute/maturity-model/
> > >
> >
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
>


Re: [DISCUSS] Graduation to a top-level project

2016-11-22 Thread Tyler Akidau
+1, thanks to everyone who's invested time getting us to this point. :-)

-Tyler

On Tue, Nov 22, 2016 at 10:33 AM Jean-Baptiste Onofré 
wrote:

> Hi,
>
> First of all, I would like to thank the whole team, and especially Davor
> for the great work and commitment to Apache and the community.
>
> Of course, a big +1 to move forward on graduation !
>
> Regards
> JB
>
> On 11/22/2016 07:19 PM, Davor Bonaci wrote:
> > Hi everyone,
> > With all the progress we’ve had recently in Apache Beam, I think it is
> time
> > we start the discussion about graduation as a new top-level project at
> the
> > Apache Software Foundation.
> >
> > Graduation means we are a self-sustaining and self-governing community,
> and
> > ready to be a full participant in the Apache Software Foundation. It does
> > not imply that our community growth is complete or that a particular
> level
> > of technical maturity has been reached, rather that we are on a solid
> > trajectory in those areas. After graduation, we will still periodically
> > report to, and be overseen by, the ASF Board to ensure continued growth
> of
> > a healthy community.
> >
> > Graduation is an important milestone for the project. It is also key to
> > further grow the user community: many users (incorrectly) see incubation
> as
> > a sign of instability and are much less likely to consider us for a
> > production use.
> >
> > A way to think about graduation readiness is through the Apache Maturity
> > Model [1]. I think we clearly satisfy all the requirements [2]. It is
> > probably worth emphasizing the recent community growth: over each of the
> > past three months, no single organization contributing to Beam has had
> more
> > than ~50% of the unique contributors per month [2, see assumptions].
> That’s
> > a great statistic that shows how much we’ve grown our diversity!
> >
> > Process-wise, graduation consists of drafting a board resolution, which
> > needs to identify the full Project Management Committee, and getting it
> > approved by the community, the Incubator, and the Board. Within the Beam
> > community, most of these discussions and votes have to be on the private@
> > mailing list, but, as usual, we’ll try to keep dev@ updated as much as
> > possible.
> >
> > With that in mind, let’s use this discussion on dev@ for two things:
> > * Collect additional data points on our progress that we may want to
> > present to the Incubator as a part of the proposal to accept our
> graduation.
> > * Determine whether the community supports graduation. Please reply +1/-1
> > with any additional comments, as appropriate. I’d encourage everyone to
> > participate -- regardless whether you are an occasional visitor or have a
> > specific role in the project -- we’d love to hear your perspective.
> >
> > Data points so far:
> > * Project’s maturity self-assessment [2].
> > * 1500 pull requests in incubation, which makes us one of the most active
> > project across all of ASF on this metric.
> > * 3 releases, each driven by a different release manager.
> > * 120+ individual contributors.
> > * 3 new committers added, 2 of which aren’t from the largest
> organization.
> > * 1027 issues created, 515 resolved.
> > * 442 dev@ emails in October alone, sent by 51 individuals.
> > * 50 user@ emails in the last 30 days, sent by 22 individuals.
> >
> > Thanks!
> >
> > Davor
> >
> > [1] http://community.apache.org/apache-way/apache-project-maturity-model.html
> > [2] http://beam.incubator.apache.org/contribute/maturity-model/
> >
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>


Re: [DISCUSS] Graduation to a top-level project

2016-11-22 Thread Jean-Baptiste Onofré

Hi,

First of all, I would like to thank the whole team, and especially Davor 
for the great work and commitment to Apache and the community.


Of course, a big +1 to move forward on graduation !

Regards
JB

On 11/22/2016 07:19 PM, Davor Bonaci wrote:

Hi everyone,
With all the progress we’ve had recently in Apache Beam, I think it is time
we start the discussion about graduation as a new top-level project at the
Apache Software Foundation.

Graduation means we are a self-sustaining and self-governing community, and
ready to be a full participant in the Apache Software Foundation. It does
not imply that our community growth is complete or that a particular level
of technical maturity has been reached, rather that we are on a solid
trajectory in those areas. After graduation, we will still periodically
report to, and be overseen by, the ASF Board to ensure continued growth of
a healthy community.

Graduation is an important milestone for the project. It is also key to
further growing the user community: many users (incorrectly) see incubation
as a sign of instability and are much less likely to consider us for
production use.

A way to think about graduation readiness is through the Apache Maturity
Model [1]. I think we clearly satisfy all the requirements [2]. It is
probably worth emphasizing the recent community growth: over each of the
past three months, no single organization contributing to Beam has had more
than ~50% of the unique contributors per month [2, see assumptions]. That’s
a great statistic that shows how much we’ve grown our diversity!

Process-wise, graduation consists of drafting a board resolution, which
needs to identify the full Project Management Committee, and getting it
approved by the community, the Incubator, and the Board. Within the Beam
community, most of these discussions and votes have to be on the private@
mailing list, but, as usual, we’ll try to keep dev@ updated as much as
possible.

With that in mind, let’s use this discussion on dev@ for two things:
* Collect additional data points on our progress that we may want to
present to the Incubator as a part of the proposal to accept our graduation.
* Determine whether the community supports graduation. Please reply +1/-1
with any additional comments, as appropriate. I’d encourage everyone to
participate -- regardless of whether you are an occasional visitor or have
a specific role in the project -- we’d love to hear your perspective.

Data points so far:
* Project’s maturity self-assessment [2].
* 1500 pull requests in incubation, which makes us one of the most active
projects across all of ASF on this metric.
* 3 releases, each driven by a different release manager.
* 120+ individual contributors.
* 3 new committers added, 2 of which aren’t from the largest organization.
* 1027 issues created, 515 resolved.
* 442 dev@ emails in October alone, sent by 51 individuals.
* 50 user@ emails in the last 30 days, sent by 22 individuals.

Thanks!

Davor

[1] http://community.apache.org/apache-way/apache-project-maturity-model.html
[2] http://beam.incubator.apache.org/contribute/maturity-model/



--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


[DISCUSS] Graduation to a top-level project

2016-11-22 Thread Davor Bonaci
Hi everyone,
With all the progress we’ve had recently in Apache Beam, I think it is time
we start the discussion about graduation as a new top-level project at the
Apache Software Foundation.

Graduation means we are a self-sustaining and self-governing community, and
ready to be a full participant in the Apache Software Foundation. It does
not imply that our community growth is complete or that a particular level
of technical maturity has been reached, rather that we are on a solid
trajectory in those areas. After graduation, we will still periodically
report to, and be overseen by, the ASF Board to ensure continued growth of
a healthy community.

Graduation is an important milestone for the project. It is also key to
further growing the user community: many users (incorrectly) see incubation
as a sign of instability and are much less likely to consider us for
production use.

A way to think about graduation readiness is through the Apache Maturity
Model [1]. I think we clearly satisfy all the requirements [2]. It is
probably worth emphasizing the recent community growth: over each of the
past three months, no single organization contributing to Beam has had more
than ~50% of the unique contributors per month [2, see assumptions]. That’s
a great statistic that shows how much we’ve grown our diversity!

Process-wise, graduation consists of drafting a board resolution, which
needs to identify the full Project Management Committee, and getting it
approved by the community, the Incubator, and the Board. Within the Beam
community, most of these discussions and votes have to be on the private@
mailing list, but, as usual, we’ll try to keep dev@ updated as much as
possible.

With that in mind, let’s use this discussion on dev@ for two things:
* Collect additional data points on our progress that we may want to
present to the Incubator as a part of the proposal to accept our graduation.
* Determine whether the community supports graduation. Please reply +1/-1
with any additional comments, as appropriate. I’d encourage everyone to
participate -- regardless of whether you are an occasional visitor or have
a specific role in the project -- we’d love to hear your perspective.

Data points so far:
* Project’s maturity self-assessment [2].
* 1500 pull requests in incubation, which makes us one of the most active
projects across all of ASF on this metric.
* 3 releases, each driven by a different release manager.
* 120+ individual contributors.
* 3 new committers added, 2 of which aren’t from the largest organization.
* 1027 issues created, 515 resolved.
* 442 dev@ emails in October alone, sent by 51 individuals.
* 50 user@ emails in the last 30 days, sent by 22 individuals.

Thanks!

Davor

[1] http://community.apache.org/apache-way/apache-project-maturity-model.html
[2] http://beam.incubator.apache.org/contribute/maturity-model/


Re: Hosting data stores for IO Transform testing

2016-11-22 Thread Ismaël Mejía
Hello,

@Stephen Thanks for your proposal, it is really interesting, I would really
like to help with this. I have never played with Kubernetes but this seems
a really nice chance to do something useful with it.

We (at Talend) are testing most of the IOs using simple container images
and, in some particular cases, ‘clusters’ of containers using docker-compose
(a little bit like Amit’s (2) proposal). It would be really nice to have
this at the Beam level, in particular to test more complex semantics. I
don’t know how programmable Kubernetes is when it comes to achieving this,
for example:

Suppose we have a cluster of Cassandra or Kafka nodes. I would like to
have programmatic tests that simulate a failure (e.g. kill a node) or
a really slow node, to ensure that the IO behaves as expected in the Beam
pipeline for the given runner.
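
To make the idea concrete, here is a minimal in-process sketch of such a
fault-injection test. It is only an illustration under stated assumptions:
FakeNode, FakeCluster and write_with_failover are hypothetical names, nothing
here talks to a real Cassandra or Kafka cluster, and the failover policy is
an assumed one, not the documented behavior of any Beam IO.

```python
class FakeNode:
    """Hypothetical stand-in for one node of a data store cluster."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.rows = []

    def write(self, row):
        # A dead node refuses every write.
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        self.rows.append(row)


class FakeCluster:
    """Hypothetical cluster whose nodes can be killed from a test."""

    def __init__(self, names):
        self.nodes = [FakeNode(n) for n in names]

    def kill(self, name):
        # Simulate a node failure.
        for node in self.nodes:
            if node.name == name:
                node.alive = False


def write_with_failover(cluster, row):
    """Try each node in order; raise only if every node refuses the write."""
    last_error = None
    for node in cluster.nodes:
        try:
            node.write(row)
            return node.name
        except ConnectionError as err:
            last_error = err
    raise RuntimeError("no live node accepted the write") from last_error
```

A test would write once, kill the node that took the write, and then check
that the next write lands on a surviving node. With real containers the
kill step would be an orchestration call instead of a flag flip.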

Another related idea is to improve IO consistency. Today the different IOs
have small differences in their failure behavior, and I would really like to
be able to predict more precisely what will happen in case of errors. For
example, what is the correct behavior if I am writing to a Kafka node and
there is a network partition: does the Kafka sink retry or not? And what
about JdbcIO: will it behave the same way, e.g. assuming checkpointing? Or
do we somehow guarantee exactly-once writes? Today I am not sure what
happens (or whether the expected behavior depends on the runner), but maybe
it is just that I don’t know and we already have tests that ensure this.

Of course both are really hard problems, but I think with your proposal we
can try to tackle them, as well as the performance ones. And apart from the
data stores, I think it would also be really nice to be able to test the
runners in a distributed manner.

So what is the next step? How do you imagine such integration tests? Who
can provide the test machines so we can set up the cluster?

Maybe my ideas are a bit too far-reaching for an initial setup, but it would
be really nice to start working on this.

Ismael


On Tue, Nov 22, 2016 at 11:00 AM, Amit Sela  wrote:

> Hi Stephen,
>
> I was wondering about how we plan to use the data stores across executions.
>
> Clearly, it's best to set up a new instance (container) for every test,
> running a "standalone" store (say HBase/Cassandra for example), and once
> the test is done, tear down the instance. It should also be agnostic to the
> runtime environment (e.g., Docker on Kubernetes).
> I'm wondering though what's the overhead of managing such a deployment
> which could become heavy and complicated as more IOs are supported and more
> test cases introduced.
>
> Another way to go would be to have small clusters of different data stores
> and run against new "namespaces" (while lazily evicting old ones), but I
> think this is less likely as maintaining a distributed instance (even a
> small one) for each data store sounds even more complex.
>
> A third approach would be to simply have an "embedded" in-memory
> instance of a data store as part of a test that runs against it (such as an
> embedded Kafka, though not a data store).
> This is probably the simplest solution in terms of orchestration, but it
> depends on having a proper "embedded" implementation for an IO.
>
> Does this make sense to you? Have you considered it?
>
> Thanks,
> Amit
>
> On Tue, Nov 22, 2016 at 8:20 AM Jean-Baptiste Onofré 
> wrote:
>
> > Hi Stephen,
> >
> > as already discussed a bit together, it sounds great! I especially like
> > it as both an integration test platform and a way to get good coverage
> > for IOs.
> >
> > I'm very late on this but, as said, I will share with you my Marathon
> > JSON and Mesos docker images.
> >
> > By the way, I started to experiment a bit with Kubernetes and Swarm, but
> > it's not complete yet. I will share what I have on the same GitHub repo.
> >
> > Thanks !
> > Regards
> > JB
> >
> > On 11/16/2016 11:36 PM, Stephen Sisk wrote:
> > > Hi everyone!
> > >
> > > Currently we have a good set of unit tests for our IO Transforms -
> > > those tend to run against in-memory versions of the data stores.
> > > However, we'd like to further increase our test coverage to include
> > > running them against real instances of the data stores that the IO
> > > Transforms work against (e.g. cassandra, mongodb, kafka, etc…), which
> > > means we'll need to have real instances of various data stores.
> > >
> > > Additionally, if we want to do performance regression detection, it's
> > > important to have instances of the services that behave realistically,
> > > which isn't true of in-memory or dev versions of the services.
> > >
> > >
> > > Proposed solution
> > > -
> > > If we accept this proposal, we would create an infrastructure for
> > > running real instances of data stores inside of containers, using
> > > container management software like mesos/marathon, kubernetes, docker
> > > swarm, etc… to manage the instances.
> > >
> > 

Re: DoFn relying on Microservices

2016-11-22 Thread Jean-Baptiste Onofré

Hi Sergio,

A DoFn executes per element (optionally with hooks on StartBundle,
FinishBundle, and Teardown). That is basically how it works in an IO WriteFn:
we create the connection in StartBundle and send each element (in
batches) to the external resource.
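
As a rough illustration of that lifecycle (connect in StartBundle, buffer
elements, flush the remainder in FinishBundle), here is a minimal sketch in
plain Python. It mimics the hooks by hand and does not use the Beam SDK:
FakeServiceClient, BatchingWriteFn and run_bundle are hypothetical names, and
closing the client at the end of the bundle is an assumption of this sketch.

```python
class FakeServiceClient:
    """Hypothetical stand-in for a client of an external service."""

    def __init__(self):
        self.received = []  # batches the "service" has received so far
        self.closed = False

    def send(self, batch):
        # Copy the batch so later mutation by the caller cannot alter it.
        self.received.append(list(batch))

    def close(self):
        self.closed = True


class BatchingWriteFn:
    """Mimics the DoFn lifecycle hooks by hand; not a real Beam class."""

    def __init__(self, client_factory, batch_size=2):
        self.client_factory = client_factory
        self.batch_size = batch_size
        self.client = None
        self.batch = []

    def start_bundle(self):
        # Open the connection once per bundle, not once per element.
        self.client = self.client_factory()
        self.batch = []

    def process(self, element):
        # Buffer the element and flush when the batch is full.
        self.batch.append(element)
        if len(self.batch) >= self.batch_size:
            self._flush()

    def finish_bundle(self):
        # Send whatever is left over, then release the connection.
        self._flush()
        self.client.close()

    def _flush(self):
        if self.batch:
            self.client.send(self.batch)
            self.batch = []


def run_bundle(fn, elements):
    """Drive the hooks in the order a runner would for a single bundle."""
    fn.start_bundle()
    for element in elements:
        fn.process(element)
    fn.finish_bundle()
```

Calling an external microservice from inside a DoFn would follow the same
shape, with the fake client replaced by a real HTTP or RPC client.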


A PTransform is maybe more flexible when it comes to interacting with
"outside" resources.


Do you have a use case, so I can be sure I understand?

Thanks !
Regards
JB

On 11/22/2016 10:39 AM, Sergio Fernández wrote:

Hi,

I'd like to resume the idea of having TensorFlow-based tasks running in a
Beam Pipeline. So far the cleanest approach I can imagine would be to have
them running outside (Functions in GCP, Lambdas in AWS, microservices
generally speaking).

Therefore, does the current Beam model provide the notion of a DoFn that
actually runs externally?

Thanks in advance for the feedback.

Cheers,



--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com