Re: Community hackathon

2017-04-24 Thread Davor Bonaci
Thanks everyone for the enthusiasm!

Let's go with this Wednesday, 4/26, starting at 10 AM Pacific time, and
running for the following 24 hours. I'll try to seed the
instructions/starting point, and then let's take it from there.
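
For participants in other time zones, the window above can be converted with a small sketch like the following. It assumes "Pacific time" maps to the America/Los_Angeles zone; the class name, method names, and the list of sample zones are illustrative only and not part of the announcement.

```java
import java.time.Duration;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

// Illustrative helper (hypothetical name): converts the announced hackathon
// window into other time zones.
class HackathonWindow {
    // Announced start: Wednesday 2017-04-26, 10 AM Pacific.
    // Assumption: "Pacific time" means the America/Los_Angeles zone.
    public static ZonedDateTime start() {
        return ZonedDateTime.of(2017, 4, 26, 10, 0, 0, 0, ZoneId.of("America/Los_Angeles"));
    }

    // The event runs for the following 24 hours.
    public static ZonedDateTime end() {
        return start().plus(Duration.ofHours(24));
    }

    // Formats the same instant as local wall-clock time in the given zone.
    public static String inZone(ZonedDateTime t, String zoneId) {
        return t.withZoneSameInstant(ZoneId.of(zoneId))
                .format(DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm"));
    }

    public static void main(String[] args) {
        // Sample zones chosen only for illustration.
        String[] zones = {"America/Los_Angeles", "UTC", "Europe/Paris", "Asia/Shanghai"};
        for (String zone : zones) {
            System.out.println(zone + ": " + inZone(start(), zone) + " to " + inZone(end(), zone));
        }
    }
}
```

Using withZoneSameInstant keeps the instant fixed and only changes the wall-clock rendering, so daylight saving offsets are handled by the zone rules rather than by hand.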

(Michael, invite sent.)

Davor

On Mon, Apr 24, 2017 at 7:47 PM, Michael Huston 
wrote:

> Could you please add me to the Slack channel also? My apologies for the
> noise on this mailing list and if there is a better way to request access.
>
> Cheers,
> Michael
>
> On Mon, Apr 24, 2017 at 6:15 PM, Lukasz Cwik 
> wrote:
>
> > Dylan, sent you invite to slack channel.
> >
> > On Mon, Apr 24, 2017 at 5:18 PM, Dylan Raithel 
> > wrote:
> >
> > > Can you please add me to the Slack channel?
> > >
> > > On Apr 24, 2017 12:51 AM, "Jean-Baptiste Onofré" 
> > wrote:
> > >
> > > > That's a wonderful idea !
> > > >
> > > > I think the easiest way to organize this event is using the Slack
> > > channels
> > > > to discuss, help each other, and sync together.
> > > >
> > > > Regards
> > > > JB
> > > >
> > > > On 04/24/2017 09:48 AM, Davor Bonaci wrote:
> > > >
> > > >> We've been working as a community towards the first stable release
> > for a
> > > >> while now, and I think we made a ton of progress across the board
> over
> > > the
> > > >> last few weeks.
> > > >>
> > > >> We could try to organize a community-wide hackathon to identify and
> > fix
> > > >> those last few issues, as well as to get a better sense of the
> overall
> > > >> project quality as it stands right now.
> > > >>
> > > >> This could be a self-organized event, and coordinated via the Slack
> > > >> channel. For example, we (as a community and participants) can try
> out
> > > the
> > > >> project in various ways -- quickstart, examples, different runners,
> > > >> different platforms -- immediately fixing issues as we run into
> them.
> > It
> > > >> could last, say, 24 hours, with people from different time zones
> > > >> participating at the time of their choosing.
> > > >>
> > > >> Thoughts?
> > > >>
> > > >> Davor
> > > >>
> > > >>
> > > > --
> > > > Jean-Baptiste Onofré
> > > > jbono...@apache.org
> > > > http://blog.nanthrax.net
> > > > Talend - http://www.talend.com
> > > >
> > >
> >
>


Re: Help needed with WordCount commands

2017-04-24 Thread Frances Perry
This is a great way for folks to contribute as part of the community
hackathon.

I love the way these walkthroughs have instructions for all runners --
really helps drive home the portability message ;-)


On Fri, Apr 21, 2017 at 4:13 PM, Hadar Hod 
wrote:

> Hi everyone!
>
> Your help is needed with the WordCount [1] documentation!
>
> I'm a technical writer, continuing to work on Beam docs. In getting the
> site ready for the 2.0 release, I added a “How to run” section to each of
> the examples in the WordCount doc (staged here [2]). To be more exact, I
> added a Java section and Python section for each example (MinimalWC, WC,
> etc). Each section like this is broken down into the different runner
> options so that users can toggle to the one they need.
>
> *However(!)* the actual commands I added in the doc are guesses and have
> not been tested for correctness. I’d like to ask for your help in finishing
> up this page. If you know the correct directions for any of the runners /
> examples / SDKs, please leave me a comment in PR 222 [3] that I have out.
>
> Your help is greatly appreciated!
>
> Thank you,
> Hadar
>
> [1] https://beam.apache.org/get-started/wordcount-example/
> [2] http://apache-beam-website-pull-requests.storage.googleapis.com/222/get-started/wordcount-example/index.html
> [3] https://github.com/apache/beam-site/pull/222
>


Re: [PROPOSAL] Apache Hive connector - HiveIO

2017-04-24 Thread Jean-Baptiste Onofré

Great news !

I'm ready to review and help !

Thanks !
Regards
JB

On 04/24/2017 11:29 PM, Madhusudan Borkar wrote:

Hi all,

In response to BEAM-1158, our team has worked on HiveIO. We are ready with
code including unit testing. Please let us know the next steps.
Since our team is new to Beam, your help and suggestions will be
appreciated.

Madhu Borkar


[1]
https://docs.google.com/a/etouch.net/document/d/161_7y0A3kgLxmp6-TSNUVZgFgzCH1knwSWtNDwSQj7w/edit?usp=sharing

[2] https://issues.apache.org/jira/browse/BEAM-1158




--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: Community hackathon

2017-04-24 Thread Michael Huston
Could you please add me to the Slack channel also? My apologies for the
noise on this mailing list and if there is a better way to request access.

Cheers,
Michael

On Mon, Apr 24, 2017 at 6:15 PM, Lukasz Cwik 
wrote:

> Dylan, sent you invite to slack channel.
>
> On Mon, Apr 24, 2017 at 5:18 PM, Dylan Raithel 
> wrote:
>
> > Can you please add me to the Slack channel?
> >
> > On Apr 24, 2017 12:51 AM, "Jean-Baptiste Onofré" 
> wrote:
> >
> > > That's a wonderful idea !
> > >
> > > I think the easiest way to organize this event is using the Slack
> > channels
> > > to discuss, help each other, and sync together.
> > >
> > > Regards
> > > JB
> > >
> > > On 04/24/2017 09:48 AM, Davor Bonaci wrote:
> > >
> > >> We've been working as a community towards the first stable release
> for a
> > >> while now, and I think we made a ton of progress across the board over
> > the
> > >> last few weeks.
> > >>
> > >> We could try to organize a community-wide hackathon to identify and
> fix
> > >> those last few issues, as well as to get a better sense of the overall
> > >> project quality as it stands right now.
> > >>
> > >> This could be a self-organized event, and coordinated via the Slack
> > >> channel. For example, we (as a community and participants) can try out
> > the
> > >> project in various ways -- quickstart, examples, different runners,
> > >> different platforms -- immediately fixing issues as we run into them.
> It
> > >> could last, say, 24 hours, with people from different time zones
> > >> participating at the time of their choosing.
> > >>
> > >> Thoughts?
> > >>
> > >> Davor
> > >>
> > >>
> > > --
> > > Jean-Baptiste Onofré
> > > jbono...@apache.org
> > > http://blog.nanthrax.net
> > > Talend - http://www.talend.com
> > >
> >
>


Re: Community hackathon

2017-04-24 Thread Lukasz Cwik
Dylan, sent you invite to slack channel.

On Mon, Apr 24, 2017 at 5:18 PM, Dylan Raithel 
wrote:

> Can you please add me to the Slack channel?
>
> On Apr 24, 2017 12:51 AM, "Jean-Baptiste Onofré"  wrote:
>
> > That's a wonderful idea !
> >
> > I think the easiest way to organize this event is using the Slack
> channels
> > to discuss, help each other, and sync together.
> >
> > Regards
> > JB
> >
> > On 04/24/2017 09:48 AM, Davor Bonaci wrote:
> >
> >> We've been working as a community towards the first stable release for a
> >> while now, and I think we made a ton of progress across the board over
> the
> >> last few weeks.
> >>
> >> We could try to organize a community-wide hackathon to identify and fix
> >> those last few issues, as well as to get a better sense of the overall
> >> project quality as it stands right now.
> >>
> >> This could be a self-organized event, and coordinated via the Slack
> >> channel. For example, we (as a community and participants) can try out
> the
> >> project in various ways -- quickstart, examples, different runners,
> >> different platforms -- immediately fixing issues as we run into them. It
> >> could last, say, 24 hours, with people from different time zones
> >> participating at the time of their choosing.
> >>
> >> Thoughts?
> >>
> >> Davor
> >>
> >>
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
>


Re: [DISCUSSION] Encouraging more contributions

2017-04-24 Thread Stephen Sisk
general +1 to the concept, including driving down
assigned-but-not-actually-being-worked-on items.

I also really like the idea of having a mentor on tickets.

Etienne, re: specific help for I/Os - are the I/O Authoring docs not a good
answer? https://beam.apache.org/documentation/io/io-toc/ (or perhaps we
need to update them somehow)

S

On Mon, Apr 24, 2017 at 5:45 PM Sourabh Bajaj
 wrote:

> For 6. I think having them in one page on the website where we can find the
> design docs more easily would be great.
>
> 7. For low-hanging-fruit, one thing I really liked from some Mozilla
> projects was assigning a mentor on the ticket. Someone you can reach out to
> if you have questions. I think this makes the entry barrier really low for
> first time contributors who might feel intimidated asking questions
> completely in public.
>
> On Mon, Apr 24, 2017 at 10:06 AM Kenneth Knowles 
> wrote:
>
> > I like the subject Etienne has brought up, and will give it a number in
> > this list :-)
> >
> > 6. Have more technical reference docs (not just workspace set up) for
> > contributors.
> >
> > I think this overlaps a lot with a prior discussion about where to
> collect
> > design proposals [1]. Design docs used to be just dropped into a public
> > folder, but that got disorganized. And that thread was about work in
> > progress, so JIRA was a good place for details after a dev@ thread
> agrees
> > on a proposal. At this point, the designs are pretty solid conceptually
> or
> > even implemented and we could start to build out deeper technical bits on
> > the web site, or at least some place that people can find it. We do have
> > the Testing Guide and the PTransform Style Guide and somewhere near there
> > we could have deeper references. I think we need a broader vision for the
> > "table of contents" here.
> >
> > For my docs (triggers, lateness, runner API, side inputs, state, coders)
> I
> > haven't had time, but I do intend to both translate from GDoc to some
> other
> > format and also rewrite versions for users where appropriate. Probably
> this
> > will mean coming up with that table of contents.
> >
> > Kenn
> >
> > [1]
> >
> >
> https://lists.apache.org/thread.html/%3c6bc60c88-cf91-4fff-eae6-fea6ee06f...@nanthrax.net%3E
> >
> >
> > On Mon, Apr 24, 2017 at 9:33 AM, Neelesh Salian <
> neeleshssal...@gmail.com>
> > wrote:
> >
> > > Agreed. I have some old JIRAs that I am cleaning up.
> > >
> > > Thank you for bringing this up.
> > >
> > > On Mon, Apr 24, 2017 at 9:29 AM, Jean-Baptiste Onofré  >
> > > wrote:
> > >
> > > > Same also for Slack, github comments, etc.
> > > >
> > > > > From an Apache perspective, it should happen on the mailing list,
> > > > eventually referencing a central wiki/faq/whatever.
> > > >
> > > > Regards
> > > > JB
> > > >
> > > >
> > > > On 04/24/2017 06:23 PM, Mingmin Xu wrote:
> > > >
> > > > >> Many design documents are scattered across the mailing list and JIRA
> > > > >> comments; it would be a big help to put them in a centralized list.
> > > > >> Also, I would expect more wiki/blog posts providing in-depth analysis,
> > > > >> such as the translation from a pipeline to a runner-specific topology,
> > > > >> or the window/trigger implementation. Without this knowledge, it's
> > > > >> hard to touch the core concepts.
> > > >>
> > > >> On Mon, Apr 24, 2017 at 6:03 AM, Jean-Baptiste Onofré <
> > j...@nanthrax.net>
> > > >> wrote:
> > > >>
> > > >> Got it. By experience on other Apache projects, it's really hard to
> > > >>> maintain ;)
> > > >>>
> > > >>> Regards
> > > >>> JB
> > > >>>
> > > >>>
> > > >>> On 04/24/2017 02:56 PM, Etienne Chauchot wrote:
> > > >>>
> > > >>> Hi JB,
> > > 
> > >  I was proposing a FAQ (or another form), not something about IDE
> > > setup.
> > >  The FAQ
> > >  could group in the same place Q/A like for example "what is a
> > source,
> > >  how
> > >  do I
> > >  use it to implement an IO"
> > > 
> > >  Etienne
> > > 
> > > 
> > >  Le 24/04/2017 à 14:19, Jean-Baptiste Onofré a écrit :
> > > 
> > >  Hi Etienne,
> > > >
> > > > What about the contribution guide ? I think it's covered in the
> > > > IntelliJ
> > > > and
> > > > Eclipse setup sections.
> > > >
> > > > Regards
> > > > JB
> > > >
> > > > On 04/24/2017 02:12 PM, Etienne Chauchot wrote:
> > > >
> > > > Hi all,
> > > >>
> > > >> I definitely agree with everything that is said in this thread.
> > > >>
> > > >> I might suggest another good to have:
> > > >>
> > > >> to ease the work of a new contributor, it would be nice to have
> > some
> > > >> sort of
> > > >> programming guide but not oriented to pipeline writers but to
> > > >> sdk/runner/io/...
> > > >> writers.
> > > >>
> > > >> I know that new contributors have the docs available in the
> google
> > > >> drive, the
> > > >> ML, the code base, and the availability of beamers, but maybe
> > having
> > > >> key po

Re: [DISCUSSION] Encouraging more contributions

2017-04-24 Thread Sourabh Bajaj
For 6. I think having them in one page on the website where we can find the
design docs more easily would be great.

7. For low-hanging-fruit, one thing I really liked from some Mozilla
projects was assigning a mentor on the ticket. Someone you can reach out to
if you have questions. I think this makes the entry barrier really low for
first time contributors who might feel intimidated asking questions
completely in public.

On Mon, Apr 24, 2017 at 10:06 AM Kenneth Knowles 
wrote:

> I like the subject Etienne has brought up, and will give it a number in
> this list :-)
>
> 6. Have more technical reference docs (not just workspace set up) for
> contributors.
>
> I think this overlaps a lot with a prior discussion about where to collect
> design proposals [1]. Design docs used to be just dropped into a public
> folder, but that got disorganized. And that thread was about work in
> progress, so JIRA was a good place for details after a dev@ thread agrees
> on a proposal. At this point, the designs are pretty solid conceptually or
> even implemented and we could start to build out deeper technical bits on
> the web site, or at least some place that people can find it. We do have
> the Testing Guide and the PTransform Style Guide and somewhere near there
> we could have deeper references. I think we need a broader vision for the
> "table of contents" here.
>
> For my docs (triggers, lateness, runner API, side inputs, state, coders) I
> haven't had time, but I do intend to both translate from GDoc to some other
> format and also rewrite versions for users where appropriate. Probably this
> will mean coming up with that table of contents.
>
> Kenn
>
> [1]
>
> https://lists.apache.org/thread.html/%3c6bc60c88-cf91-4fff-eae6-fea6ee06f...@nanthrax.net%3E
>
>
> On Mon, Apr 24, 2017 at 9:33 AM, Neelesh Salian 
> wrote:
>
> > Agreed. I have some old JIRAs that I am cleaning up.
> >
> > Thank you for bringing this up.
> >
> > On Mon, Apr 24, 2017 at 9:29 AM, Jean-Baptiste Onofré 
> > wrote:
> >
> > > Same also for Slack, github comments, etc.
> > >
> > > > From an Apache perspective, it should happen on the mailing list,
> > > eventually referencing a central wiki/faq/whatever.
> > >
> > > Regards
> > > JB
> > >
> > >
> > > On 04/24/2017 06:23 PM, Mingmin Xu wrote:
> > >
> > > >> Many design documents are scattered across the mailing list and JIRA
> > > >> comments; it would be a big help to put them in a centralized list.
> > > >> Also, I would expect more wiki/blog posts providing in-depth analysis,
> > > >> such as the translation from a pipeline to a runner-specific topology,
> > > >> or the window/trigger implementation. Without this knowledge, it's
> > > >> hard to touch the core concepts.
> > >>
> > >> On Mon, Apr 24, 2017 at 6:03 AM, Jean-Baptiste Onofré <
> j...@nanthrax.net>
> > >> wrote:
> > >>
> > >> Got it. By experience on other Apache projects, it's really hard to
> > >>> maintain ;)
> > >>>
> > >>> Regards
> > >>> JB
> > >>>
> > >>>
> > >>> On 04/24/2017 02:56 PM, Etienne Chauchot wrote:
> > >>>
> > >>> Hi JB,
> > 
> >  I was proposing a FAQ (or another form), not something about IDE
> > setup.
> >  The FAQ
> >  could group in the same place Q/A like for example "what is a
> source,
> >  how
> >  do I
> >  use it to implement an IO"
> > 
> >  Etienne
> > 
> > 
> >  Le 24/04/2017 à 14:19, Jean-Baptiste Onofré a écrit :
> > 
> >  Hi Etienne,
> > >
> > > What about the contribution guide ? I think it's covered in the
> > > IntelliJ
> > > and
> > > Eclipse setup sections.
> > >
> > > Regards
> > > JB
> > >
> > > On 04/24/2017 02:12 PM, Etienne Chauchot wrote:
> > >
> > > Hi all,
> > >>
> > >> I definitely agree with everything that is said in this thread.
> > >>
> > >> I might suggest another good to have:
> > >>
> > >> to ease the work of a new contributor, it would be nice to have
> some
> > >> sort of
> > >> programming guide but not oriented to pipeline writers but to
> > >> sdk/runner/io/...
> > >> writers.
> > >>
> > >> I know that new contributors have the docs available in the google
> > >> drive, the
> > >> ML, the code base, and the availability of beamers, but maybe
> having
> > >> key points
> > >> in a common place (like FAQ for sdk/runner/io/... writers, for
> > >> example)
> > >> would be
> > >> interesting.
> > >>
> > >> Best,
> > >>
> > >> Etienne
> > >>
> > >>
> > >> Le 24/04/2017 à 09:14, Jean-Baptiste Onofré a écrit :
> > >>
> > >> Hi,
> > >>>
> > >>> I think we already tag the newbie jira ("low hanging fruit" ;)).
> > >>>
> > >>> Good idea for domain of interest/concept.
> > >>>
> > >>> Regards
> > >>> JB
> > >>>
> > >>> On 04/24/2017 09:01 AM, Ankur Chauhan wrote:
> > >>>
> > >>> Might I suggest adding tags to projects based on area of
> interest,
> >  concept
> > 

Re: [PROPOSAL] Apache Hive connector - HiveIO

2017-04-24 Thread Madhusudan Borkar
Please use the following link for the HiveIO proposal.

https://docs.google.com/document/d/1JOzihFiXkQjtv6rur8-vCixSK-nHhIoIij9MwJZ_Dp0/edit?usp=sharing

Madhu Borkar

On Mon, Apr 24, 2017 at 5:01 PM, Madhusudan Borkar 
wrote:

> Hi Stephen,
> Sure. I shared the document for comments but it didn't work.
> Let me look into it.
>
> Madhu Borkar
>
> On Mon, Apr 24, 2017 at 4:48 PM, Stephen Sisk 
> wrote:
>
>> hi Madhu!
>>
>> thanks for working on this. I'm excited to see your team's work!
>>
>> You might comment on the jira issue mentioned and request it be assigned
>> to
>> you - the way the beam community works [1], we normally assign bugs to the
>> people working on them, so that multiple people don't work on them at the
>> same time.
>>
>> I tried opening that doc, and didn't have permission - can you please make
>> the document world-readable (and commentable)? thanks!
>>
>> The way to proceed from here:
>> 1) I would strongly suggest checking out the I/O Authoring overview [2]
>> (probably review for you), PTransform Style Guide [3] and the draft of the
>> I/O testing doc [4].
>> 2) Submit a pull request to beam's github repo (see the contribution guide
>> for details)
>>
>> Thanks again, I look forward to seeing your PR.
>> Stephen
>>
>> [1] Beam Contribution Guide -
>> https://beam.apache.org/contribute/contribution-guide/
>> [2] I/O Authoring overview -
>> https://beam.apache.org/documentation/io/authoring-overview/
>> [3] PTransform style guide -
>> https://beam.apache.org/contribute/ptransform-style-guide/
>> [4] Testing doc -
>> https://docs.google.com/document/d/153J9jPQhMCNi_eBzJfhAg-NprQ7vbf1jNVRgdqeEE8I/edit?usp=sharing
>>
>> On Mon, Apr 24, 2017 at 2:29 PM Madhusudan Borkar 
>> wrote:
>>
>> > Hi all,
>> >
>> > In response to BEAM-1158, our team has worked on HiveIO. We are ready with
>> > code including unit testing. Please let us know the next steps.
>> > Since our team is new to Beam, your help and suggestions will be
>> > appreciated.
>> >
>> > Madhu Borkar
>> >
>> >
>> > [1]
>> >
>> > https://docs.google.com/a/etouch.net/document/d/161_7y0A3kgLxmp6-TSNUVZgFgzCH1knwSWtNDwSQj7w/edit?usp=sharing
>> >
>> > [2] https://issues.apache.org/jira/browse/BEAM-1158
>> > 
>> >
>>
>
>


Re: Community hackathon

2017-04-24 Thread Dylan Raithel
Can you please add me to the Slack channel?

On Apr 24, 2017 12:51 AM, "Jean-Baptiste Onofré"  wrote:

> That's a wonderful idea !
>
> I think the easiest way to organize this event is using the Slack channels
> to discuss, help each other, and sync together.
>
> Regards
> JB
>
> On 04/24/2017 09:48 AM, Davor Bonaci wrote:
>
>> We've been working as a community towards the first stable release for a
>> while now, and I think we made a ton of progress across the board over the
>> last few weeks.
>>
>> We could try to organize a community-wide hackathon to identify and fix
>> those last few issues, as well as to get a better sense of the overall
>> project quality as it stands right now.
>>
>> This could be a self-organized event, and coordinated via the Slack
>> channel. For example, we (as a community and participants) can try out the
>> project in various ways -- quickstart, examples, different runners,
>> different platforms -- immediately fixing issues as we run into them. It
>> could last, say, 24 hours, with people from different time zones
>> participating at the time of their choosing.
>>
>> Thoughts?
>>
>> Davor
>>
>>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>


Re: [PROPOSAL] Apache Hive connector - HiveIO

2017-04-24 Thread Madhusudan Borkar
Hi Stephen,
Sure. I shared the document for comments but it didn't work.
Let me look into it.

Madhu Borkar

On Mon, Apr 24, 2017 at 4:48 PM, Stephen Sisk 
wrote:

> hi Madhu!
>
> thanks for working on this. I'm excited to see your team's work!
>
> You might comment on the jira issue mentioned and request it be assigned to
> you - the way the beam community works [1], we normally assign bugs to the
> people working on them, so that multiple people don't work on them at the
> same time.
>
> I tried opening that doc, and didn't have permission - can you please make
> the document world-readable (and commentable)? thanks!
>
> The way to proceed from here:
> 1) I would strongly suggest checking out the I/O Authoring overview [2]
> (probably review for you), PTransform Style Guide [3] and the draft of the
> I/O testing doc [4].
> 2) Submit a pull request to beam's github repo (see the contribution guide
> for details)
>
> Thanks again, I look forward to seeing your PR.
> Stephen
>
> [1] Beam Contribution Guide -
> https://beam.apache.org/contribute/contribution-guide/
> [2] I/O Authoring overview -
> https://beam.apache.org/documentation/io/authoring-overview/
> [3] PTransform style guide -
> https://beam.apache.org/contribute/ptransform-style-guide/
> [4] Testing doc -
> https://docs.google.com/document/d/153J9jPQhMCNi_eBzJfhAg-NprQ7vbf1jNVRgdqeEE8I/edit?usp=sharing
>
> On Mon, Apr 24, 2017 at 2:29 PM Madhusudan Borkar 
> wrote:
>
> > Hi all,
> >
> > In response to BEAM-1158, our team has worked on HiveIO. We are ready with
> > code including unit testing. Please let us know the next steps.
> > Since our team is new to Beam, your help and suggestions will be
> > appreciated.
> >
> > Madhu Borkar
> >
> >
> > [1]
> >
> > https://docs.google.com/a/etouch.net/document/d/161_7y0A3kgLxmp6-TSNUVZgFgzCH1knwSWtNDwSQj7w/edit?usp=sharing
> >
> > [2] https://issues.apache.org/jira/browse/BEAM-1158
> > 
> >
>


Re: Community hackathon

2017-04-24 Thread Jason Kuster
+1, looking forward to it!

On Mon, Apr 24, 2017 at 2:56 PM, Sourabh Bajaj <
sourabhba...@google.com.invalid> wrote:

> +1
>
> On Mon, Apr 24, 2017 at 2:55 PM Ahmet Altay 
> wrote:
>
> > +1, this is a great idea.
> >
> > On Mon, Apr 24, 2017 at 3:54 AM, JingsongLee 
> > wrote:
> >
> > > +1
> > > best,
> > > Jingsonglee
> > >
> > 
> --From:Ted
> > > Yu Time:2017 Apr 24 (Mon) 17:29To:dev <
> > > dev@beam.apache.org>Subject:Re: Community hackathon
> > > +1
> > >
> > > > On Apr 24, 2017, at 12:51 AM, Jean-Baptiste Onofré  > > > wrote:
> > > >
> > > > That's a wonderful idea !
> > > >
> > > > I think the easiest way to organize this event is using th
> > > e Slack channels to discuss, help each other, and sync together.
> > > >
> > > > Regards
> > > > JB
> > > >
> > > >> On 04/24/2017 09:48 AM, Davor Bonaci wrote:
> > > >> We've been working as a community towards the first stabl
> > > e release for a
> > > >> while now, and I think we made a ton of progress across t
> > > he board over the
> > > >> last few weeks.
> > > >>
> > > >> We could try to organize a community-wide hackathon to identify and
> > fix
> > > >> those last few issues, as well as to get a better sense of the
> overall
> > > >> project quality as it stands right now.
> > > >>
> > > >> This could be a self-organized event, and coordinated via the Slack
> > > >> channel. For example, we (as a community and participants
> > > ) can try out the
> > > >> project in various ways -- quickstart, examples, different runners,
> > > >> different platforms -- immediately fixing issues as we
> > > run into them. It
> > > >> could last, say, 24 hours, with people from different time zones
> > > >> participating at the time of their choosing.
> > > >>
> > > >> Thoughts?
> > > >>
> > > >> Davor
> > > >
> > > > --
> > > > Jean-Baptiste Onofré
> > > > jbono...@apache.org
> > > > http://blog.nanthrax.net
> > > > Talend - http://www.talend.com
> > >
> >
>



-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: [PROPOSAL] Apache Hive connector - HiveIO

2017-04-24 Thread Stephen Sisk
hi Madhu!

thanks for working on this. I'm excited to see your team's work!

You might comment on the jira issue mentioned and request it be assigned to
you - the way the beam community works [1], we normally assign bugs to the
people working on them, so that multiple people don't work on them at the
same time.

I tried opening that doc, and didn't have permission - can you please make
the document world-readable (and commentable)? thanks!

The way to proceed from here:
1) I would strongly suggest checking out the I/O Authoring overview [2]
(probably review for you), PTransform Style Guide [3] and the draft of the
I/O testing doc [4].
2) Submit a pull request to beam's github repo (see the contribution guide
for details)

Thanks again, I look forward to seeing your PR.
Stephen

[1] Beam Contribution Guide -
https://beam.apache.org/contribute/contribution-guide/
[2] I/O Authoring overview -
https://beam.apache.org/documentation/io/authoring-overview/
[3] PTransform style guide -
https://beam.apache.org/contribute/ptransform-style-guide/
[4] Testing doc -
https://docs.google.com/document/d/153J9jPQhMCNi_eBzJfhAg-NprQ7vbf1jNVRgdqeEE8I/edit?usp=sharing

On Mon, Apr 24, 2017 at 2:29 PM Madhusudan Borkar 
wrote:

> Hi all,
>
> In response to BEAM-1158, our team has worked on HiveIO. We are ready with
> code including unit testing. Please let us know the next steps.
> Since our team is new to Beam, your help and suggestions will be
> appreciated.
>
> Madhu Borkar
>
>
> [1]
>
> https://docs.google.com/a/etouch.net/document/d/161_7y0A3kgLxmp6-TSNUVZgFgzCH1knwSWtNDwSQj7w/edit?usp=sharing
>
> [2] https://issues.apache.org/jira/browse/BEAM-1158
> 
>


Re: Community hackathon

2017-04-24 Thread Sourabh Bajaj
+1

On Mon, Apr 24, 2017 at 2:55 PM Ahmet Altay 
wrote:

> +1, this is a great idea.
>
> On Mon, Apr 24, 2017 at 3:54 AM, JingsongLee 
> wrote:
>
> > +1
> > best,
> > Jingsonglee
> >
> > > From: Ted Yu
> > > Time: 2017 Apr 24 (Mon) 17:29
> > > To: dev <dev@beam.apache.org>
> > > Subject: Re: Community hackathon
> > > +1
> > >
> > > > On Apr 24, 2017, at 12:51 AM, Jean-Baptiste Onofré wrote:
> > > >
> > > > That's a wonderful idea !
> > > >
> > > > I think the easiest way to organize this event is using the Slack
> > > > channels to discuss, help each other, and sync together.
> > > >
> > > > Regards
> > > > JB
> > > >
> > > >> On 04/24/2017 09:48 AM, Davor Bonaci wrote:
> > > >> We've been working as a community towards the first stable release for a
> > > >> while now, and I think we made a ton of progress across the board over the
> > > >> last few weeks.
> > > >>
> > > >> We could try to organize a community-wide hackathon to identify and fix
> > > >> those last few issues, as well as to get a better sense of the overall
> > > >> project quality as it stands right now.
> > > >>
> > > >> This could be a self-organized event, and coordinated via the Slack
> > > >> channel. For example, we (as a community and participants) can try out the
> > > >> project in various ways -- quickstart, examples, different runners,
> > > >> different platforms -- immediately fixing issues as we run into them. It
> > >> could last, say, 24 hours, with people from different time zones
> > >> participating at the time of their choosing.
> > >>
> > >> Thoughts?
> > >>
> > >> Davor
> > >
> > > --
> > > Jean-Baptiste Onofré
> > > jbono...@apache.org
> > > http://blog.nanthrax.net
> > > Talend - http://www.talend.com
> >
>


Re: Community hackathon

2017-04-24 Thread Ahmet Altay
+1, this is a great idea.

On Mon, Apr 24, 2017 at 3:54 AM, JingsongLee 
wrote:

> +1
> best,
> Jingsonglee
> From: Ted Yu
> Time: 2017 Apr 24 (Mon) 17:29
> To: dev <dev@beam.apache.org>
> Subject: Re: Community hackathon
> +1
>
> > On Apr 24, 2017, at 12:51 AM, Jean-Baptiste Onofré wrote:
> >
> > That's a wonderful idea !
> >
> > I think the easiest way to organize this event is using the Slack
> > channels to discuss, help each other, and sync together.
> >
> > Regards
> > JB
> >
> >> On 04/24/2017 09:48 AM, Davor Bonaci wrote:
> >> We've been working as a community towards the first stable release for a
> >> while now, and I think we made a ton of progress across the board over the
> >> last few weeks.
> >>
> >> We could try to organize a community-wide hackathon to identify and fix
> >> those last few issues, as well as to get a better sense of the overall
> >> project quality as it stands right now.
> >>
> >> This could be a self-organized event, and coordinated via the Slack
> >> channel. For example, we (as a community and participants) can try out the
> >> project in various ways -- quickstart, examples, different runners,
> >> different platforms -- immediately fixing issues as we run into them. It
> >> could last, say, 24 hours, with people from different time zones
> >> participating at the time of their choosing.
> >>
> >> Thoughts?
> >>
> >> Davor
> >
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
>


[PROPOSAL] Apache Hive connector - HiveIO

2017-04-24 Thread Madhusudan Borkar
Hi all,

In response to BEAM-1158, our team has worked on HiveIO. We are ready with
code including unit testing. Please let us know the next steps.
Since our team is new to Beam, your help and suggestions will be
appreciated.

Madhu Borkar


[1]
https://docs.google.com/a/etouch.net/document/d/161_7y0A3kgLxmp6-TSNUVZgFgzCH1knwSWtNDwSQj7w/edit?usp=sharing

[2] https://issues.apache.org/jira/browse/BEAM-1158



Re: [PROPOSAL] Remove KeyedCombineFn

2017-04-24 Thread Kenneth Knowles
Great, this is https://issues.apache.org/jira/browse/BEAM-2049 via
https://github.com/apache/beam/pull/2636.

On Fri, Apr 21, 2017 at 11:40 PM, Pei HE  wrote:

> +1
>
> On Sat, Apr 22, 2017 at 12:16 PM, Jean-Baptiste Onofré 
> wrote:
>
> > +1
> >
> > Regards
> > JB
> >
> >
> > On 04/21/2017 07:24 PM, Kenneth Knowles wrote:
> >
> >> Hi all,
> >>
> >> I propose that we remove KeyedCombineFn before the first stable release.
> >>
> >> I don't think it adds enough value for the complexity it adds to e.g.
> >> CombineWithContext [1] and state [2, 3], and it doesn't seem to me that
> >> users really use it when we might expect. I am happy to be demonstrated
> >> wrong.
> >>
> >> It is very likely that you have never written [4, 5] or thought about
> > >> KeyedCombineFn. So for context, here are excerpts from signatures just to
> >> show the difference from CombineFn:
> >>
> > >> CombineFn<InputT, AccumT, OutputT> {
> > >>   AccumT createAccumulator();
> > >>   AccumT addInput(AccumT accum, InputT input);
> > >>   AccumT mergeAccumulators(Iterable<AccumT> accums);
> > >>   OutputT extractOutput(AccumT accum);
> > >> }
> > >>
> > >> KeyedCombineFn<K, InputT, AccumT, OutputT> {
> > >>   AccumT createAccumulator(K key);
> > >>   AccumT addInput(K key, AccumT accum, InputT input);
> > >>   AccumT mergeAccumulators(K key, Iterable<AccumT> accums);
> > >>   OutputT extractOutput(K key, AccumT accum);
> > >> }
> >>
> >> So what are the particular reasons for this, versus a CombineFn that has
> >> KVs as its input and accumulator types?
> >>
> >>  - There are some performance improvements potentially from not passing
> >> keys around, based on the assumption they are always available.
> >>
> >>  - There is also a spec difference because it only has to be associative
> >> and commutative per key, cannot be applied in a global combine, and
> >> addInput is automatically key preserving.
> >>
> >> But in fact, in all of my code crawling, the class is almost never used
> >> (even over the course of its history at Google), and even the few uses I
> >> found were often mistakes where the key is totally ignored, probably
> >> because a user thinks "I am doing a keyed combine so I need a keyed
> >> combine function". So the number of users actually affected is about zero.
> >>
> >> I would be curious if anyone has a compelling case for keeping
> >> KeyedCombineFn.
> >>
> >> Kenn
> >>
> >> [1]
> >> https://github.com/yafengguo/Apache-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/CombineWithContext.java
> >> [2] https://issues.apache.org/jira/browse/BEAM-1336
> >> [3] https://github.com/apache/beam/pull/2627
> >> [4]
> >> https://github.com/search?l=Java&q=KeyedCombineFn&ref=advsearch&type=Code&utf8=%E2%9C%93
> >> [5] https://www.google.com/search?q=KeyedCombineFn
> >>
> >>
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
>
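
For readers of this thread, the alternative raised in the proposal above (a
CombineFn whose input and accumulator carry the key themselves) might look
roughly like the sketch below. This is not Beam code: the CombineFn interface
is a simplified stand-in copied from the quoted excerpt, Map.Entry stands in
for Beam's KV, and LabeledSum, run and demo are hypothetical names made up for
the illustration.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class KeyedCombineSketch {

  // Simplified stand-in for the CombineFn excerpt quoted in the thread
  // (an assumption for illustration, not the Beam class).
  interface CombineFn<InputT, AccumT, OutputT> {
    AccumT createAccumulator();
    AccumT addInput(AccumT accum, InputT input);
    AccumT mergeAccumulators(Iterable<AccumT> accums);
    OutputT extractOutput(AccumT accum);
  }

  // A key-aware sum written as a plain CombineFn: the key travels inside the
  // input and the accumulator instead of being passed as an extra argument.
  static class LabeledSum
      implements CombineFn<Map.Entry<String, Integer>, Map.Entry<String, Integer>, String> {

    @Override
    public Map.Entry<String, Integer> createAccumulator() {
      // No key is known yet, so the accumulator starts with a null key.
      return new SimpleImmutableEntry<>(null, 0);
    }

    @Override
    public Map.Entry<String, Integer> addInput(
        Map.Entry<String, Integer> accum, Map.Entry<String, Integer> input) {
      return new SimpleImmutableEntry<>(input.getKey(), accum.getValue() + input.getValue());
    }

    @Override
    public Map.Entry<String, Integer> mergeAccumulators(
        Iterable<Map.Entry<String, Integer>> accums) {
      String key = null;
      int total = 0;
      for (Map.Entry<String, Integer> accum : accums) {
        if (accum.getKey() != null) {
          key = accum.getKey(); // empty accumulators carry no key
        }
        total += accum.getValue();
      }
      return new SimpleImmutableEntry<>(key, total);
    }

    @Override
    public String extractOutput(Map.Entry<String, Integer> accum) {
      return accum.getKey() + "=" + accum.getValue();
    }
  }

  // Drives the four methods for one key: two partial accumulators, then a merge.
  static <I, A, O> O run(CombineFn<I, A, O> fn, List<I> left, List<I> right) {
    A first = fn.createAccumulator();
    for (I input : left) {
      first = fn.addInput(first, input);
    }
    A second = fn.createAccumulator();
    for (I input : right) {
      second = fn.addInput(second, input);
    }
    List<A> partials = new ArrayList<>();
    partials.add(first);
    partials.add(second);
    return fn.extractOutput(fn.mergeAccumulators(partials));
  }

  static String demo() {
    List<Map.Entry<String, Integer>> left = new ArrayList<>();
    left.add(new SimpleImmutableEntry<>("clicks", 1));
    left.add(new SimpleImmutableEntry<>("clicks", 2));
    List<Map.Entry<String, Integer>> right = new ArrayList<>();
    right.add(new SimpleImmutableEntry<>("clicks", 4));
    return run(new LabeledSum(), left, right);
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints "clicks=7"
  }
}
```

The sketch also shows the cost mentioned in the proposal: the key is copied
into every input and accumulator, and createAccumulator() cannot see it, which
is the kind of overhead KeyedCombineFn was meant to avoid.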


Re: [jira] [Reopened] (BEAM-846) Decouple side input window mapping from WindowFn

2017-04-24 Thread Robert Bradshaw
They're already decoupled in the Python implementation, though they're
not exposed in the API.

On Mon, Apr 24, 2017 at 10:16 AM, Thomas Groh (JIRA)  wrote:
>
>  [ 
> https://issues.apache.org/jira/browse/BEAM-846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
>  ]
>
> Thomas Groh reopened BEAM-846:
> --
>   Assignee: Robert Bradshaw  (was: Thomas Groh)
>
> This is done in the Java SDK. I don't know if there's any more work to do in 
> Python.
>
>> Decouple side input window mapping from WindowFn
>> 
>>
>> Key: BEAM-846
>> URL: https://issues.apache.org/jira/browse/BEAM-846
>> Project: Beam
>>  Issue Type: New Feature
>>  Components: beam-model-runner-api, sdk-java-core
>>Reporter: Robert Bradshaw
>>Assignee: Robert Bradshaw
>>  Labels: backward-incompatible
>> Fix For: First stable release
>>
>>
>> Currently the main WindowFn provides a getSideInputWindow method. Instead, 
>> this mapping should be specified per-side-input (though the default mapping 
>> would remain the same).
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.15#6346)


Re: [DISCUSSION] Encouraging more contributions

2017-04-24 Thread Kenneth Knowles
I like the subject Etienne has brought up, and will give it a number in
this list :-)

6. Have more technical reference docs (not just workspace set up) for
contributors.

I think this overlaps a lot with a prior discussion about where to collect
design proposals [1]. Design docs used to be just dropped into a public
folder, but that got disorganized. And that thread was about work in
progress, so JIRA was a good place for details after a dev@ thread agrees
on a proposal. At this point, the designs are pretty solid conceptually or
even implemented and we could start to build out deeper technical bits on
the web site, or at least some place that people can find it. We do have
the Testing Guide and the PTransform Style Guide and somewhere near there
we could have deeper references. I think we need a broader vision for the
"table of contents" here.

For my docs (triggers, lateness, runner API, side inputs, state, coders) I
haven't had time, but I do intend to both translate from GDoc to some other
format and also rewrite versions for users where appropriate. Probably this
will mean coming up with that table of contents.

Kenn

[1]
https://lists.apache.org/thread.html/%3c6bc60c88-cf91-4fff-eae6-fea6ee06f...@nanthrax.net%3E


On Mon, Apr 24, 2017 at 9:33 AM, Neelesh Salian 
wrote:

> Agreed. I have some old JIRAs that I am cleaning up.
>
> Thank you for bringing this up.
>
> On Mon, Apr 24, 2017 at 9:29 AM, Jean-Baptiste Onofré 
> wrote:
>
> > Same also for Slack, github comments, etc.
> >
> > From an Apache perspective, it should happen on the mailing list,
> > eventually referencing a central wiki/faq/whatever.
> >
> > Regards
> > JB
> >
> >
> > On 04/24/2017 06:23 PM, Mingmin Xu wrote:
> >
> >> many design documents are mixed in maillist, jira comments, it would be a
> >> big help to put them in a centralized list. Also I would expect more
> >> wiki/blogs to provide in-depth analysis, like the translation from
> >> pipeline to runner specified topology, window/trigger implementation.
> >> Without this knowledge, it's hard to touch the core concepts.
> >>
> >> On Mon, Apr 24, 2017 at 6:03 AM, Jean-Baptiste Onofré wrote:
> >>
> >>> Got it. By experience on other Apache projects, it's really hard to
> >>> maintain ;)
> >>>
> >>> Regards
> >>> JB
> >>>
> >>>
> >>> On 04/24/2017 02:56 PM, Etienne Chauchot wrote:
> >>>
> >>>> Hi JB,
> >>>>
> >>>> I was proposing a FAQ (or another form), not something about IDE setup.
> >>>> The FAQ could group in the same place Q/A like for example "what is a
> >>>> source, how do I use it to implement an IO"
> >>>>
> >>>> Etienne
> >>>>
> >>>>
> >>>> On 24/04/2017 at 14:19, Jean-Baptiste Onofré wrote:
> >>>>
> >>>>> Hi Etienne,
> >>>>>
> >>>>> What about the contribution guide ? I think it's covered in the
> >>>>> IntelliJ and Eclipse setup sections.
> >>>>>
> >>>>> Regards
> >>>>> JB
> >>>>>
> >>>>> On 04/24/2017 02:12 PM, Etienne Chauchot wrote:
> >>>>>
> >>>>>> Hi all,
> >>>>>>
> >>>>>> I definitely agree with everything that is said in this thread.
> >>>>>>
> >>>>>> I might suggest another good to have:
> >>>>>>
> >>>>>> to ease the work of a new contributor, it would be nice to have some
> >>>>>> sort of programming guide but not oriented to pipeline writers but to
> >>>>>> sdk/runner/io/... writers.
> >>>>>>
> >>>>>> I know that new contributors have the docs available in the google
> >>>>>> drive, the ML, the code base, and the availability of beamers, but
> >>>>>> maybe having key points in a common place (like FAQ for
> >>>>>> sdk/runner/io/... writers, for example) would be interesting.
> >>>>>>
> >>>>>> Best,
> >>>>>>
> >>>>>> Etienne
> >>>>>>
> >>>>>>
> >>>>>> On 24/04/2017 at 09:14, Jean-Baptiste Onofré wrote:
> >>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> I think we already tag the newbie jira ("low hanging fruit" ;)).
> >>>>>>>
> >>>>>>> Good idea for domain of interest/concept.
> >>>>>>>
> >>>>>>> Regards
> >>>>>>> JB
> >>>>>>>
> >>>>>>> On 04/24/2017 09:01 AM, Ankur Chauhan wrote:
> >>>>>>>
> >>>>>>>> Might I suggest adding tags to projects based on area of interest,
> >>>>>>>> concept and if it's a good "first bug".
> >>>>>>>>
> >>>>>>>> Sent from my iPhone
> >>>>>>>>
> >>>>>>>> On Apr 23, 2017, at 23:03, Davor Bonaci wrote:
> >>>>>>>>
> >>>>>>>>>> 1. Have people unassign themselves from issues they're not
> >>>>>>>>>> actively working on.
> >>>>>>>>>> 2. Have the community engage more in triage, improving tickets
> >>>>>>>>>> descriptions and raising concerns.
> >>>>>>>>>> 3. Clean house - apply (2) to currently open issues (over 800).
> >>>>>>>>>> Perhaps some can be closed.
> >>>>>>>>>
> >>>>>>>>> +1 on all three of these, and will do my part shortly!
> >>>>>>>>>
> >>>>>>>>> Also, it is worth noting that we have improved as a project in
> >>>>>>>>> tracking issues in the last 1-2 months. There are

Re: [DISCUSSION] Encouraging more contributions

2017-04-24 Thread Neelesh Salian
Agreed. I have some old JIRAs that I am cleaning up.

Thank you for bringing this up.

On Mon, Apr 24, 2017 at 9:29 AM, Jean-Baptiste Onofré 
wrote:

> Same also for Slack, github comments, etc.
>
> From an Apache perspective, it should happen on the mailing list,
> eventually referencing a central wiki/faq/whatever.
>
> Regards
> JB
>
>
> On 04/24/2017 06:23 PM, Mingmin Xu wrote:
>
>> many design documents are mixed in maillist, jira comments, it would be a
>> big help to put them in a centralized list. Also I would expect more
>> wiki/blogs to provide in-depth analysis, like the translation from
>> pipeline to runner specified topology, window/trigger implementation.
>> Without this knowledge, it's hard to touch the core concepts.
>>
>> On Mon, Apr 24, 2017 at 6:03 AM, Jean-Baptiste Onofré wrote:
>>
>>> Got it. By experience on other Apache projects, it's really hard to
>>> maintain ;)
>>>
>>> Regards
>>> JB
>>>
>>>
>>> On 04/24/2017 02:56 PM, Etienne Chauchot wrote:
>>>
>>>> Hi JB,
>>>>
>>>> I was proposing a FAQ (or another form), not something about IDE setup.
>>>> The FAQ could group in the same place Q/A like for example "what is a
>>>> source, how do I use it to implement an IO"
>>>>
>>>> Etienne
>>>>
>>>>
>>>> On 24/04/2017 at 14:19, Jean-Baptiste Onofré wrote:
>>>>
>>>>> Hi Etienne,
>>>>>
>>>>> What about the contribution guide ? I think it's covered in the
>>>>> IntelliJ and Eclipse setup sections.
>>>>>
>>>>> Regards
>>>>> JB
>>>>>
>>>>> On 04/24/2017 02:12 PM, Etienne Chauchot wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I definitely agree with everything that is said in this thread.
>>>>>>
>>>>>> I might suggest another good to have:
>>>>>>
>>>>>> to ease the work of a new contributor, it would be nice to have some
>>>>>> sort of programming guide but not oriented to pipeline writers but to
>>>>>> sdk/runner/io/... writers.
>>>>>>
>>>>>> I know that new contributors have the docs available in the google
>>>>>> drive, the ML, the code base, and the availability of beamers, but
>>>>>> maybe having key points in a common place (like FAQ for
>>>>>> sdk/runner/io/... writers, for example) would be interesting.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Etienne
>>>>>>
>>>>>>
>>>>>> On 24/04/2017 at 09:14, Jean-Baptiste Onofré wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I think we already tag the newbie jira ("low hanging fruit" ;)).
>>>>>>>
>>>>>>> Good idea for domain of interest/concept.
>>>>>>>
>>>>>>> Regards
>>>>>>> JB
>>>>>>>
>>>>>>> On 04/24/2017 09:01 AM, Ankur Chauhan wrote:
>>>>>>>
>>>>>>>> Might I suggest adding tags to projects based on area of interest,
>>>>>>>> concept and if it's a good "first bug".
>>>>>>>>
>>>>>>>> Sent from my iPhone
>>>>>>>>
>>>>>>>> On Apr 23, 2017, at 23:03, Davor Bonaci wrote:
>>>>>>>>
>>>>>>>>>> 1. Have people unassign themselves from issues they're not
>>>>>>>>>> actively working on.
>>>>>>>>>> 2. Have the community engage more in triage, improving tickets
>>>>>>>>>> descriptions and raising concerns.
>>>>>>>>>> 3. Clean house - apply (2) to currently open issues (over 800).
>>>>>>>>>> Perhaps some can be closed.
>>>>>>>>>
>>>>>>>>> +1 on all three of these, and will do my part shortly!
>>>>>>>>>
>>>>>>>>> Also, it is worth noting that we have improved as a project in
>>>>>>>>> tracking issues in the last 1-2 months. There are more resolved
>>>>>>>>> issues than opened in this period, whereas in the past we'd have a
>>>>>>>>> hundred more opened than resolved.
>>>>>>>>>
>>>>>>>>>> I would also propose to not assign new Jira automatically: now,
>>>>>>>>>> the Jira is automatically assigned to the Jira component leader.
>>>>>>>>>
>>>>>>>>> Imagine a user discovering an issue and filing a new JIRA issue. It
>>>>>>>>> wouldn't be assigned to anyone, significantly reducing the chance
>>>>>>>>> somebody will actually help.
>>>>>>>>>
>>>>>>>>> Of course, somebody could search for new issues periodically, etc.
>>>>>>>>> -- but that just won't happen. The final outcome would be --
>>>>>>>>> instead of a lot of issues assigned to component leads, we'd have
>>>>>>>>> (much) more unassigned issues, which were *never* looked at.
>>>>>>>>> Assigning an issue just sets a community expectation that a
>>>>>>>>> committer should look -- and it does help move things along!
>>>>>>>>>
>>>>>>>>> I think a better approach of addressing the current state would be
>>>>>>>>> to increase the number of components / component leads. With more
>>>>>>>>> people involved and lower per-person load, I think we'd be more
>>>>>>>>> effective.
>>>
>>> --
>>> Jean-Baptiste Onofré
>>> jbono...@apache.org
>>> http://blog.nanthrax.net
>>> Talend - http://www.talend.com
>
> --
> Jean-Baptiste Onofré
> jb

Re: [DISCUSSION] Encouraging more contributions

2017-04-24 Thread Jean-Baptiste Onofré

Same also for Slack, github comments, etc.

From an Apache perspective, it should happen on the mailing list, eventually
referencing a central wiki/faq/whatever.

Regards
JB

On 04/24/2017 06:23 PM, Mingmin Xu wrote:
> many design documents are mixed in maillist, jira comments, it would be a
> big help to put them in a centralized list. Also I would expect more
> wiki/blogs to provide in-depth analysis, like the translation from pipeline
> to runner specified topology, window/trigger implementation. Without this
> knowledge, it's hard to touch the core concepts.
>
> On Mon, Apr 24, 2017 at 6:03 AM, Jean-Baptiste Onofré wrote:
>
>> Got it. By experience on other Apache projects, it's really hard to
>> maintain ;)
>>
>> Regards
>> JB
>>
>> On 04/24/2017 02:56 PM, Etienne Chauchot wrote:
>>
>>> Hi JB,
>>>
>>> I was proposing a FAQ (or another form), not something about IDE setup.
>>> The FAQ could group in the same place Q/A like for example "what is a
>>> source, how do I use it to implement an IO"
>>>
>>> Etienne
>>>
>>> On 24/04/2017 at 14:19, Jean-Baptiste Onofré wrote:
>>>
>>>> Hi Etienne,
>>>>
>>>> What about the contribution guide ? I think it's covered in the
>>>> IntelliJ and Eclipse setup sections.
>>>>
>>>> Regards
>>>> JB
>>>>
>>>> On 04/24/2017 02:12 PM, Etienne Chauchot wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I definitely agree with everything that is said in this thread.
>>>>>
>>>>> I might suggest another good to have:
>>>>>
>>>>> to ease the work of a new contributor, it would be nice to have some
>>>>> sort of programming guide but not oriented to pipeline writers but to
>>>>> sdk/runner/io/... writers.
>>>>>
>>>>> I know that new contributors have the docs available in the google
>>>>> drive, the ML, the code base, and the availability of beamers, but
>>>>> maybe having key points in a common place (like FAQ for
>>>>> sdk/runner/io/... writers, for example) would be interesting.
>>>>>
>>>>> Best,
>>>>>
>>>>> Etienne
>>>>>
>>>>> On 24/04/2017 at 09:14, Jean-Baptiste Onofré wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I think we already tag the newbie jira ("low hanging fruit" ;)).
>>>>>>
>>>>>> Good idea for domain of interest/concept.
>>>>>>
>>>>>> Regards
>>>>>> JB
>>>>>>
>>>>>> On 04/24/2017 09:01 AM, Ankur Chauhan wrote:
>>>>>>
>>>>>>> Might I suggest adding tags to projects based on area of interest,
>>>>>>> concept and if it's a good "first bug".
>>>>>>>
>>>>>>> Sent from my iPhone
>>>>>>>
>>>>>>> On Apr 23, 2017, at 23:03, Davor Bonaci wrote:
>>>>>>>
>>>>>>>>> 1. Have people unassign themselves from issues they're not
>>>>>>>>> actively working on.
>>>>>>>>> 2. Have the community engage more in triage, improving tickets
>>>>>>>>> descriptions and raising concerns.
>>>>>>>>> 3. Clean house - apply (2) to currently open issues (over 800).
>>>>>>>>> Perhaps some can be closed.
>>>>>>>>
>>>>>>>> +1 on all three of these, and will do my part shortly!
>>>>>>>>
>>>>>>>> Also, it is worth noting that we have improved as a project in
>>>>>>>> tracking issues in the last 1-2 months. There are more resolved
>>>>>>>> issues than opened in this period, whereas in the past we'd have a
>>>>>>>> hundred more opened than resolved.
>>>>>>>>
>>>>>>>>> I would also propose to not assign new Jira automatically: now,
>>>>>>>>> the Jira is automatically assigned to the Jira component leader.
>>>>>>>>
>>>>>>>> Imagine a user discovering an issue and filing a new JIRA issue. It
>>>>>>>> wouldn't be assigned to anyone, significantly reducing the chance
>>>>>>>> somebody will actually help.
>>>>>>>>
>>>>>>>> Of course, somebody could search for new issues periodically, etc.
>>>>>>>> -- but that just won't happen. The final outcome would be --
>>>>>>>> instead of a lot of issues assigned to component leads, we'd have
>>>>>>>> (much) more unassigned issues, which were *never* looked at.
>>>>>>>> Assigning an issue just sets a community expectation that a
>>>>>>>> committer should look -- and it does help move things along!
>>>>>>>>
>>>>>>>> I think a better approach of addressing the current state would be
>>>>>>>> to increase the number of components / component leads. With more
>>>>>>>> people involved and lower per-person load, I think we'd be more
>>>>>>>> effective.
>>
>> --
>> Jean-Baptiste Onofré
>> jbono...@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com

--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: [DISCUSSION] Encouraging more contributions

2017-04-24 Thread Mingmin Xu
Many design documents are scattered across the mailing list and JIRA
comments; it would be a big help to put them in a centralized list. Also, I
would expect more wiki/blogs to provide in-depth analysis, like the
translation from pipeline to runner-specific topology, or the window/trigger
implementation. Without this knowledge, it's hard to touch the core concepts.

On Mon, Apr 24, 2017 at 6:03 AM, Jean-Baptiste Onofré 
wrote:

> Got it. By experience on other Apache projects, it's really hard to
> maintain ;)
>
> Regards
> JB
>
>
> On 04/24/2017 02:56 PM, Etienne Chauchot wrote:
>
>> Hi JB,
>>
>> I was proposing a FAQ (or another form), not something about IDE setup.
>> The FAQ could group in the same place Q/A like for example "what is a
>> source, how do I use it to implement an IO"
>>
>> Etienne
>>
>>
>> On 24/04/2017 at 14:19, Jean-Baptiste Onofré wrote:
>>
>>> Hi Etienne,
>>>
>>> What about the contribution guide ? I think it's covered in the IntelliJ
>>> and Eclipse setup sections.
>>>
>>> Regards
>>> JB
>>>
>>> On 04/24/2017 02:12 PM, Etienne Chauchot wrote:
>>>
>>>> Hi all,
>>>>
>>>> I definitely agree with everything that is said in this thread.
>>>>
>>>> I might suggest another good to have:
>>>>
>>>> to ease the work of a new contributor, it would be nice to have some
>>>> sort of programming guide but not oriented to pipeline writers but to
>>>> sdk/runner/io/... writers.
>>>>
>>>> I know that new contributors have the docs available in the google
>>>> drive, the ML, the code base, and the availability of beamers, but
>>>> maybe having key points in a common place (like FAQ for
>>>> sdk/runner/io/... writers, for example) would be interesting.
>>>>
>>>> Best,
>>>>
>>>> Etienne
>>>>
>>>>
>>>> On 24/04/2017 at 09:14, Jean-Baptiste Onofré wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I think we already tag the newbie jira ("low hanging fruit" ;)).
>>>>>
>>>>> Good idea for domain of interest/concept.
>>>>>
>>>>> Regards
>>>>> JB
>>>>>
>>>>> On 04/24/2017 09:01 AM, Ankur Chauhan wrote:
>>>>>
>>>>>> Might I suggest adding tags to projects based on area of interest,
>>>>>> concept and if it's a good "first bug".
>>>>>>
>>>>>> Sent from my iPhone
>>>>>>
>>>>>> On Apr 23, 2017, at 23:03, Davor Bonaci wrote:
>>>>>>
>>>>>>>> 1. Have people unassign themselves from issues they're not actively
>>>>>>>> working on.
>>>>>>>> 2. Have the community engage more in triage, improving tickets
>>>>>>>> descriptions and raising concerns.
>>>>>>>> 3. Clean house - apply (2) to currently open issues (over 800).
>>>>>>>> Perhaps some can be closed.
>>>>>>>
>>>>>>> +1 on all three of these, and will do my part shortly!
>>>>>>>
>>>>>>> Also, it is worth noting that we have improved as a project in
>>>>>>> tracking issues in the last 1-2 months. There are more resolved
>>>>>>> issues than opened in this period, whereas in the past we'd have a
>>>>>>> hundred more opened than resolved.
>>>>>>>
>>>>>>>> I would also propose to not assign new Jira automatically: now, the
>>>>>>>> Jira is automatically assigned to the Jira component leader.
>>>>>>>
>>>>>>> Imagine a user discovering an issue and filing a new JIRA issue. It
>>>>>>> wouldn't be assigned to anyone, significantly reducing the chance
>>>>>>> somebody will actually help.
>>>>>>>
>>>>>>> Of course, somebody could search for new issues periodically, etc.
>>>>>>> -- but that just won't happen. The final outcome would be -- instead
>>>>>>> of a lot of issues assigned to component leads, we'd have (much)
>>>>>>> more unassigned issues, which were *never* looked at. Assigning an
>>>>>>> issue just sets a community expectation that a committer should look
>>>>>>> -- and it does help move things along!
>>>>>>>
>>>>>>> I think a better approach of addressing the current state would be
>>>>>>> to increase the number of components / component leads. With more
>>>>>>> people involved and lower per-person load, I think we'd be more
>>>>>>> effective.
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>



-- 

Mingmin


Re: Apache: Big Data North America 2017 conference

2017-04-24 Thread Aljoscha Krettek
I’ll also be there. Looking forward to meeting all of you!

Best,
Aljoscha
> On 24. Apr 2017, at 09:11, Davor Bonaci  wrote:
> 
> Apache Beam will be prominently featured at the upcoming Apache: Big Data
> North America 2017 conference in Miami, FL [1].
> 
> Scheduled talks:
> * Using Apache Beam for Batch, Streaming, and Everything in Between
>  Speakers: Frances Perry, and Dan Halperin
> 
> * Apache Beam: Integrating the Big Data Ecosystem Up, Down, and Sideways
>  Speakers: Davor Bonaci, and Jean-Baptiste Onofré
> 
> * Concrete Big Data Use Cases Implemented with Apache Beam
>  Speaker: Jean-Baptiste Onofré
> 
> * Nexmark, a Unified Framework to Evaluate Big Data Processing Systems with
> Apache Beam
>  Speakers: Ismael Mejia, and Etienne Chauchot
> 
> In addition to talks, we plan to organize a birds-of-a-feather session and
> social activities. Details TBD.
> 
> Everybody is welcome -- users and contributors alike! Feel free to use code
> ABDSP20 to get 20% off registration fee and/or your committer code (if
> applicable).
> 
> I hope to see many of you there!
> 
> Davor
> 
> [1] http://events.linuxfoundation.org/events/apache-big-data-north-america/



Re: [DISCUSSION] Encouraging more contributions

2017-04-24 Thread Jean-Baptiste Onofré

Got it. By experience on other Apache projects, it's really hard to maintain ;)

Regards
JB

On 04/24/2017 02:56 PM, Etienne Chauchot wrote:

> Hi JB,
>
> I was proposing a FAQ (or another form), not something about IDE setup.
> The FAQ could group in the same place Q/A like for example "what is a
> source, how do I use it to implement an IO"
>
> Etienne
>
> On 24/04/2017 at 14:19, Jean-Baptiste Onofré wrote:
>
>> Hi Etienne,
>>
>> What about the contribution guide ? I think it's covered in the IntelliJ
>> and Eclipse setup sections.
>>
>> Regards
>> JB
>>
>> On 04/24/2017 02:12 PM, Etienne Chauchot wrote:
>>
>>> Hi all,
>>>
>>> I definitely agree with everything that is said in this thread.
>>>
>>> I might suggest another good to have:
>>>
>>> to ease the work of a new contributor, it would be nice to have some
>>> sort of programming guide but not oriented to pipeline writers but to
>>> sdk/runner/io/... writers.
>>>
>>> I know that new contributors have the docs available in the google
>>> drive, the ML, the code base, and the availability of beamers, but maybe
>>> having key points in a common place (like FAQ for sdk/runner/io/...
>>> writers, for example) would be interesting.
>>>
>>> Best,
>>>
>>> Etienne
>>>
>>> On 24/04/2017 at 09:14, Jean-Baptiste Onofré wrote:
>>>
>>>> Hi,
>>>>
>>>> I think we already tag the newbie jira ("low hanging fruit" ;)).
>>>>
>>>> Good idea for domain of interest/concept.
>>>>
>>>> Regards
>>>> JB
>>>>
>>>> On 04/24/2017 09:01 AM, Ankur Chauhan wrote:
>>>>
>>>>> Might I suggest adding tags to projects based on area of interest,
>>>>> concept and if it's a good "first bug".
>>>>>
>>>>> Sent from my iPhone
>>>>>
>>>>> On Apr 23, 2017, at 23:03, Davor Bonaci wrote:
>>>>>
>>>>>>> 1. Have people unassign themselves from issues they're not actively
>>>>>>> working on.
>>>>>>> 2. Have the community engage more in triage, improving tickets
>>>>>>> descriptions and raising concerns.
>>>>>>> 3. Clean house - apply (2) to currently open issues (over 800).
>>>>>>> Perhaps some can be closed.
>>>>>>
>>>>>> +1 on all three of these, and will do my part shortly!
>>>>>>
>>>>>> Also, it is worth noting that we have improved as a project in
>>>>>> tracking issues in the last 1-2 months. There are more resolved
>>>>>> issues than opened in this period, whereas in the past we'd have a
>>>>>> hundred more opened than resolved.
>>>>>>
>>>>>>> I would also propose to not assign new Jira automatically: now, the
>>>>>>> Jira is automatically assigned to the Jira component leader.
>>>>>>
>>>>>> Imagine a user discovering an issue and filing a new JIRA issue. It
>>>>>> wouldn't be assigned to anyone, significantly reducing the chance
>>>>>> somebody will actually help.
>>>>>>
>>>>>> Of course, somebody could search for new issues periodically, etc. --
>>>>>> but that just won't happen. The final outcome would be -- instead of
>>>>>> a lot of issues assigned to component leads, we'd have (much) more
>>>>>> unassigned issues, which were *never* looked at. Assigning an issue
>>>>>> just sets a community expectation that a committer should look -- and
>>>>>> it does help move things along!
>>>>>>
>>>>>> I think a better approach of addressing the current state would be to
>>>>>> increase the number of components / component leads. With more people
>>>>>> involved and lower per-person load, I think we'd be more effective.

--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: [DISCUSSION] Encouraging more contributions

2017-04-24 Thread Etienne Chauchot

Hi JB,

I was proposing a FAQ (or another form), not something about IDE setup. 
The FAQ could group in the same place Q/A like for example "what is a 
source, how do I use it to implement an IO"


Etienne


On 24/04/2017 at 14:19, Jean-Baptiste Onofré wrote:
> Hi Etienne,
>
> What about the contribution guide ? I think it's covered in the IntelliJ
> and Eclipse setup sections.
>
> Regards
> JB
>
> On 04/24/2017 02:12 PM, Etienne Chauchot wrote:
>
>> Hi all,
>>
>> I definitely agree with everything that is said in this thread.
>>
>> I might suggest another good to have:
>>
>> to ease the work of a new contributor, it would be nice to have some sort
>> of programming guide but not oriented to pipeline writers but to
>> sdk/runner/io/... writers.
>>
>> I know that new contributors have the docs available in the google drive,
>> the ML, the code base, and the availability of beamers, but maybe having
>> key points in a common place (like FAQ for sdk/runner/io/... writers, for
>> example) would be interesting.
>>
>> Best,
>>
>> Etienne
>>
>> On 24/04/2017 at 09:14, Jean-Baptiste Onofré wrote:
>>
>>> Hi,
>>>
>>> I think we already tag the newbie jira ("low hanging fruit" ;)).
>>>
>>> Good idea for domain of interest/concept.
>>>
>>> Regards
>>> JB
>>>
>>> On 04/24/2017 09:01 AM, Ankur Chauhan wrote:
>>>
>>>> Might I suggest adding tags to projects based on area of interest,
>>>> concept and if it's a good "first bug".
>>>>
>>>> Sent from my iPhone
>>>>
>>>> On Apr 23, 2017, at 23:03, Davor Bonaci wrote:
>>>>
>>>>>> 1. Have people unassign themselves from issues they're not actively
>>>>>> working on.
>>>>>> 2. Have the community engage more in triage, improving tickets
>>>>>> descriptions and raising concerns.
>>>>>> 3. Clean house - apply (2) to currently open issues (over 800).
>>>>>> Perhaps some can be closed.
>>>>>
>>>>> +1 on all three of these, and will do my part shortly!
>>>>>
>>>>> Also, it is worth noting that we have improved as a project in
>>>>> tracking issues in the last 1-2 months. There are more resolved issues
>>>>> than opened in this period, whereas in the past we'd have a hundred
>>>>> more opened than resolved.
>>>>>
>>>>>> I would also propose to not assign new Jira automatically: now, the
>>>>>> Jira is automatically assigned to the Jira component leader.
>>>>>
>>>>> Imagine a user discovering an issue and filing a new JIRA issue. It
>>>>> wouldn't be assigned to anyone, significantly reducing the chance
>>>>> somebody will actually help.
>>>>>
>>>>> Of course, somebody could search for new issues periodically, etc. --
>>>>> but that just won't happen. The final outcome would be -- instead of a
>>>>> lot of issues assigned to component leads, we'd have (much) more
>>>>> unassigned issues, which were *never* looked at. Assigning an issue
>>>>> just sets a community expectation that a committer should look -- and
>>>>> it does help move things along!
>>>>>
>>>>> I think a better approach of addressing the current state would be to
>>>>> increase the number of components / component leads. With more people
>>>>> involved and lower per-person load, I think we'd be more effective.


Re: [DISCUSSION] Encouraging more contributions

2017-04-24 Thread Jean-Baptiste Onofré

Hi Etienne,

What about the contribution guide ? I think it's covered in the IntelliJ and 
Eclipse setup sections.


Regards
JB

On 04/24/2017 02:12 PM, Etienne Chauchot wrote:

> Hi all,
>
> I definitely agree with everything that is said in this thread.
>
> I might suggest another good to have:
>
> to ease the work of a new contributor, it would be nice to have some sort
> of programming guide but not oriented to pipeline writers but to
> sdk/runner/io/... writers.
>
> I know that new contributors have the docs available in the google drive,
> the ML, the code base, and the availability of beamers, but maybe having
> key points in a common place (like FAQ for sdk/runner/io/... writers, for
> example) would be interesting.
>
> Best,
>
> Etienne
>
> On 24/04/2017 at 09:14, Jean-Baptiste Onofré wrote:
>
>> Hi,
>>
>> I think we already tag the newbie jira ("low hanging fruit" ;)).
>>
>> Good idea for domain of interest/concept.
>>
>> Regards
>> JB
>>
>> On 04/24/2017 09:01 AM, Ankur Chauhan wrote:
>>
>>> Might I suggest adding tags to projects based on area of interest,
>>> concept and if it's a good "first bug".
>>>
>>> Sent from my iPhone
>>>
>>> On Apr 23, 2017, at 23:03, Davor Bonaci wrote:
>>>
>>>>> 1. Have people unassign themselves from issues they're not actively
>>>>> working on.
>>>>> 2. Have the community engage more in triage, improving tickets
>>>>> descriptions and raising concerns.
>>>>> 3. Clean house - apply (2) to currently open issues (over 800).
>>>>> Perhaps some can be closed.
>>>>
>>>> +1 on all three of these, and will do my part shortly!
>>>>
>>>> Also, it is worth noting that we have improved as a project in tracking
>>>> issues in the last 1-2 months. There are more resolved issues than
>>>> opened in this period, whereas in the past we'd have a hundred more
>>>> opened than resolved.
>>>>
>>>>> I would also propose to not assign new Jira automatically: now, the
>>>>> Jira is automatically assigned to the Jira component leader.
>>>>
>>>> Imagine a user discovering an issue and filing a new JIRA issue. It
>>>> wouldn't be assigned to anyone, significantly reducing the chance
>>>> somebody will actually help.
>>>>
>>>> Of course, somebody could search for new issues periodically, etc. --
>>>> but that just won't happen. The final outcome would be -- instead of a
>>>> lot of issues assigned to component leads, we'd have (much) more
>>>> unassigned issues, which were *never* looked at. Assigning an issue
>>>> just sets a community expectation that a committer should look -- and
>>>> it does help move things along!
>>>>
>>>> I think a better approach of addressing the current state would be to
>>>> increase the number of components / component leads. With more people
>>>> involved and lower per-person load, I think we'd be more effective.


--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: [DISCUSSION] Encouraging more contributions

2017-04-24 Thread Etienne Chauchot

Hi all,

I definitely agree with everything that is said in this thread.

I might suggest another good to have:

to ease the work of a new contributor, it would be nice to have some 
sort of programming guide but not oriented to pipeline writers but to 
sdk/runner/io/... writers.


I know that new contributors have the docs available in the google 
drive, the ML, the code base, and the availability of beamers, but maybe 
having key points in a common place (like FAQ for sdk/runner/io/... 
writers, for example) would be interesting.


Best,

Etienne


On 24/04/2017 at 09:14, Jean-Baptiste Onofré wrote:
> Hi,
>
> I think we already tag the newbie jira ("low hanging fruit" ;)).
>
> Good idea for domain of interest/concept.
>
> Regards
> JB
>
> On 04/24/2017 09:01 AM, Ankur Chauhan wrote:
>
>> Might I suggest adding tags to projects based on area of interest,
>> concept and if it's a good "first bug".
>>
>> Sent from my iPhone
>>
>> On Apr 23, 2017, at 23:03, Davor Bonaci wrote:
>>
>>>> 1. Have people unassign themselves from issues they're not actively
>>>> working on.
>>>> 2. Have the community engage more in triage, improving tickets
>>>> descriptions and raising concerns.
>>>> 3. Clean house - apply (2) to currently open issues (over 800).
>>>> Perhaps some can be closed.
>>>
>>> +1 on all three of these, and will do my part shortly!
>>>
>>> Also, it is worth noting that we have improved as a project in tracking
>>> issues in the last 1-2 months. There are more resolved issues than
>>> opened in this period, whereas in the past we'd have a hundred more
>>> opened than resolved.
>>>
>>>> I would also propose to not assign new Jira automatically: now, the
>>>> Jira is automatically assigned to the Jira component leader.
>>>
>>> Imagine a user discovering an issue and filing a new JIRA issue. It
>>> wouldn't be assigned to anyone, significantly reducing the chance
>>> somebody will actually help.
>>>
>>> Of course, somebody could search for new issues periodically, etc. --
>>> but that just won't happen. The final outcome would be -- instead of a
>>> lot of issues assigned to component leads, we'd have (much) more
>>> unassigned issues, which were *never* looked at. Assigning an issue just
>>> sets a community expectation that a committer should look -- and it does
>>> help move things along!
>>>
>>> I think a better approach of addressing the current state would be to
>>> increase the number of components / component leads. With more people
>>> involved and lower per-person load, I think we'd be more effective.


Re: Community hackathon

2017-04-24 Thread JingsongLee
+1
best,
Jingsonglee
--
From: Ted Yu
Time: 2017 Apr 24 (Mon) 17:29
To: dev
Subject: Re: Community hackathon
+1

> On Apr 24, 2017, at 12:51 AM, Jean-Baptiste Onofré  wrote:
> 
> That's a wonderful idea !
> 
> I think the easiest way to organize this event is using the Slack channels to 
> discuss, help each other, and sync together.
> 
> Regards
> JB
> 
>> On 04/24/2017 09:48 AM, Davor Bonaci wrote:
>> We've been working as a community towards the first stable release for a
>> while now, and I think we made a ton of progress across the board over the
>> last few weeks.
>> 
>> We could try to organize a community-wide hackathon to identify and fix
>> those last few issues, as well as to get a better sense of the overall
>> project quality as it stands right now.
>> 
>> This could be a self-organized event, and coordinated via the Slack
>> channel. For example, we (as a community and participants) can try out the
>> project in various ways -- quickstart, examples, different runners,
>> different platforms -- immediately fixing issues as we run into them. It
>> could last, say, 24 hours, with people from different time zones
>> participating at the time of their choosing.
>> 
>> Thoughts?
>> 
>> Davor
> 
> -- 
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com


Re: Community hackathon

2017-04-24 Thread Ted Yu
+1

> On Apr 24, 2017, at 12:51 AM, Jean-Baptiste Onofré  wrote:
> 
> That's a wonderful idea !
> 
> I think the easiest way to organize this event is using the Slack channels to 
> discuss, help each other, and sync together.
> 
> Regards
> JB
> 
>> On 04/24/2017 09:48 AM, Davor Bonaci wrote:
>> We've been working as a community towards the first stable release for a
>> while now, and I think we made a ton of progress across the board over the
>> last few weeks.
>> 
>> We could try to organize a community-wide hackathon to identify and fix
>> those last few issues, as well as to get a better sense of the overall
>> project quality as it stands right now.
>> 
>> This could be a self-organized event, and coordinated via the Slack
>> channel. For example, we (as a community and participants) can try out the
>> project in various ways -- quickstart, examples, different runners,
>> different platforms -- immediately fixing issues as we run into them. It
>> could last, say, 24 hours, with people from different time zones
>> participating at the time of their choosing.
>> 
>> Thoughts?
>> 
>> Davor
> 
> -- 
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com


Re: [DISCUSSION] Encouraging more contributions

2017-04-24 Thread Jean-Baptiste Onofré

Hi Ismaël,

Honestly, for 4, I think it's not so bad and we clearly improved in the past 
months. It's definitely an area where we have to keep improving, but I think we 
do a good job (especially compared to other projects).


For 5, agree. For example, I limit myself to 3 or 4 pull requests: that's why I 
have more than 10 local branches waiting.


Regards
JB

On 04/23/2017 10:16 PM, Ismaël Mejía wrote:

+1 Great idea Aviem, thanks for bringing this subject to the mailing list.

I agree in particular with the freeing JIRA part, I think we shouldn’t
keep assigned JIRAs that are things that we don’t expect to solve in
the next weeks. (note the exception for this are the long features).

I would add two more issues.

4. We need to react and review code faster for new contributors and
help them as much as we can.

I know that this one implies extra work but I have seen many times
people asking for reviews days after they create a PR and even worse,
people who have not been able to merge their changes because they were
dealing with a long code review and then a different PR already
included changes that fixed the same issue.

5. We should try to keep the number of open pull requests low.

Our average number of open Pull Requests is continuously increasing
(the current average is 70). Some PRs are in open discussion, but
some have clearly stagnated. Maybe we should have a deadline: if no
discussion or improvement has happened in the last month, we close
the PR, and if there is still interest it can be re-opened.

The ‘good news’ is that we have 350 unassigned unresolved issues that
anyone can take. This is a good improvement, but I agree that we can do
better.

Ismaël


On Sun, Apr 23, 2017 at 6:32 AM, Jean-Baptiste Onofré  wrote:

Hi,

As we already discussed this, +1.

I would also propose to not assign new Jira automatically: now, the Jira is
automatically assigned to the Jira component leader.

Regards
JB


On 04/22/2017 04:31 PM, Aviem Zur wrote:


Hi all,

I wanted to start a discussion about actions we can take to encourage more
contributions to the project.

A few points I've been thinking of:

1. Have people unassign themselves from issues they're not actively
working on.
2. Have the community engage more in triage, improving tickets
descriptions and raising concerns.
3. Clean house - apply (2) to currently open issues (over 800). Perhaps
some can be closed.

Thoughts? Ideas?



--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com




Re: [DISCUSSION] Encouraging more contributions

2017-04-24 Thread Ismaël Mejía
+1 Great idea Aviem, thanks for bringing this subject to the mailing list.

I agree in particular with the freeing JIRA part, I think we shouldn’t
keep assigned JIRAs that are things that we don’t expect to solve in
the next weeks. (note the exception for this are the long features).

I would add two more issues.

4. We need to react and review code faster for new contributors and
help them as much as we can.

I know that this one implies extra work but I have seen many times
people asking for reviews days after they create a PR and even worse,
people who have not been able to merge their changes because they were
dealing with a long code review and then a different PR already
included changes that fixed the same issue.

5. We should try to keep the number of open pull requests low.

Our average number of open Pull Requests is continuously increasing
(the current average is 70). Some PRs are in open discussion, but
some have clearly stagnated. Maybe we should have a deadline: if no
discussion or improvement has happened in the last month, we close
the PR, and if there is still interest it can be re-opened.

The ‘good news’ is that we have 350 unassigned unresolved issues that
anyone can take. This is a good improvement, but I agree that we can do
better.

Ismaël


On Sun, Apr 23, 2017 at 6:32 AM, Jean-Baptiste Onofré  wrote:
> Hi,
>
> As we already discussed this, +1.
>
> I would also propose to not assign new Jira automatically: now, the Jira is
> automatically assigned to the Jira component leader.
>
> Regards
> JB
>
>
> On 04/22/2017 04:31 PM, Aviem Zur wrote:
>>
>> Hi all,
>>
>> I wanted to start a discussion about actions we can take to encourage more
>> contributions to the project.
>>
>> A few points I've been thinking of:
>>
>> 1. Have people unassign themselves from issues they're not actively
>> working on.
>> 2. Have the community engage more in triage, improving tickets
>> descriptions and raising concerns.
>> 3. Clean house - apply (2) to currently open issues (over 800). Perhaps
>> some can be closed.
>>
>> Thoughts? Ideas?
>>
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com


Re: Community hackathon

2017-04-24 Thread Jean-Baptiste Onofré

That's a wonderful idea !

I think the easiest way to organize this event is using the Slack channels to 
discuss, help each other, and sync together.


Regards
JB

On 04/24/2017 09:48 AM, Davor Bonaci wrote:

We've been working as a community towards the first stable release for a
while now, and I think we made a ton of progress across the board over the
last few weeks.

We could try to organize a community-wide hackathon to identify and fix
those last few issues, as well as to get a better sense of the overall
project quality as it stands right now.

This could be a self-organized event, and coordinated via the Slack
channel. For example, we (as a community and participants) can try out the
project in various ways -- quickstart, examples, different runners,
different platforms -- immediately fixing issues as we run into them. It
could last, say, 24 hours, with people from different time zones
participating at the time of their choosing.

Thoughts?

Davor



--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Community hackathon

2017-04-24 Thread Davor Bonaci
We've been working as a community towards the first stable release for a
while now, and I think we made a ton of progress across the board over the
last few weeks.

We could try to organize a community-wide hackathon to identify and fix
those last few issues, as well as to get a better sense of the overall
project quality as it stands right now.

This could be a self-organized event, and coordinated via the Slack
channel. For example, we (as a community and participants) can try out the
project in various ways -- quickstart, examples, different runners,
different platforms -- immediately fixing issues as we run into them. It
could last, say, 24 hours, with people from different time zones
participating at the time of their choosing.

Thoughts?

Davor


Re: [DISCUSSION] Encouraging more contributions

2017-04-24 Thread Jean-Baptiste Onofré

Hi,

I think we already tag the newbie jira ("low hanging fruit" ;)).

Good idea for domain of interest/concept.

Regards
JB

On 04/24/2017 09:01 AM, Ankur Chauhan wrote:

Might I suggest adding tags to projects based on area of interest, concept and if it's a 
good "first bug".

Sent from my iPhone

On Apr 23, 2017, at 23:03, Davor Bonaci  wrote:



1. Have people unassign themselves from issues they're not actively
working on.
2. Have the community engage more in triage, improving tickets
descriptions and raising concerns.
3. Clean house - apply (2) to currently open issues (over 800). Perhaps
some can be closed.



+1 on all three of these, and will do my part shortly!

Also, it is worth noting that we have improved as a project in tracking
issues in the last 1-2 months. There are more resolved issues than opened
in this period, whereas in the past we'd have a hundred more opened than
resolved.

I would also propose to not assign new Jira automatically: now, the Jira is
automatically assigned to the Jira component leader.



Imagine a user discovering an issue and filing a new JIRA issue. It
wouldn't be assigned to anyone, significantly reducing the chance somebody
will actually help.

Of course, somebody could search for new issues periodically, etc. -- but
that just won't happen. The final outcome would be -- instead of a lot of
issues assigned to component leads, we'd have (much) more unassigned
issues, which were *never* looked at. Assigning an issue just sets a
community expectation that a committer should look -- and it does help move
things along!

I think a better approach to addressing the current state would be to increase
the number of components / component leads. With more people involved and
lower per-person load, I think we'd be more effective.


--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Apache: Big Data North America 2017 conference

2017-04-24 Thread Davor Bonaci
Apache Beam will be prominently featured at the upcoming Apache: Big Data
North America 2017 conference in Miami, FL [1].

Scheduled talks:
* Using Apache Beam for Batch, Streaming, and Everything in Between
  Speakers: Frances Perry, and Dan Halperin

* Apache Beam: Integrating the Big Data Ecosystem Up, Down, and Sideways
  Speakers: Davor Bonaci, and Jean-Baptiste Onofré

* Concrete Big Data Use Cases Implemented with Apache Beam
  Speaker: Jean-Baptiste Onofré

* Nexmark, a Unified Framework to Evaluate Big Data Processing Systems with
Apache Beam
  Speakers: Ismael Mejia, and Etienne Chauchot

In addition to talks, we plan to organize a birds-of-a-feather session and
social activities. Details TBD.

Everybody is welcome -- users and contributors alike! Feel free to use code
ABDSP20 to get 20% off registration fee and/or your committer code (if
applicable).

I hope to see many of you there!

Davor

[1] http://events.linuxfoundation.org/events/apache-big-data-north-america/


Re: [DISCUSSION] Encouraging more contributions

2017-04-24 Thread Ankur Chauhan
Might I suggest adding tags to projects based on area of interest, concept and 
if it's a good "first bug". 

Sent from my iPhone

On Apr 23, 2017, at 23:03, Davor Bonaci  wrote:

>> 
>> 1. Have people unassign themselves from issues they're not actively
>> working on.
>> 2. Have the community engage more in triage, improving tickets
>> descriptions and raising concerns.
>> 3. Clean house - apply (2) to currently open issues (over 800). Perhaps
>> some can be closed.
>> 
> 
> +1 on all three of these, and will do my part shortly!
> 
> Also, it is worth noting that we have improved as a project in tracking
> issues in the last 1-2 months. There are more resolved issues than opened
> in this period, whereas in the past we'd have a hundred more opened than
> resolved.
> 
>> I would also propose to not assign new Jira automatically: now, the Jira is
>> automatically assigned to the Jira component leader.
>> 
> 
> Imagine a user discovering an issue and filing a new JIRA issue. It
> wouldn't be assigned to anyone, significantly reducing the chance somebody
> will actually help.
> 
> Of course, somebody could search for new issues periodically, etc. -- but
> that just won't happen. The final outcome would be -- instead of a lot of
> issues assigned to component leads, we'd have (much) more unassigned
> issues, which were *never* looked at. Assigning an issue just sets a
> community expectation that a committer should look -- and it does help move
> things along!
> 
> I think a better approach to addressing the current state would be to increase
> the number of components / component leads. With more people involved and
> lower per-person load, I think we'd be more effective.