Re: [ANNOUNCEMENT] New Beam chair: Kenneth Knowles

2018-09-20 Thread Huygaa Batsaikhan
Congrats Kenn!

On Thu, Sep 20, 2018 at 8:47 AM Pablo Estrada  wrote:

> Thanks Davor for your work on Beam thus far! And congrats Kenn : )
> Best
> -P.
>
> On Thu, Sep 20, 2018 at 2:38 AM Robert Bradshaw 
> wrote:
>
>> Congratulations Kenn! And thank you, Davor, for the hard work you've put
>> in these last several years.
>>
>> On Thu, Sep 20, 2018 at 9:50 AM Tim Robertson 
>> wrote:
>>
>>> Thank you to Davor and all the PMC - I can only imagine how much work it has
>>> been to get Beam to where it is today.
>>>
>>> Congratulations Kenn!
>>>
>>> On Thu, Sep 20, 2018 at 1:05 AM Tyler Akidau  wrote:
>>>
 Thanks Davor, and congrats Kenn!

 -Tyler

 On Wed, Sep 19, 2018 at 2:43 PM Yifan Zou  wrote:

> Congratulations Kenn!
>
> On Wed, Sep 19, 2018 at 2:36 PM Robert Burke 
> wrote:
>
>> Congrats Kenn! :D
>>
>> On Wed, Sep 19, 2018, 2:21 PM Ismaël Mejía  wrote:
>>
>>> Congratulations and welcome Kenn as new chair!
>>> Thanks Davor for your hard work too.
>>>
>>> On Wed, Sep 19, 2018 at 11:14 PM Rui Wang  wrote:
>>>
 Congrats!

 -Rui

 On Wed, Sep 19, 2018 at 2:12 PM Chamikara Jayalath <
 chamik...@google.com> wrote:

> Congrats!
>
> On Wed, Sep 19, 2018 at 2:05 PM Ahmet Altay 
> wrote:
>
>> Congratulations, Kenn! And thank you Davor.
>>
>> On Wed, Sep 19, 2018 at 1:44 PM, Anton Kedin 
>> wrote:
>>
>>> Congrats!
>>>
>>> On Wed, Sep 19, 2018 at 1:36 PM Ankur Goenka 
>>> wrote:
>>>
 Congrats Kenn!

 On Wed, Sep 19, 2018 at 1:35 PM Amit Sela 
 wrote:

> Well deserved! Congrats Kenn.
>
> On Wed, Sep 19, 2018 at 4:25 PM Kai Jiang 
> wrote:
>
>> Congrats, Kenn!
>>
>> On Wed, Sep 19, 2018 at 1:23 PM Alan Myrvold <
>> amyrv...@google.com> wrote:
>>
>>> Congrats, Kenn.
>>>
>>> On Wed, Sep 19, 2018 at 1:08 PM Maximilian Michels <
>>> m...@apache.org> wrote:
>>>
 Congrats!

 On 19.09.18 22:07, Robin Qiu wrote:
 > Congratulations, Kenn!
 >
 > On Wed, Sep 19, 2018 at 1:05 PM Lukasz Cwik <lc...@google.com> wrote:
 >
 > Congrats Kenn.
 >
 > On Wed, Sep 19, 2018 at 12:54 PM Davor Bonaci <da...@apache.org> wrote:
 >
 > Hi everyone --
 > It is with great pleasure that I announce that at today's meeting of
 > the Foundation's Board of Directors, the Board has appointed Kenneth
 > Knowles as the second chair of the Apache Beam project.
 >
 > Kenn has served on the PMC since its inception, and is very active and
 > effective in growing the community. His exemplary posts have been cited
 > in other projects. I'm super happy that Kenn accepted the nomination,
 > and I'm confident that he'll serve with distinction.
 >
 > As for myself, I'm not going anywhere. I'm still around and will be as
 > active as I have recently been. Thrilled to be able to pass the baton
 > to such a key member of this community and to have less administrative
 > work to do ;-).
 >
 > Please join me in welcoming Kenn to his new role, and I ask that you
 > support him as much as possible. As always, please let me know if you
 > have any questions.
 >
 > Davor
 >

>>>
>>


Re: [Help wanted] Fixing beam_PerformanceTests_Python

2018-09-17 Thread Huygaa Batsaikhan
Thanks, please keep the bug updated.

On Mon, Sep 17, 2018 at 10:56 AM Ahmet Altay  wrote:

> I talked with Mark. His PR (#6392) might help. He also has a few more ideas
> for debugging. If it does not work, I will work with Mark to resolve this.
>
> On Mon, Sep 17, 2018 at 10:45 AM, Huygaa Batsaikhan 
> wrote:
>
>>
>>
>> On Mon, Sep 17, 2018 at 10:42 AM Huygaa Batsaikhan 
>> wrote:
>>
>>> Hi devs, Python performance tests have been failing for a while due to
>>> incompatible packages in dependencies. Could anyone fix this issue? Here is
>>> the Jira
>>> <https://issues.apache.org/jira/browse/BEAM-5334?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20resolution%20%3D%20Unresolved%20AND%20component%20%3D%20test-failures%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC>
>>>  link.
>>> Thanks.
>>>
>>> Failure link:
>>> https://builds.apache.org/job/beam_PerformanceTests_Python/1436/console
>>>
>>
>
>


Re: [Help wanted] Fixing beam_PerformanceTests_Python

2018-09-17 Thread Huygaa Batsaikhan
On Mon, Sep 17, 2018 at 10:42 AM Huygaa Batsaikhan 
wrote:

> Hi devs, Python performance tests have been failing for a while due to
> incompatible packages in dependencies. Could anyone fix this issue? Here is
> the Jira
> <https://issues.apache.org/jira/browse/BEAM-5334?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20resolution%20%3D%20Unresolved%20AND%20component%20%3D%20test-failures%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC>
>  link.
> Thanks.
>
> Failure link:
> https://builds.apache.org/job/beam_PerformanceTests_Python/1436/console
>


[Help wanted] Fixing beam_PerformanceTests_Python

2018-09-17 Thread Huygaa Batsaikhan
Hi devs, Python performance tests have been failing for a while due to
incompatible packages in dependencies. Could anyone fix this issue? Here is
the Jira link.
Thanks.

Failure link:
https://builds.apache.org/job/beam_PerformanceTests_Python/1436/console


Python performance tests have been broken for a while. [BEAM-5334]

2018-09-14 Thread Huygaa Batsaikhan
Hi devs, regarding BEAM-5334, beam_PerformanceTests_Python has been broken
for a really long time. Anyone interested in picking it up? Here is the
history of the test.
Thanks


How do we run pipeline using gradle?

2018-08-15 Thread Huygaa Batsaikhan
When we run wordcount using Maven, we pass the "-P dataflow-runner" profile to
set the runner. What is the equivalent of this in Gradle? In other words,
how can I run wordcount straight from my Beam repo code?


Re: Implementing @OnWindowExpiration in StatefulParDo [BEAM-1589]

2018-08-14 Thread Huygaa Batsaikhan
Finally, I have a PR <https://github.com/apache/beam/pull/4482> for the
annotation itself. Anyone up for reviewing it? Ken has been helping me, but
he is going to be OOO for a while.

On Tue, Mar 20, 2018 at 4:23 PM Huygaa Batsaikhan  wrote:

> As echauchot@ mentioned, it will make it easier and error-free.
>
>
> On Mon, Mar 19, 2018 at 11:59 PM Romain Manni-Bucau 
> wrote:
>
>> Hi Huygaa,
>>
>> Can't it be predefined timers?
>>
>> Romain
>>
>> On Mar 20, 2018 at 00:52, "Huygaa Batsaikhan" wrote:
>>
>> Hi everyone, I am working on BEAM-1589
>> <https://issues.apache.org/jira/browse/BEAM-1589>. In short, currently,
>> there is no default way of saving/flushing state before a window is garbage
>> collected.
>>
>> My current plan is to provide a method annotation, @OnWindowExpiration,
>> which allows a user-provided callback function to be executed before garbage
>> collection. This annotation behaves very similarly to @OnTimer; therefore,
>> the implementation will mostly be a copy of the OnTimer code. Let me know if
>> you have any considerations or suggestions.
>>
>> Here is an example usage:
>> ```
>> @OnWindowExpiration
>> public void myCleanupFunction(OnWindowExpirationContext c, State state) {
>>   c.output(state.read());
>> }
>> ```
>>
>> Thanks, Huygaa
>>
>>
>>
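
The idea behind the @OnWindowExpiration proposal quoted above can be modeled
without any Beam dependency. What follows is only a toy sketch in Python: the
WindowedBuffer class and the on_window_expiration callback name are
hypothetical and are not Beam API. Values are buffered per window, and the
callback receives the buffered state once the watermark reaches the window end
plus the allowed lateness, right before that state is discarded. The exact
expiry condition in a real runner is more subtle than the comparison used here.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Window = Tuple[int, int]  # (start, end) timestamps, end exclusive


class WindowedBuffer:
    """Toy stand-in for per-window state in a stateful ParDo (hypothetical, not Beam API)."""

    def __init__(self, on_window_expiration: Callable[[Window, List[str]], None],
                 allowed_lateness: int = 0) -> None:
        self._state: Dict[Window, List[str]] = defaultdict(list)
        self._on_window_expiration = on_window_expiration
        self._allowed_lateness = allowed_lateness

    def process(self, window: Window, element: str) -> None:
        # Buffer the element in the state cell of its window.
        self._state[window].append(element)

    def advance_watermark(self, watermark: int) -> None:
        # Simplified rule: a window expires once the watermark reaches
        # its end plus the allowed lateness.
        expired = [w for w in self._state
                   if w[1] + self._allowed_lateness <= watermark]
        for window in sorted(expired):
            # Give the user callback a last chance to flush the state ...
            self._on_window_expiration(window, list(self._state[window]))
            # ... and only then garbage collect it.
            del self._state[window]

    def live_windows(self) -> List[Window]:
        return sorted(self._state)


flushed: List[Tuple[Window, List[str]]] = []
buffer = WindowedBuffer(lambda w, values: flushed.append((w, values)))
buffer.process((0, 10), "a")
buffer.process((0, 10), "b")
buffer.process((10, 20), "c")
buffer.advance_watermark(10)
```

After the watermark advances to 10, the first window has been flushed through
the callback and removed, while the second window is still buffered.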


Re: [VOTE] Community Examples Repository

2018-08-08 Thread Huygaa Batsaikhan
2 - I like the idea of having a separate repo where we can have more
freedom to check in examples. However, we benefit from having immediate
core examples in Beam for testing purposes.

On Wed, Aug 8, 2018 at 9:38 AM David Cavazos  wrote:

> Hi everyone!
>
> We discussed several options as well as some of the implications of each
> option. Please vote for your favorite option, feel free to back it up with
> any reasons that make you feel that way.
>
> 1) Move *all* samples to a *new examples repository*
> 2) Move *some* samples to a *new examples repository*
> 3) Leave samples where they are
>
> Some implications to creating a new repository:
> - Every example would be independent from every other example, so tests
> can be run in parallel
> - Examples would now show how to use Beam *externally*
> - The examples repository would need a testing infrastructure
> - Decoupling makes examples easier to test on different versions
> - Easier to copy-paste an existing example and start from there, almost
> like a template
> - Smaller size for the core Beam library
> - Two different repositories to maintain
> - Versioning could mirror Beam's current version
>
> Link to proposal
> 
>


Re: [DISCUSSION] Tracking & Visualizing various metrics of the Beam community

2018-08-07 Thread Huygaa Batsaikhan
Thanks Udi, I am working on spinning up a test instance.

On Tue, Aug 7, 2018 at 11:31 AM Udi Meiri  wrote:

> These tables look very cool!
>
> I certainly don't object to using tools made by another organization. The
> only issue might be compatibility with our process.
>
> From my understanding of the Kubernetes review process, they use Github's
> tagging feature to specify PR statuses such as LGTM and approval.
> Random example: https://github.com/kubernetes/kubernetes/pull/67089
> I'm not sure that statistics such as pr-time-to-approve-and-merge would be
> captured at all on apache/beam since we don't use these tags.
>
> I would definitely like to see how useful this tool is regardless, and
> since it doesn't seem to require any special permissions it should be
> simple to set up a test instance.
>
>
> On Tue, Aug 7, 2018 at 11:24 AM Huygaa Batsaikhan 
> wrote:
>
>> tl;dr - are there any objections to trying out the DevStats tool created
>> by CNCF?
>>
>> On Mon, Aug 6, 2018 at 3:43 PM Huygaa Batsaikhan 
>> wrote:
>>
>>> Continuing the discussion
>>> <https://lists.apache.org/thread.html/6138d08c551e254b5f13b26c6ba06579a49a4694f4d13ad6d164689a@%3Cdev.beam.apache.org%3E>
>>> about improving Beam code review, I am looking into visualizing various
>>> helpful Beam community metrics such as code velocity, reviewer load, and
>>> new contributor's engagement.
>>>
>>> So far, I found DevStats
>>> <https://k8s.devstats.cncf.io/d/12/dashboards?refresh=15m=1>, an
>>> open source (github <https://github.com/cncf/devstats>) dashboarding
>>> tool used by Kubernetes, seems to provide almost everything we need. For
>>> example, they have dashboards for metrics such as:
>>>
>>>    - Time to approve or merge
>>>      <https://k8s.devstats.cncf.io/d/44/pr-time-to-approve-and-merge?orgId=1>
>>>    - PR Time to engagement
>>>      <https://k8s.devstats.cncf.io/d/14/pr-time-to-engagment?orgId=1>
>>>    - New and Episodic PR contributors
>>>      <https://prometheus.devstats.cncf.io/d/14/new-and-episodic-pr-contributors?orgId=1>
>>>    - PR reviews by contributor
>>>      <https://k8s.devstats.cncf.io/d/46/pr-reviews-by-contributor?orgId=1>
>>>    - Company statistics
>>>
>>> It would be really cool if we can try it out for Beam. I don't have
>>> much experience using open source projects. From what I understand:
>>> DevStats is developed by CNCF <https://www.cncf.io/> and they manage
>>> their incubator projects' dashboard. Since Beam is not part of the CNCF, in
>>> order to use DevStats, we have to fork the project and maintain it
>>> ourselves.
>>>
>>> 1. What do you think about using DevStats for Beam? Do you know how it
>>> is usually done?
>>> 2. If you are not sure about DevStats, do you know any other tool which
>>> could help us track & visualize Beam metrics?
>>>
>>> Thanks, Huygaa
>>>
>>


Re: [DISCUSSION] Tracking & Visualizing various metrics of the Beam community

2018-08-07 Thread Huygaa Batsaikhan
tl;dr - are there any objections to trying out the DevStats tool created by
CNCF?

On Mon, Aug 6, 2018 at 3:43 PM Huygaa Batsaikhan  wrote:

> Continuing the discussion
> <https://lists.apache.org/thread.html/6138d08c551e254b5f13b26c6ba06579a49a4694f4d13ad6d164689a@%3Cdev.beam.apache.org%3E>
> about improving Beam code review, I am looking into visualizing various
> helpful Beam community metrics such as code velocity, reviewer load, and
> new contributor's engagement.
>
> So far, I found DevStats
> <https://k8s.devstats.cncf.io/d/12/dashboards?refresh=15m=1>, an
> open source (github <https://github.com/cncf/devstats>) dashboarding
> tool used by Kubernetes, seems to provide almost everything we need. For
> example, they have dashboards for metrics such as:
>
>    - Time to approve or merge
>      <https://k8s.devstats.cncf.io/d/44/pr-time-to-approve-and-merge?orgId=1>
>    - PR Time to engagement
>      <https://k8s.devstats.cncf.io/d/14/pr-time-to-engagment?orgId=1>
>    - New and Episodic PR contributors
>      <https://prometheus.devstats.cncf.io/d/14/new-and-episodic-pr-contributors?orgId=1>
>    - PR reviews by contributor
>      <https://k8s.devstats.cncf.io/d/46/pr-reviews-by-contributor?orgId=1>
>    - Company statistics
>
> It would be really cool if we can try it out for Beam. I don't have much
> experience using open source projects. From what I understand: DevStats is
> developed by CNCF <https://www.cncf.io/> and they manage their incubator
> projects' dashboard. Since Beam is not part of the CNCF, in order to use
> DevStats, we have to fork the project and maintain it ourselves.
>
> 1. What do you think about using DevStats for Beam? Do you know how it is
> usually done?
> 2. If you are not sure about DevStats, do you know any other tool which
> could help us track & visualize Beam metrics?
>
> Thanks, Huygaa
>


[DISCUSSION] Tracking & Visualizing various metrics of the Beam community

2018-08-06 Thread Huygaa Batsaikhan
Continuing the discussion about improving Beam code review, I am looking into
visualizing various helpful Beam community metrics such as code velocity,
reviewer load, and new contributor's engagement.

So far, I found DevStats, an open source (github) dashboarding tool used by
Kubernetes, seems to provide almost everything we need. For example, they have
dashboards for metrics such as:

   - Time to approve or merge
   - PR Time to engagement
   - New and Episodic PR contributors
   - PR reviews by contributor
   - Company statistics

It would be really cool if we can try it out for Beam. I don't have much
experience using open source projects. From what I understand: DevStats is
developed by CNCF and they manage their incubator
projects' dashboard. Since Beam is not part of the CNCF, in order to use
DevStats, we have to fork the project and maintain it ourselves.

1. What do you think about using DevStats for Beam? Do you know how it is
usually done?
2. If you are not sure about DevStats, do you know any other tool which
could help us track & visualize Beam metrics?

Thanks, Huygaa


Re: Proof-of-concept Beam PR dashboard (based off of Spark's PR dashboard) to improve discoverability

2018-07-24 Thread Huygaa Batsaikhan
This is great. From the previous thread, the "whose turn" feature was a
popular request for the dashboard because it is hard to know whose attention
is needed at any moment.
How much effort is needed to implement such a feature on top of the dashboard?

On Fri, Jul 13, 2018 at 5:56 PM Holden Karau  wrote:

> Took me waaay longer than planned, and the regexes and components could use
> some work, but I've got a quick Beam PR dashboard up at
> https://boos-demo-projects-are-rad.appspot.com/. The code is a fork of
> the Spark one, and it's at
> https://github.com/holdenk/spark-pr-dashboard/tree/support-beam in the
> beam support branch. I don't know how useful this will be for folks, but given
> the discussion going on around CODEOWNERS I figured people were feeling the
> pain of trying to keep on top of reviews.
>
> I'm still working on trying to get mentionbot working (it's being a bit
> frustrating to upgrade to recent versions of dependencies as a non-JS
> programmer), but hopefully I can do something there too.
>
> If anyone has thoughts about what good tags would be for the review
> dashboard let me know, I just kicked it off with some tabs which I
> personally care about.
>
> Twitter: https://twitter.com/holdenkarau
>


Re: CODEOWNERS for apache/beam repo

2018-07-16 Thread Huygaa Batsaikhan
+1. This is great.

On Sat, Jul 14, 2018 at 7:44 AM Udi Meiri  wrote:

> Mention bot looks cool, as it tries to guess the reviewer using blame.
> I've written a quick and dirty script that uses only CODEOWNERS.
>
> Its output looks like:
> $ python suggest_reviewers.py --pr 5940
> INFO:root:Selected reviewer @lukecwik for:
> /runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/PTransformMatchers.java
> (path_pattern: /runners/core-construction-java*)
> INFO:root:Selected reviewer @lukecwik for:
> /runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/SplittableParDoNaiveBounded.java
> (path_pattern: /runners/core-construction-java*)
> INFO:root:Selected reviewer @echauchot for:
> /runners/core-java/src/main/java/org/apache/beam/runners/core/SplittableParDoViaKeyedWorkItems.java
> (path_pattern: /runners/core-java*)
> INFO:root:Selected reviewer @lukecwik for: /runners/flink/build.gradle
> (path_pattern: */build.gradle*)
> INFO:root:Selected reviewer @lukecwik for:
> /runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkTransformOverrides.java
> (path_pattern: *.java)
> INFO:root:Selected reviewer @pabloem for:
> /runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
> (path_pattern: /runners/google-cloud-dataflow-java*)
> INFO:root:Selected reviewer @lukecwik for:
> /sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/SplittableDoFnTest.java
> (path_pattern: /sdks/java/core*)
> Suggested reviewers: @echauchot, @lukecwik, @pabloem
>
> Script is in: https://github.com/apache/beam/pull/5951
>
>
> What does the community think? Do you prefer blame-based or rules-based
> reviewer suggestions?
>
> On Fri, Jul 13, 2018 at 11:13 AM Holden Karau 
> wrote:
>
>> I'm looking at something similar in the Spark project, and while it's now
>> archived by FB it seems like something like
>> https://github.com/facebookarchive/mention-bot might do what we want.
>> I'm going to spin up a version on my K8 cluster and see if I can ask infra
>> to add a webhook and if it works for Spark we could ask INFRA to add a
>> second webhook for Beam. (Or if the Beam folks are more interested in
>> experimenting I can do Beam first as a smaller project and roll with that).
>>
>> Let me know :)
>>
>> On Fri, Jul 13, 2018 at 10:53 AM, Eugene Kirpichov 
>> wrote:
>>
>>> Sounds reasonable for now, thanks!
>>> It's unfortunate that Github's CODEOWNERS feature appears to be
>>> effectively unusable for Beam but I'd hope that Github might pay attention
>>> and fix things if we submit feedback, with us being one of the most active
>>> Apache projects - did anyone do this yet / planning to?
>>>
>>> On Fri, Jul 13, 2018 at 10:23 AM Udi Meiri  wrote:
>>>
 While I like the idea of having a CODEOWNERS file, the Github
 implementation is lacking:
 1. Reviewers are automatically assigned at each push.
 2. Reviewer assignment can be excessive (e.g. 5 reviewers in Eugene's
 PR 5940).
 3. Non-committers aren't assigned as reviewers.
 4. Non-committers can't change the list of reviewers.

 I propose renaming the file to disable the auto-reviewer assignment
 feature.
 In its place I'll add a script that suggests reviewers.

 On Fri, Jul 13, 2018 at 9:09 AM Udi Meiri  wrote:

> Hi Etienne,
>
> Yes you could be as precise as you want. The paths I listed are just
> suggestions. :)
>
>
> On Fri, Jul 13, 2018 at 1:12 AM Jean-Baptiste Onofré 
> wrote:
>
>> Hi,
>>
>> I think it's already do-able just providing the expected path.
>>
>> It's a good idea especially for the core.
>>
>> Regards
>> JB
>>
>> On 13/07/2018 09:51, Etienne Chauchot wrote:
>> > Hi Udi,
>> >
>> > I also have a question, related to what Eugene asked: I see that the
>> > code paths are the ones of the modules. Can we be more precise than
>> > that to assign reviewers? As an example, I added myself to runner/core
>> > because I wanted to take a look at the PRs related to
>> > runner/core/metrics but I'm getting assigned to all runner-core PRs.
>> > Can we specify paths like
>> > runners/core-java/src/main/java/org/apache/beam/runners/core/metrics ?
>> > I know it is a bit too precise so a bit risky, but in that particular
>> > case, I doubt that the path will change.
>> >
>> > Etienne
>> >
>> > On Thursday, July 12, 2018 at 16:49 -0700, Eugene Kirpichov wrote:
>> >> Hi Udi,
>> >>
>> >> I see that the PR was merged - thanks! However it seems to have some
>> >> unintended effects.
>> >>
>> >> On my PR https://github.com/apache/beam/pull/5940 , I assigned a
>> >> reviewer manually, but the moment I pushed a new commit, it
>> >> auto-assigned a lot of other people to it, and I had to remove them.
Re: [Design Proposal] Improving Beam code review

2018-06-26 Thread Huygaa Batsaikhan
Reuven, that's great. In this thread, we can continue discussing the usage
of review tools, dashboards, and metrics.

On Tue, Jun 26, 2018 at 5:27 PM Reuven Lax  wrote:

> So I suggested a while ago that we create a code-review guidelines doc,
> and in fact I was coincidentally just now drafting up a proposal doc. I'll
> share my proposal doc with the dev list soon.
>
> On Tue, Jun 26, 2018 at 5:18 PM Huygaa Batsaikhan 
> wrote:
>
>> Hi, I've been looking into ways to improve Beam's code review process
>> based on previous discussions on dev list and summits, and I would like to
>> propose improvement ideas. Please take a look at:
>> https://s.apache.org/beam-code-review.
>>
>> Main proposals suggested in the doc are:
>>
>>1. Create a code review guideline document.
>>2. Build/setup code review tools and dashboards for Beam.
>>3. Collect metrics to monitor Beam's code review health.
>>
>> Feel free to add comments in the doc. I am looking for all sorts of
>> suggestions including existing code review guidelines, potential code
>> review tools etc.
>>
>> Thanks so much,
>> Huygaa
>>
>


[Design Proposal] Improving Beam code review

2018-06-26 Thread Huygaa Batsaikhan
Hi, I've been looking into ways to improve Beam's code review process based
on previous discussions on dev list and summits, and I would like to
propose improvement ideas. Please take a look at:
https://s.apache.org/beam-code-review.

Main proposals suggested in the doc are:

   1. Create a code review guideline document.
   2. Build/setup code review tools and dashboards for Beam.
   3. Collect metrics to monitor Beam's code review health.

Feel free to add comments in the doc. I am looking for all sorts of
suggestions including existing code review guidelines, potential code
review tools etc.

Thanks so much,
Huygaa


Re: Building and visualizing the Beam SQL graph

2018-06-11 Thread Huygaa Batsaikhan
I was also wondering the same thing. I don't think there is any
visualization tool for Beam. :(

On Mon, Jun 11, 2018 at 11:39 AM Andrew Pilloud  wrote:

> We are currently converting the Calcite Rel tree to Beam by recursively
> building a tree of nested PTransforms. This results in a weird nested graph
> in the dataflow UI where each node contains its inputs nested inside of it.
> I'm going to change the internal data structure for converting the tree
> from a PTransform to a PCollection, which will result in a more accurate
> representation of the tree structure being built and should simplify the
> code as well. This will not change the public interface to SQL, which will
> remain a PTransform. Any thoughts or objections?
>
> I was also wondering if there are tools for visualizing the Beam graph
> aside from the dataflow runner UI. What other tools exist?
>
> Andrew
>


Re: [VOTE] Code Review Response-time SLO

2018-06-05 Thread Huygaa Batsaikhan
Thanks Ken for pointing out that vote is a two step process. I will write
an extended written doc about pros and cons of the SLO and any related
topics. In the meantime, feel free to use this thread to express any
suggestions and concerns.

Thanks, Huygaa


Re: [VOTE] Code Review Response-time SLO

2018-06-04 Thread Huygaa Batsaikhan
Proposal 1: +1
Proposal 2: +1
Additional Comments: This is an example vote

On Mon, Jun 4, 2018 at 3:15 PM Huygaa Batsaikhan  wrote:

> A few months ago, Reuven sent out an email
> <https://lists.apache.org/thread.html/6c213d28c8e8c1a23614fb4d1837744bd044b6a68f3c47975333e71b@%3Cdev.beam.apache.org%3E>
> about improvements to Beam's code review process. Because the email covered
> multiple issues, we did not really dig deep into each of them. One of the
> suggestions was to agree on a code review response turnaround time (SLO
> <https://en.wikipedia.org/wiki/Service_level_objective>). Here is the
> direct quote:
>
> It would be great if we could agree on a response-time SLA for Beam code
> reviews. The response might be “I am unable to do the review until next
> week,” however even that is better than getting no response.
>
>
> All the comments on the original thread supported having an agreed upon
> SLO. Therefore, I would like to discuss possible response-time SLO and
> finalize it within this thread. For the purpose of this discussion, let's
> put aside related topics such as the need of tooling support like PR
> dashboard or reviewer availability for future discussions.
>
> *My proposals*
>
> *Proposal 1*
> I propose a *Default* review response time of *3 business days*.
> This aligns with how often we expect most developers to check the
> dev list. My reasoning is, if one is checking the dev list, they could also
> check their PR review queue.
>
> *Proposal 2*
> I propose an *Opt-in* review response time of *24 hours*.
> Contributors are happy when reviewers respond swiftly to their PRs.
> Especially when we are making multiple small changes to Beam, waiting for
> even a few days is frustrating. I understand that not all reviewers can
> review PRs daily. However, if some of us can incorporate half an hour of
> Beam review into our schedule, it could improve contributors' experience
> drastically. Therefore, I suggest an opt-in response time of 24
> hours. We can discuss how we can communicate this SLO to contributors and
> reviewers in a separate thread.
>
> Please vote on these 2 proposals and propose any other solutions using
> this template:
>
> Template:
> Proposal 1: <+-1> 
> Proposal 2: <+-1> 
> Additional Comments: 
>
> Example answer:
> Proposal 1: +1 Great idea
> Proposal 2: +1
> Additional Comments: I have this idea foobar 
>
> Thank you,
> Huygaa
>
>


[VOTE] Code Review Response-time SLO

2018-06-04 Thread Huygaa Batsaikhan
A few months ago, Reuven sent out an email about improvements to Beam's code
review process. Because the email covered multiple issues, we did not really
dig deep into each of them. One of the suggestions was to agree on a code
review response turnaround time (SLO). Here is the direct quote:

It would be great if we could agree on a response-time SLA for Beam code
reviews. The response might be “I am unable to do the review until next
week,” however even that is better than getting no response.


All the comments on the original thread supported having an agreed upon
SLO. Therefore, I would like to discuss possible response-time SLO and
finalize it within this thread. For the purpose of this discussion, let's
put aside related topics such as the need of tooling support like PR
dashboard or reviewer availability for future discussions.

*My proposals*

*Proposal 1*
I propose a *Default* review response time of *3 business days*.
This aligns with how often we expect most developers to check the
dev list. My reasoning is, if one is checking the dev list, they could also
check their PR review queue.

*Proposal 2*
I propose an *Opt-in* review response time of *24 hours*.
Contributors are happy when reviewers respond swiftly to their PRs.
Especially when we are making multiple small changes to Beam, waiting for
even a few days is frustrating. I understand that not all reviewers can
review PRs daily. However, if some of us can incorporate half an hour of
Beam review into our schedule, it could improve contributors' experience
drastically. Therefore, I suggest an opt-in response time of 24
hours. We can discuss how we can communicate this SLO to contributors and
reviewers in a separate thread.

Please vote on these 2 proposals and propose any other solutions using
this template:

Template:
Proposal 1: <+-1> 
Proposal 2: <+-1> 
Additional Comments: 

Example answer:
Proposal 1: +1 Great idea
Proposal 2: +1
Additional Comments: I have this idea foobar 

Thank you,
Huygaa


Re: [VOTE] Use probot/stale to automatically manage stale pull requests

2018-06-01 Thread Huygaa Batsaikhan
+1

On Fri, Jun 1, 2018 at 1:17 PM Henning Rohde  wrote:

> +1
>
> On Fri, Jun 1, 2018 at 10:16 AM Chamikara Jayalath 
> wrote:
>
>> +1 (non-binding).
>>
>> Thanks,
>> Cham
>>
>> On Fri, Jun 1, 2018 at 10:05 AM Kenneth Knowles  wrote:
>>
>>> +1
>>>
>>> On Fri, Jun 1, 2018 at 9:54 AM Scott Wegner  wrote:
>>>
 +1 (non-binding)

 On Fri, Jun 1, 2018 at 9:39 AM Ahmet Altay  wrote:

> +1
>
> On Fri, Jun 1, 2018, 9:32 AM Jason Kuster 
> wrote:
>
>> +1 (non-binding): automating policy ensures it is applied fairly and
>> evenly and lessens the load on project maintainers; hearty agreement.
>>
>> On Fri, Jun 1, 2018 at 9:25 AM Alan Myrvold 
>> wrote:
>>
>>> +1 (non-binding) I updated the pull request to be 60 days (instead
>>> of 90) to match the contribute policy.
>>>
>>> On Fri, Jun 1, 2018 at 9:21 AM Kenneth Knowles 
>>> wrote:
>>>
 Hi all,

 Following the discussion, please vote on the move to activate
 probot/stale [3] to notify authors of stale PRs per current policy
 and then close them after a 7 day grace period.

 For more details, see:

  - our stale PR policy [1]
  - the discussion thread [2]
  - Probot stale [3]
  - BEAM ticket summarizing discussion [4]
  - INFRA ticket to activate probot/stale [5]
  - Example PR that would activate it [6]

 Please vote:
 [ ] +1, Approve that we activate probot/stale
 [ ] -1, Do not approve (please provide specific comments)

 Kenn

 [1] https://beam.apache.org/contribute/#stale-pull-requests
 [2]
 https://lists.apache.org/thread.html/bda552ea7073ca165aaf47034610afafe22d589e386525023d33609e@%3Cdev.beam.apache.org%3E
 [3] https://github.com/probot/stale
 [4] https://issues.apache.org/jira/browse/BEAM-4423
 [5] https://issues.apache.org/jira/browse/INFRA-16589
 [6] https://github.com/apache/beam/pull/5532

>>>
>>
>> --
>> ---
>> Jason Kuster
>> Apache Beam / Google Cloud Dataflow
>>
>> See something? Say something. go/jasonkuster-feedback
>> 
>>
>
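
The stale PR policy voted on in this thread comes down to two thresholds: the
60 days of inactivity mentioned in the replies and the 7 day grace period from
the proposal. The sketch below is a simplified model of that rule and not how
probot/stale is implemented; the function name and the returned labels are
made up for illustration, and the real bot also tracks a stale label and
resets whenever there is new activity.

```python
from datetime import date

DAYS_UNTIL_STALE = 60  # inactivity threshold mentioned in the thread
DAYS_UNTIL_CLOSE = 7   # grace period after the stale notice


def stale_action(last_activity: date, today: date) -> str:
    """Classify a pull request by how long it has been idle (simplified model)."""
    idle_days = (today - last_activity).days
    if idle_days >= DAYS_UNTIL_STALE + DAYS_UNTIL_CLOSE:
        return "close"
    if idle_days >= DAYS_UNTIL_STALE:
        return "notify"
    return "keep"
```

For example, a PR last touched on 2018-04-01 has been idle for 61 days on
2018-06-01, so it would get a stale notice but would not be closed yet.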


Re: [VOTE] Code Review Process

2018-06-01 Thread Huygaa Batsaikhan
+1. This will allow non-committers to be actively involved in code reviews
and reduce committer load.

On Fri, Jun 1, 2018 at 11:28 AM Charles Chen  wrote:

> +1
>
> On Fri, Jun 1, 2018 at 11:20 AM Valentyn Tymofieiev 
> wrote:
>
>> +1
>>
>> On Fri, Jun 1, 2018 at 10:40 AM, Ahmet Altay  wrote:
>>
>>> +1
>>>
>>> On Fri, Jun 1, 2018 at 10:37 AM, Kenneth Knowles  wrote:
>>>
 +1

 On Fri, Jun 1, 2018 at 10:25 AM Thomas Groh  wrote:

> As we seem to largely have consensus in "Reducing Committer Load for
> Code Reviews"[1], this is a vote to change the Beam policy on Code Reviews
> to require that
>
> (1) At least one committer is involved with the code review, as either
> a reviewer or as the author
> (2) A contributor has approved the change
>
> prior to merging any change.
>
> This changes our policy from its current requirement that at least one
> committer *who is not the author* has approved the change prior to 
> merging.
> We believe that changing this process will improve code review throughput,
> reduce committer load, and engage more of the community in the code review
> process.
>
> Please vote:
> [ ] +1: Accept the above proposal to change the Beam code review/merge
> policy
> [ ] -1: Leave the Code Review policy unchanged
>
> Thanks,
>
> Thomas
>
> [1]
> https://lists.apache.org/thread.html/7c1fde3884fbefacc252b6d4b434f9a9c2cf024f381654aa3e47df18@%3Cdev.beam.apache.org%3E
>

>>>
>>
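
To make the difference between the previous and proposed rules concrete, here is a small Java sketch of both as predicates. The class and method names are hypothetical, and the sketch assumes that an approval only counts when it comes from someone other than the author, which the proposal implies but does not state.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

/** Hypothetical model of the two merge rules in this vote; not tooling used by Beam. */
final class ReviewPolicySketch {

    /** Previous rule: a committer who is not the author has approved. */
    static boolean previousPolicy(String author,
                                  Collection<String> approvers,
                                  Collection<String> committers) {
        return approvers.stream()
                .anyMatch(a -> committers.contains(a) && !a.equals(author));
    }

    /**
     * Proposed rule: (1) a committer is involved as author or reviewer, and
     * (2) a contributor has approved. Assumption: the approver is not the author.
     */
    static boolean proposedPolicy(String author,
                                  Collection<String> reviewers,
                                  Collection<String> approvers,
                                  Collection<String> committers) {
        boolean committerInvolved = committers.contains(author)
                || reviewers.stream().anyMatch(committers::contains);
        boolean approvedByOther = approvers.stream().anyMatch(a -> !a.equals(author));
        return committerInvolved && approvedByOther;
    }

    public static void main(String[] args) {
        Collection<String> committers = Collections.singletonList("alice");
        Collection<String> reviewers = Arrays.asList("bob");
        Collection<String> approvers = Arrays.asList("bob");

        // Committer "alice" authors a change that non-committer "bob" approves.
        System.out.println(previousPolicy("alice", approvers, committers));            // false
        System.out.println(proposedPolicy("alice", reviewers, approvers, committers)); // true
    }
}
```

The case in `main` is the one the proposal targets: a committer's own change no longer has to wait for a second committer, as long as another contributor has approved it.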


Re: [ANNOUNCEMENT] New committers, May 2018 edition!

2018-06-01 Thread Huygaa Batsaikhan
Congrats!

On Fri, Jun 1, 2018 at 10:26 AM Thomas Groh  wrote:

> Congrats, you three!
>
> On Thu, May 31, 2018 at 7:09 PM Davor Bonaci  wrote:
>
>> Please join me and the rest of Beam PMC in welcoming the following
>> contributors as our newest committers. They have significantly contributed
>> to the project in different ways, and we look forward to many more
>> contributions in the future.
>>
>> * Griselda Cuevas
>> * Pablo Estrada
>> * Jason Kuster
>>
>> (Apologies for a delayed announcement, and the lack of the usual
>> paragraph summarizing individual contributions.)
>>
>> Congratulations to all three! Welcome!
>>
>


Re: Java Direct Runner technical documentation is coming soon!

2018-05-23 Thread Huygaa Batsaikhan
On Wed, May 23, 2018 at 11:20 AM Huygaa Batsaikhan <bat...@google.com>
wrote:

> Hi devs,
>
> Robin Qiu and I, both new Beam contributors, have been working on adding
> new features in Java Direct Runner. However, our experience was not that
> smooth because there were no technical documents describing the overall
> design of the direct runner.
>
> As the Direct Runner is supposed to be the easiest runner to develop with,
> we find it extremely useful to document the technical details of it so that
> it can be the first step for any developer who wants to add new features
> to Java SDK, or simply understand the basics of Beam. Also, the
> documentation will make maintenance and debugging much easier.
>
> We have just started documenting the overall architecture of the Java
> Direct Runner with the help of Thomas Groh, and look forward to publishing
> it once we have a draft ready. In the meantime, let us know if you have any
> suggestions on what you would like to see in the documentation.
>
> Thanks,
> Huygaa
>


Java Direct Runner technical documentation is coming soon!

2018-05-23 Thread Huygaa Batsaikhan
Hi devs,

Robin Qiu and I, both new Beam contributors, have been working on adding new
features in Java Direct Runner. However, our experience was not that smooth
because there were no technical documents describing the overall design of
the direct runner.

As the Direct Runner is supposed to be the easiest runner to develop with,
we find it extremely useful to document the technical details of it so that
it can be the first step for any developer who wants to add new features
to Java SDK, or simply understand the basics of Beam. Also, the
documentation will make maintenance and debugging much easier.

We have just started documenting the overall architecture of the Java
Direct Runner with the help of Thomas Groh, and look forward to publishing
it once we have a draft ready. In the meantime, let us know if you have any
suggestions on what you would like to see in the documentation.

Thanks,
Huygaa


Re: The full list of proposals / prototype documents

2018-05-23 Thread Huygaa Batsaikhan
+1. That is great, Alexey. Robin and I are working on documenting some
missing pieces of Java SDK. We will let you know when we create polished
documents.

On Wed, May 23, 2018 at 9:28 AM Ismaël Mejía  wrote:

> +1 and thanks for volunteering for this Alexey.
> We really need to make this more accessible.
> On Wed, May 23, 2018 at 6:00 PM Alexey Romanenko
> wrote:
>
> > Joseph, Eugene - thank you very much for the links!
>
> > All, regarding one common entry point for all design documents. Could we
> just have a dedicated page on Beam web site with a list of links to every
> proposed document? Every entry (optionally) might contain, in addition,
> a short abstract and a list of author(s). In this case, it would be easily
> searchable and available for those who are interested in this.
>
> > At the same time, using a Google doc for writing/discussing the documents
> seems more than reasonable since it’s quite native and easy to use. I only
> propose to have a common entry point to all of them.
>
> > If this idea looks feasible, I’d volunteer to collect the links to
> already created documents, create such a page and update this list in the
> future.
>
> > WBR,
> > Alexey
>
> > On 22 May 2018, at 21:34, Eugene Kirpichov  wrote:
>
> > Making it easier to manage indeed would be good. Could someone from PMC
> please add the following documents of mine to it?
>
> > SDF related documents:
> > http://s.apache.org/splittable-do-fn
> > http://s.apache.org/sdf-via-source
> > http://s.apache.org/textio-sdf
> > http://s.apache.org/beam-watch-transform
> > http://s.apache.org/beam-breaking-fusion
>
> > Non SDF related:
> > http://s.apache.org/context-fn
> > http://s.apache.org/fileio-write
>
> > A suggestion: maybe we can establish a convention to send design document
> proposals to dev+desi...@beam.apache.org? Does the Apache mailing list
> management software support this kind of stuff? Then they'd be quite easy
> to find and filter.
>
> > On Tue, May 22, 2018 at 10:57 AM Kenneth Knowles  wrote:
>
> >> It is owned by the Beam PMC collectively. Any PMC member can add things
> to it. Ideas for making it easy to manage are welcome.
>
> >> Probably easier to have a markdown file somewhere with a list of docs so
> we can issue and review PRs. Not sure the web site is the right place for
> it - we have a history of porting docs to markdown but really that is high
> overhead and users/community probably don't gain from it so much. Some have
> suggested a wiki.
>
> >> Kenn
>
> >> On Tue, May 22, 2018 at 10:22 AM Scott Wegner 
> wrote:
>
> >>> Thanks for the links. Any details on that Google drive folder? Who
> maintains it? Is it possible for any contributor to add their design doc?
>
> >>> On Mon, May 21, 2018 at 8:15 AM Joseph PENG 
> wrote:
>
>  Alexey,
>
>  I do not know where you can find all design docs, but I know a blog
> that has collected some of the major design docs. Hope it helps.
>
>  https://wtanaka.com/beam/design-doc
>
>  https://drive.google.com/drive/folders/0B-IhJZh9Ab52OFBVZHpsNjc4eXc
>
>  On Mon, May 21, 2018 at 9:28 AM Alexey Romanenko <
> aromanenko@gmail.com> wrote:
>
> > Hi all,
>
> > Is it possible to obtain somewhere a list of all proposals /
> prototype documents that have been published as technical / design
> documents for new features? I have links to only some of them (found in
> mailing list discussions by chance) but I’m not aware of others.
>
> > If yes, could someone share it or point me to where it is located, in
> case I missed it?
>
> > If not, don’t you think it would make sense to have such an index of
> these documents? I believe it can be useful for Beam contributors since
> these proposals contain information which is absent or not so detailed in
> the Beam web site documentation.
>
> > WBR,
> > Alexey
>


Re: [VOTE] Go SDK

2018-05-22 Thread Huygaa Batsaikhan
+1 (non-binding). Great news!

On Tue, May 22, 2018 at 11:49 AM Chamikara Jayalath 
wrote:

> +1 (non-binding). Great to know that our third SDK will be
> released/supported officially.
>
> On Tue, May 22, 2018 at 11:38 AM Eugene Kirpichov 
> wrote:
>
>> +1!
>>
>> It is particularly exciting to me that the Go support is
>> "portability-first" and does everything in the proper "portability way"
>> from the start, free of legacy non-portable runner support code.
>>
>> On Tue, May 22, 2018 at 11:32 AM Scott Wegner  wrote:
>>
>>> +1 (non-binding)
>>>
>>> Having a third language will really force us to design Beam constructs
>>> in a language-agnostic way, and achieve the goals of portability. Thanks to
>>> all that have helped reach this milestone.
>>>
>>> On Tue, May 22, 2018 at 10:19 AM Ahmet Altay  wrote:
>>>
 +1 (binding)

 Congratulations to the team!

 On Tue, May 22, 2018 at 10:13 AM, Alan Myrvold 
 wrote:

> +1 (non-binding)
> Nice work!
>
> On Tue, May 22, 2018 at 9:18 AM Pablo Estrada 
> wrote:
>
>> +1 (binding)
>> Very excited to see this!
>>
>> On Tue, May 22, 2018 at 9:09 AM Thomas Weise  wrote:
>>
>>> +1 and congrats!
>>>
>>>
>>> On Tue, May 22, 2018 at 8:48 AM, Rafael Fernandez <
>>> rfern...@google.com> wrote:
>>>
 +1 !

 On Tue, May 22, 2018 at 7:54 AM Lukasz Cwik 
 wrote:

> +1 (binding)
>
> On Tue, May 22, 2018 at 6:16 AM Robert Burke 
> wrote:
>
>> +1 (non-binding)
>>
>> I'm looking forward to helping gophers solve their big data
>> problems in their language of choice, and runner of choice!
>>
>> Next stop, a non-java portability runner?
>>
>> On Tue, May 22, 2018, 6:08 AM Kenneth Knowles 
>> wrote:
>>
>>> +1 (binding)
>>>
>>> This is great. Feels like a phase change in the life of Apache
>>> Beam, having three languages, with multiple portable runners on the 
>>> horizon.
>>>
>>> Kenn
>>>
>>> On Tue, May 22, 2018 at 2:50 AM Ismaël Mejía 
>>> wrote:
>>>
 +1 (binding)

 The Go SDK brings new language support for a community not well supported
 in the Big Data world, the Go developers, so this is great. Also, the fact
 that this is the first SDK integrated with the portability work makes it an
 interesting project to learn lessons from for future languages.

 Now is the time to start building a community around the Go SDK. This is
 the most important task now, and the only way to do it is to have the SDK
 as an official part of Beam, so +1.

 Congrats to Henning and all the other contributors for this important
 milestone.
 On Tue, May 22, 2018 at 10:21 AM Holden Karau <
 hol...@pigscanfly.ca> wrote:

 > +1 (non-binding), I've had a chance to work with the SDK and it's pretty
 neat to see Beam add support for a language before most of the big data
 ecosystem.

 > On Mon, May 21, 2018 at 10:29 PM, Jean-Baptiste Onofré <
 j...@nanthrax.net>
 wrote:

 >> Hi Henning,

 >> SGA has been filed for the entire project during the
 incubation period.

 >> Here, we have to check if SGA/IP donation is clean for the
 Go SDK.

 >> We don't have a lot to do; we just check that we are clean on
 this front.

 >> Regards
 >> JB

 >> On 22/05/2018 06:42, Henning Rohde wrote:

 >>> Thanks everyone!

 >>> Davor -- regarding your two comments:
 >>> * Robert mentioned that "SGA should have probably
 already been
 filed" in the previous thread. I got the impression that
 nothing further
 was needed. I'll follow up.
 >>> * The standard Go tooling basically always pulls
 directly from
 github, so there is no real urgency here.

 >>> Thanks,
 >>>Henning


 >>> On Mon, May 21, 2018 at 9:30 PM Jean-Baptiste Onofré <
 j...@nanthrax.net
 > 

Re: I'm back and ready to help grow our community!

2018-05-17 Thread Huygaa Batsaikhan
Welcome back, Gris! Congratulations!

On Thu, May 17, 2018 at 4:24 PM Robert Bradshaw  wrote:

> Congratulations, Gris! And welcome back!
> On Thu, May 17, 2018 at 3:30 PM Robin Qiu  wrote:
>
> > Congratulations! Welcome back!
>
> > On Thu, May 17, 2018 at 3:23 PM Reuven Lax  wrote:
>
> >> Congratulations! Good to see you back!
>
> >> Reuven
>
> >> On Thu, May 17, 2018 at 2:24 PM Griselda Cuevas 
> wrote:
>
> >>> Hi Everyone,
>
>
> > >>> I was absent from the mailing list, Slack channel and our Beam
> community for the past six weeks because I took a leave to
> focus on finishing my Master's degree, which I finally did on May 15th.
>
>
> > >>> I graduated with a Master of Engineering in Operations Research with a
> concentration in Data Science from UC Berkeley. I'm glad to be part of this
> community and I'd like to share this accomplishment with you so I'm adding
> two pictures of that day :)
>
>
> >>> Given that I've seen so many new folks around, I'd like to use this
> opportunity to re-introduce myself. I'm Gris Cuevas and I work at Google.
> Now that I'm back, I'll continue to work on supporting our community in two
> main streams: Contribution Experience & Events, Meetups, and Conferences.
>
>
> >>> It's good to be back and I look forward to collaborating with you.
>
>
> >>> Cheers,
>
> >>> Gris
>


Documenting Github PR jenkins trigger phrases

2018-05-10 Thread Huygaa Batsaikhan
Hi devs,

We can run various Jenkins commands (precommit, postcommit, performance
tests) directly from the GitHub Pull Request UI by commenting phrases such as
"retest this please". Unfortunately, this tool is not documented. I am
adding brief documentation to https://beam.apache.org/contribute/testing/
and I need some help.

   1. What are the most common phrases used?
   2. Can anyone run these commands? Are there any permission issues?
   3. Does it make sense to categorize the commands as Performance tests,
   Precommit, Postcommit, and Release Validation?

Let me know what you think,

Thanks,
Huygaa


Re: Implementing @OnWindowExpiration in StatefulParDo [BEAM-1589]

2018-03-20 Thread Huygaa Batsaikhan
As echauchot@ mentioned, it will make this easier and less error-prone.


On Mon, Mar 19, 2018 at 11:59 PM Romain Manni-Bucau <rmannibu...@gmail.com>
wrote:

> Hi Huygaa,
>
> Can't it be predefined timers?
>
> Romain
>
> On Mar 20, 2018 at 00:52, "Huygaa Batsaikhan" <bat...@google.com> wrote:
>
> Hi everyone, I am working on BEAM-1589
> <https://issues.apache.org/jira/browse/BEAM-1589>. In short, currently,
> there is no default way of saving/flushing state before a window is garbage
> collected.
>
> My current plan is to provide a method annotation, @OnWindowExpiration,
> which allows a user-provided callback function to be executed before garbage
> collection. This annotation behaves very similarly to @OnTimer; therefore,
> the implementation will mostly be a copy of the OnTimer code. Let me know if
> you have any considerations and suggestions.
>
> Here is an example usage:
> ```
> @OnWindowExpiration
> public void myCleanupFunction(OnWindowExpirationContext c, State state) {
>   c.output(state.read());
> }
> ```
>
> Thanks, Huygaa
>
>
>


Implementing @OnWindowExpiration in StatefulParDo [BEAM-1589]

2018-03-19 Thread Huygaa Batsaikhan
Hi everyone, I am working on BEAM-1589. In short, currently,
there is no default way of saving/flushing state before a window is garbage
collected.

My current plan is to provide a method annotation, @OnWindowExpiration,
which allows a user-provided callback function to be executed before garbage
collection. This annotation behaves very similarly to @OnTimer; therefore,
the implementation will mostly be a copy of the OnTimer code. Let me know if
you have any considerations and suggestions.

Here is an example usage:
```
@OnWindowExpiration
public void myCleanupFunction(OnWindowExpirationContext c, State state) {
  c.output(state.read());
}
```

Thanks, Huygaa
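
To illustrate the idea of an annotation-driven callback that runs before state is garbage collected, here is a minimal, self-contained Java sketch using only the standard library. It is not Beam code: the annotation declared here is a hypothetical stand-in for the proposed one, and `BufferingFn` and `expireWindow` are invented for illustration. A real runner would discover and invoke the method through its own DoFn machinery.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Toy illustration of a window-expiration callback; not Beam's implementation. */
final class WindowExpirationSketch {

    /** Hypothetical stand-in for the proposed annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface OnWindowExpiration {}

    /** Example user function: emits whatever is still buffered in state. */
    static final class BufferingFn {
        @OnWindowExpiration
        public void flush(List<String> state, List<String> output) {
            output.addAll(state);
        }
    }

    /** Toy runner step: invoke annotated callbacks, then discard the window's state. */
    static List<String> expireWindow(Object fn, List<String> state) {
        List<String> output = new ArrayList<>();
        for (Method method : fn.getClass().getDeclaredMethods()) {
            if (method.isAnnotationPresent(OnWindowExpiration.class)) {
                try {
                    method.setAccessible(true);
                    method.invoke(fn, state, output);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("callback failed", e);
                }
            }
        }
        state.clear(); // state is garbage collected only after the callback has run
        return output;
    }

    public static void main(String[] args) {
        List<String> state = new ArrayList<>();
        state.add("a");
        state.add("b");
        System.out.println(expireWindow(new BufferingFn(), state)); // [a, b]
        System.out.println(state.isEmpty());                        // true
    }
}
```

The ordering is the point of the sketch: the callback sees the state before it is cleared, so buffered values can be output instead of being silently dropped.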