[ANNOUNCE] Beam 2.32.0 Released

2021-08-30 Thread Ankur Goenka
The Apache Beam team is pleased to announce the release of version 2.32.0.

Apache Beam is an open source unified programming model to define and
execute data processing pipelines, including ETL, batch and stream
(continuous) processing.
See https://beam.apache.org

You can download the release here:
https://beam.apache.org/get-started/downloads/

This release includes bug fixes, features, and improvements detailed
on the Beam blog: https://beam.apache.org/blog/beam-2.32.0/

Thank you to everyone who contributed to this release, and we hope you
enjoy using Beam 2.32.0.

-Ankur, on behalf of the Apache Beam community.


Re: [ANNOUNCE] New committer: Yichi Zhang

2021-04-28 Thread Ankur Goenka
Congrats Yichi!

On Thu, Apr 22, 2021 at 10:13 AM Yichi Zhang  wrote:

> Thanks everyone! It's my honor and I hope I can make more contributions in
> the future!
>
> On Thu, Apr 22, 2021 at 10:11 AM Yichi Zhang  wrote:
>
>> Thanks, Brian!
>>
>> On Thu, Apr 22, 2021 at 9:11 AM Brian Hulette 
>> wrote:
>>
>>> Congratulations Yichi!
>>>
>>> On Thu, Apr 22, 2021 at 8:05 AM Robert Burke  wrote:
>>>
 Congratulations Yichi!

 On Thu, Apr 22, 2021, 7:17 AM Alexey Romanenko <
 aromanenko@gmail.com> wrote:

> Congratulations, well deserved!
>
> On 22 Apr 2021, at 10:03, Jan Lukavský  wrote:
>
> Congrats Yichi!
> On 4/22/21 4:58 AM, Ahmet Altay wrote:
>
> Congratulations Yichi! 
>
> On Wed, Apr 21, 2021 at 6:48 PM Chamikara Jayalath <
> chamik...@google.com> wrote:
>
>> Congrats Yichi!
>>
>> On Wed, Apr 21, 2021 at 6:14 PM Heejong Lee 
>> wrote:
>>
>>> Congratulations :)
>>>
>>> On Wed, Apr 21, 2021 at 5:20 PM Tomo Suzuki 
>>> wrote:
>>>
 Congratulations!

 On Wed, Apr 21, 2021 at 7:48 PM Tyson Hamilton 
 wrote:

> Congrats!
>
> On Wed, Apr 21, 2021 at 4:37 PM Valentyn Tymofieiev <
> valen...@google.com> wrote:
>
>> Well deserved and congrats, Yichi!
>>
>> On Wed, Apr 21, 2021 at 4:23 PM Pablo Estrada 
>> wrote:
>>
>>> Hi all,
>>>
>>> Please join me and the rest of the Beam PMC in welcoming a new
>>> committer: Yichi Zhang
>>>
>>> Yichi has been working in Beam for a while. He has contributed
>>> to various areas, including Nexmark tests, test health, and Python's
>>> streaming capabilities. He has also answered questions on
>>> StackOverflow and helped with release validations, among many other
>>> contributions to the Beam community.
>>>
>>> Considering these contributions, the Beam PMC trusts Yichi with
>>> the responsibilities of a Beam committer.[1]
>>>
>>> Thanks Yichi!
>>> -P.
>>>
>>> [1]
>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>
>>

 --
 Regards,
 Tomo

>>>
>


Re: Java Tests are failing on Github checks

2021-03-08 Thread Ankur Goenka
The fix PR https://github.com/apache/beam/pull/14148 which upgrades the
error prone version to 2.3.4 with failing checks disabled seems to be
working fine and is merged.

Please sync to head and try again.

On Mon, Mar 8, 2021 at 8:26 AM Daniel Collins  wrote:

> Hello all,
>
> Can we make progress on either updating the version or rolling back
> https://github.com/apache/beam/commit/dfeda2ab1dbebd8446766fae0cafb314ec29920f?
> This is blocking any and all PRs from being submitted.
>
> Thanks,
>
> Daniel
>
> On Thu, Mar 4, 2021 at 6:52 PM Shehzaad Nakhoda 
> wrote:
>
>> Hi Ankur
>>
>> I've created a PR https://github.com/apache/beam/pull/14148 to see if
>> this upgrade might help.
>>
>> Would be great to know if you can repro locally on Ubuntu. I'm on a Mac.
>>
>> thanks
>> --shehzaad
>>
>>
>>
>> On Thu, Mar 4, 2021 at 3:32 PM Ankur Goenka  wrote:
>>
>>> The :sdks:java:core:compileJava is only failing for ubuntu on error
>>> prone.
>>> I am trying to repro it locally and can try updating the error prone
>>> version to see if it helps.
>>>
>>> On Thu, Mar 4, 2021 at 3:26 PM Shehzaad Nakhoda <
>>> shehz...@venturedive.com> wrote:
>>>
>>>> It appears this is an issue in errorprone:2.3.1 which has been fixed in
>>>> 2.3.2.
>>>>
>>>> https://github.com/google/error-prone/issues/1091
>>>>
>>>> We currently use 2.3.1 (set in BeamModulePlugin.groovy). Should we
>>>> upgrade to 2.3.2?
>>>>
>>>>
>>>> On Thu, Mar 4, 2021 at 10:03 AM Kyle Weaver 
>>>> wrote:
>>>>
>>>>> I don't think this is just a flake; it seems Github actions Java tests
>>>>> are permared right now. I filed a JIRA for it:
>>>>> https://issues.apache.org/jira/browse/BEAM-11921
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Mar 4, 2021 at 9:19 AM Robert Bradshaw 
>>>>> wrote:
>>>>>
>>>>>> I've noticed this sometimes for Python as well: Jenkins is happy
>>>>>> with the exact same tests that Github checks fail on.
>>>>>>
>>>>>> On Thu, Mar 4, 2021 at 8:40 AM Alexey Romanenko <
>>>>>> aromanenko@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Does anyone know why some Java tests that run as Github checks
>>>>>>> fail? For example, for this PR [1], checks [2] and [3] fail,
>>>>>>> while the others pass and there are no failures in the Jenkins jobs.
>>>>>>> I noticed that the same checks fail for some other recent PRs as
>>>>>>> well. Is it a coincidence?
>>>>>>>
>>>>>>> [1] https://github.com/apache/beam/pull/13914
>>>>>>> [2]
>>>>>>> https://github.com/apache/beam/pull/13914/checks?check_run_id=2029368988
>>>>>>> [3]
>>>>>>> https://github.com/apache/beam/pull/13914/checks?check_run_id=2029369164
>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>
>> --
>>
>> *Shehzaad Nakhoda*
>> CTO
>> USA/WhatsApp: +1 6502085107
>> PAK: +92 3082654179
>>
>> <http://venturedive.com/>
>>
>


Re: Java Tests are failing on Github checks

2021-03-04 Thread Ankur Goenka
The :sdks:java:core:compileJava is only failing for ubuntu on error prone.
I am trying to repro it locally and can try updating the error prone
version to see if it helps.

On Thu, Mar 4, 2021 at 3:26 PM Shehzaad Nakhoda 
wrote:

> It appears this is an issue in errorprone:2.3.1 which has been fixed in
> 2.3.2.
>
> https://github.com/google/error-prone/issues/1091
>
> We currently use 2.3.1 (set in BeamModulePlugin.groovy). Should we upgrade
> to 2.3.2?
>
>
> On Thu, Mar 4, 2021 at 10:03 AM Kyle Weaver  wrote:
>
>> I don't think this is just a flake; it seems Github actions Java tests
>> are permared right now. I filed a JIRA for it:
>> https://issues.apache.org/jira/browse/BEAM-11921
>>
>>
>>
>>
>> On Thu, Mar 4, 2021 at 9:19 AM Robert Bradshaw 
>> wrote:
>>
>>> I've noticed this sometimes for Python as well: Jenkins is happy
>>> with the exact same tests that Github checks fail on.
>>>
>>> On Thu, Mar 4, 2021 at 8:40 AM Alexey Romanenko <
>>> aromanenko@gmail.com> wrote:
>>>
 Hi,

 Does anyone know why some Java tests that run as Github checks fail?
 For example, for this PR [1], checks [2] and [3] fail, while the others
 pass and there are no failures in the Jenkins jobs.
 I noticed that the same checks fail for some other recent PRs as well.
 Is it a coincidence?

 [1] https://github.com/apache/beam/pull/13914
 [2]
 https://github.com/apache/beam/pull/13914/checks?check_run_id=2029368988
 [3]
 https://github.com/apache/beam/pull/13914/checks?check_run_id=2029369164
>>>
>>>
>
>
>


Re: [ANNOUNCE] New PMC Member: Chamikara Jayalath

2021-01-21 Thread Ankur Goenka
Congrats Cham!

On Thu, Jan 21, 2021 at 2:57 PM Ahmet Altay  wrote:

> Hi all,
>
> Please join me and the rest of Beam PMC in welcoming Chamikara Jayalath as
> our newest PMC member.
>
> Cham has been part of the Beam community from its early days and
> contributed to the project in significant ways, including contributing new
> features and improvements especially related to Beam IOs, advocating for
> users, and mentoring new community members.
>
> Congratulations Cham! And thanks for being a part of Beam!
>
> Ahmet
>


Re: [PROPOSAL] Preparing for Beam 2.28.0 release

2021-01-14 Thread Ankur Goenka
Thanks Cham!

On Thu, Jan 14, 2021 at 12:26 PM Yichi Zhang  wrote:

> Thank you, Cham!
>
> On Wed, Jan 13, 2021 at 12:58 PM Ahmet Altay  wrote:
>
>> Thank you Cham!
>>
>> On Wed, Jan 13, 2021 at 12:44 PM Rui Wang  wrote:
>>
>>> Thanks Cham for working on this!
>>>
>>>
>>> -Rui
>>>
>>> On Wed, Jan 13, 2021 at 11:32 AM Kyle Weaver 
>>> wrote:
>>>
 Thanks for stepping up Cham!

 Remember to mark critical JIRA issues as release blockers everybody!

 On Wed, Jan 13, 2021 at 11:25 AM Chamikara Jayalath <
 chamik...@google.com> wrote:

> Hi All,
>
> Beam 2.28.0 release is scheduled to be cut on January 27th according
> to the release calendar [1]
>
> I'd like to volunteer myself to be the release manager for this
> release. I plan on cutting the release branch on the scheduled date.
>
> Any comments or objections?
>
> Thanks,
> Cham
>
> [1]
> https://calendar.google.com/calendar/u/0/embed?src=0p73sl034k80oob7seouani...@group.calendar.google.com=America/Los_Angeles
>



Re: [ANNOUNCE] New PMC Member: Alexey Romanenko

2020-06-16 Thread Ankur Goenka
Congratulations Alexey!

On Tue, Jun 16, 2020 at 2:41 PM Thomas Weise  wrote:

> Congratulations!
>
>
> On Tue, Jun 16, 2020 at 1:27 PM Valentyn Tymofieiev 
> wrote:
>
>> Congratulations!
>>
>> On Tue, Jun 16, 2020 at 11:41 AM Ahmet Altay  wrote:
>>
>>> Congratulations!
>>>
>>> On Tue, Jun 16, 2020 at 10:05 AM Pablo Estrada 
>>> wrote:
>>>
 Yooohooo! Thanks for all your contributions and hard work Alexey!:)

 On Tue, Jun 16, 2020, 8:57 AM Ismaël Mejía  wrote:

> Please join me and the rest of Beam PMC in welcoming Alexey Romanenko
> as our newest PMC member.
>
> Alexey has significantly contributed to the project in different ways:
> new features and improvements in the Spark runner(s) as well as
> maintenance of multiple IO connectors including some of our most used
> ones (Kafka and Kinesis/AWS). Alexey is also quite active helping new
> contributors and our user community in the mailing lists / Slack and
> Stack Overflow.
>
> Congratulations Alexey!  And thanks for being a part of Beam!
>
> Ismaël
>



Re: [PROPOSAL] Preparing for Beam 2.23.0 release

2020-06-15 Thread Ankur Goenka
Thanks Valentyn!

On Mon, Jun 15, 2020 at 12:41 PM Kyle Weaver  wrote:

> Sounds good, thanks Valentyn!
>
> On Mon, Jun 15, 2020 at 12:31 PM Valentyn Tymofieiev 
> wrote:
>
>> Hi all,
>>
>> According to the Beam release calendar [1], the next (2.23.0) release
>> branch cut is scheduled for July 1.
>>
>> I would be happy to help with this release and volunteer myself to be
>> the next release manager.
>>
>> As usual, the plan is to cut the branch on that date, and cherrypick
>> release-blocking fixes afterwards if any.
>>
>> Any unresolved release blocking JIRA issues for 2.23.0 should have their
>> "Fix Version/s" marked as "2.23.0".
>>
>> Any comments or objections?
>>
>> [1]
>> https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com
>>
>>


Re: CVE Audit Plugin for Python

2020-05-29 Thread Ankur Goenka
+1 for adding it to Python.
We can explore more as to how we can surface the findings as a health
signal.
It will also be good to apply it to our old releases for users to be aware
of.



On Fri, May 29, 2020 at 11:20 AM Luke Cwik  wrote:

> Past work added an audit plugin for Java[1]. I reached out to PyUp and
> they have a free tool to use which can check the set of Python dependencies
> we have for CVE errors. The tool works by scanning a text file of
> dependencies and checking it against a CVE database. There is also support
> for integration with various web based systems which we don't use. There is
> a paid version which gives you the same features but the CVE database you
> get access to is updated more frequently (free = monthly?, paid = daily?).
>
> Has anyone been using the integration for Java added in [1] and has it
> been generally useful?
> Should we try adding PyUp to validate Beam Python's dependencies?
>
> 1:
> https://lists.apache.org/thread.html/a3550051d1b7ce4454c586d1f806cd799a1ccc8776f3857308a2fc09%40%3Cdev.beam.apache.org%3E
>


Re: How to submit PRs for dependant changes?

2020-04-28 Thread Ankur Goenka
I would prefer c.
I suppose you already have 4 separate changes in 4 separate local branches.
Each of a, b and c will require merging locally, so there shouldn't be much
of a difference in effort between these 3 options.
c would be the easiest to review and would preserve information in the most
meaningful way.

On Tue, Apr 28, 2020 at 10:48 AM Robert Bradshaw 
wrote:

> I prefer (c) as well, rebasing as things get merged. I would do (a) if
> they're really prerequisites for one another.
>
> On Tue, Apr 28, 2020 at 10:40 AM Udi Meiri  wrote:
>
>> (a) or (c) should work. (c) is preferred if you want faster reviews.
>>
>> For multiple JIRAs, I've seen both [BEAM-123,BEAM-456] and
>> [BEAM-123][BEAM-456] formats. One of them works but I'm not sure which. :D
>> You can always manually add a PR to a JIRA.
>>
>>
>>
>> On Sun, Apr 26, 2020 at 2:49 PM Reuven Lax  wrote:
>>
>>> For c), I don't think you need merge resolutions. You can submit each
>>> commit in a separate PR, and rebase your branch after each one.
>>>
>>> On Sun, Apr 26, 2020 at 10:25 AM Niel Markwick  wrote:
>>>

 Hey Beam devs...

 I have 4 changes to submit as PRs to fix 4 independent issues in the
 io.gcp.SpannerIO class.

 The PRs are notionally independent, but will cause merge conflicts if
 submitted separately, as the fix for each issue will change code related to
 the fix for some of the others.

 How do you prefer the PRs to be submitted?

 a) one single PR with 4 sequential commits within it
 b) one single PR with all changes squashed.
 c) 4 separate conflicting PRs which will have to be merged separately,
 and a merge conflict resolution after each one.

 a) is how it is in my repo.
 b) would be easy, but less clear what the changes were for.
 c) I guess would be clearest in the Beam changelog.

 If the answer is a) or b), how would I specify multiple JIRA tickets in
 the PR title?

 Thanks!

 --
 
 Niel Markwick
 Cloud Solutions Architect
 Google Belgium
 ni...@google.com
 +32 2 894 6771



>>>


Re: Default WindowFn for Unbounded source

2020-03-31 Thread Ankur Goenka
Hi Amit,

As you don't have any GroupByKey or trigger in your pipeline, you don't
need to set allowed lateness.
For an unbounded source, the global window will never fire a trigger or emit
GroupByKey output on its own.
In the code you linked, a trigger is set, and it is used together with
allowedLateness.

Thanks,
Ankur
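
To illustrate why allowed lateness has no effect on the global window, here
is a rough pure-Python sketch. It is a simplified model and not the Beam API:
the helper name window_expired and the constant GLOBAL_WINDOW_END are
hypothetical, and the rule "a window's data is dropped once the watermark
passes the window end plus the allowed lateness" is an assumed simplification.

```python
import math

# Stand-in for the end of the global window (assumption: modeled as infinity).
GLOBAL_WINDOW_END = math.inf


def window_expired(window_end, watermark, allowed_lateness=0):
    """Simplified rule: data for a window is dropped once the watermark
    has passed the window end plus the allowed lateness."""
    return watermark > window_end + allowed_lateness


# A fixed window ending at t=60, with the watermark already at t=90.
fixed_no_lateness = window_expired(60, watermark=90)
fixed_with_lateness = window_expired(60, watermark=90, allowed_lateness=60)

# The global window never expires in this model, whatever the lateness.
global_no_lateness = window_expired(GLOBAL_WINDOW_END, watermark=90)
global_with_lateness = window_expired(
    GLOBAL_WINDOW_END, watermark=90, allowed_lateness=60)
```

In this model, allowed lateness changes the outcome for the fixed window but
not for the global window, which matches the point above that it only becomes
relevant once a trigger or a non-global window is involved.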

On Tue, Mar 31, 2020 at 11:20 AM amit kumar  wrote:

> Thanks Jan!
> I have a question based on this on Global Window and allowed lateness,
> with default trigger for the following
>  scenarios:
>
> Case 1-
> TextIO.Read.
>  |. Bounded source
>  |. Global Window
>  |.  -infinity watermark
> apply
> WithTimeStamps (Based on a timestamp attribute in file)
>|.   timestamped elements (watermark starts from -infinity and follows
> the timestamp from timestamp attribute)
>|.   Global Window
>|. (Will I never need to do allowedLateness in this case with default
> trigger? Will there be any benefit since the window is global and watermark
> will pass the end of window when everything is processed ?  )
>
>
> Case 2 -
> KinesisIO.read
> | .Unbounded Source
> |. Default Global Window
> |. watermark based on arrival time
>  apply
> WithTimeStamps (Based on a timestamp attribute from the stream)
>|.   timestamped elements  ( watermark follows the timestamp from
> timestamp attribute)
>|.   Global Window
>|. Watermark based on event timestamp.
>| Same question here will there be any benefit of using
> allowedLateness since window is global ?
>
> In the code example below allowedLateness is used for global window ?
>
> https://github.com/apache/beam/blob/828b897a2439437d483b1bd7f2a04871f077bde0/examples/java/src/main/java/org/apache/beam/examples/complete/game/LeaderBoard.java#L307
>
> Regards,
> Amit
>
> On Tue, Mar 31, 2020 at 2:34 AM Jan Lukavský  wrote:
>
>> Hi Amit,
>>
>> the window function applied by default is
>> WindowingStrategy.globalDefault(), [1] - global window with zero allowed
>> lateness.
>>
>> Cheers,
>>
>>   Jan
>>
>> [1]
>>
>> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/WindowingStrategy.java#L105
>>
>> On 3/31/20 10:22 AM, amit kumar wrote:
>> > Hi All,
>> >
>> > Is there a default WindowFn that gets applied to elements of an
>> > unbounded source.
>> >
>> > For example, if I have a Kinesis input source ,for which all elements
>> > are timestamped with ArrivalTime, what will be the default windowing
>> > applied to the output of read transform ?
>> >
>> > Is this runner dependent ?
>> >
>> > Regards,
>> > Amit
>>
>


Re: Tests not getting triggered

2020-03-26 Thread Ankur Goenka
Seems to be running now with some delay.

On Thu, Mar 26, 2020 at 12:33 PM Luke Cwik  wrote:

> I saw upwards of a 20 min delay before anything was triggered on a couple
> of PRs.
>
> On Thu, Mar 26, 2020 at 12:32 PM Ankur Goenka  wrote:
>
>> Hi,
>>
>> I think the tests for PRs are not getting triggered. Example:
>> https://github.com/apache/beam/pull/10870
>>
>> Can someone take a look?
>>
>> Thanks,
>> Ankur
>>
>


Tests not getting triggered

2020-03-26 Thread Ankur Goenka
Hi,

I think the tests for PRs are not getting triggered. Example:
https://github.com/apache/beam/pull/10870

Can someone take a look?

Thanks,
Ankur


Re: [ANNOUNCE] New committer: Chad Dombrova

2020-02-24 Thread Ankur Goenka
Congratulations Chad!

On Mon, Feb 24, 2020 at 3:34 PM Ahmet Altay  wrote:

> Congratulations!
>
> On Mon, Feb 24, 2020 at 3:25 PM Sam Bourne  wrote:
>
>> Nice one Chad. Your typing efforts are very welcomed.
>>
>> On Tue, Feb 25, 2020 at 10:16 AM Yichi Zhang  wrote:
>>
>>> Congratulations, Chad!
>>>
>>> On Mon, Feb 24, 2020 at 3:10 PM Robert Bradshaw 
>>> wrote:
>>>
 Well deserved, Chad. Congratulations!

 On Mon, Feb 24, 2020 at 2:43 PM Reza Rokni  wrote:
 >
 > Congratulations! :-)
 >
 > On Tue, Feb 25, 2020 at 6:41 AM Chad Dombrova 
 wrote:
 >>
 >> Thanks, folks!  I'm very excited to "retest this" :)
 >>
 >> Especially big thanks to Robert and Udi for all their hard work
 reviewing my PRs.
 >>
 >> -chad
 >>
 >>
 >> On Mon, Feb 24, 2020 at 1:44 PM Brian Hulette 
 wrote:
 >>>
 >>> Congratulations Chad! Thanks for all your contributions :)
 >>>
 >>> On Mon, Feb 24, 2020 at 1:43 PM Kyle Weaver 
 wrote:
 
  Well-deserved, thanks for your dedication to the project Chad. :)
 
  On Mon, Feb 24, 2020 at 1:34 PM Udi Meiri 
 wrote:
 >
 > Congrats and welcome, Chad!
 >
 > On Mon, Feb 24, 2020 at 1:21 PM Pablo Estrada 
 wrote:
 >>
 >> Hi everyone,
 >>
 >> Please join me and the rest of the Beam PMC in welcoming a new
 committer: Chad Dombrova
 >>
 >> Chad has contributed to the project in multiple ways, including
 improvements to the testing infrastructure, and adding type annotations
 throughout the Python SDK, as well as working closely with the community on
 these improvements.
 >>
 >> In consideration of his contributions, the Beam PMC trusts him
 with the responsibilities of a Beam Committer[1].
 >>
 >> Thanks Chad for your contributions!
 >>
 >> -Pablo, on behalf of the Apache Beam PMC.
 >>
 >> [1]
 https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer

>>>


Re: [RELEASE VOTE RESULT] Release 2.19.0, release candidate #1

2020-02-03 Thread Ankur Goenka
Thanks Boyuan!
A record has been set :)

On Mon, Feb 3, 2020 at 5:05 PM Udi Meiri  wrote:

> Thank you Boyuan!
>
> On Mon, Feb 3, 2020 at 3:40 PM Ahmet Altay  wrote:
>
>> On Mon, Feb 3, 2020 at 1:22 PM Thomas Weise  wrote:
>>
>>> Impressive, probably the fastest/smoothest Beam release so far.
>>>
>>
>> I agree! Thank you, Boyuan!
>>
>>
>>>
>>> On Mon, Feb 3, 2020 at 10:45 AM Boyuan Zhang  wrote:
>>>
 I'm happy to announce that we have unanimously approved this release.

 There are 5 approving votes, 4 of which are binding:
 * Ahmet Altay
 * Ismaël Mejía
 * Jean-Baptiste Onofré
 * Robert Bradshaw

 There are no disapproving votes.

 Thanks for everyone's help! I'm going to finalize the release and send
 out the official release announcement later.

>>>


Re: [ANNOUNCE] New committer: Hannah Jiang

2020-01-28 Thread Ankur Goenka
Congrats Hannah!

On Tue, Jan 28, 2020 at 7:30 PM Reza Rokni  wrote:

> Congratz!
>
> On Wed, 29 Jan 2020 at 09:52, Valentyn Tymofieiev 
> wrote:
>
>> Congratulations, Hannah!
>>
>> On Tue, Jan 28, 2020 at 5:46 PM Udi Meiri  wrote:
>>
>>> Welcome and congrats Hannah!
>>>
>>> On Tue, Jan 28, 2020 at 4:52 PM Robin Qiu  wrote:
>>>
 Congratulations, Hannah!

 On Tue, Jan 28, 2020 at 4:50 PM Alan Myrvold 
 wrote:

> Congrats, Hannah
>
> On Tue, Jan 28, 2020 at 4:46 PM Connell O'Callaghan <
> conne...@google.com> wrote:
>
>> Thank you for sharing Luke!!!
>>
>> Well done and congratulations Hannah!!
>>
>> On Tue, Jan 28, 2020 at 4:45 PM Heejong Lee 
>> wrote:
>>
>>> Congratulations! :)
>>>
>>> On Tue, Jan 28, 2020 at 4:43 PM Yichi Zhang 
>>> wrote:
>>>
 Congrats Hannah!

 On Tue, Jan 28, 2020 at 3:57 PM Yifan Zou 
 wrote:

> Congratulations Hannah!!
>
> On Tue, Jan 28, 2020 at 3:55 PM Boyuan Zhang 
> wrote:
>
>> Thanks for all your contributions! Congratulations~
>>
>> On Tue, Jan 28, 2020 at 3:44 PM Pablo Estrada 
>> wrote:
>>
>>> yoooho : D
>>>
>>> On Tue, Jan 28, 2020 at 3:21 PM Luke Cwik 
>>> wrote:
>>>
 Hi everyone,

 Please join me and the rest of the Beam PMC in welcoming a new
 committer: Hannah Jiang

 Hannah has contributed to Beam in many ways, including work on
 building and releasing the Apache Beam SDK containers.

 In consideration of their contributions, the Beam PMC trusts
 them with the responsibilities of a Beam committer[1].

 Thanks for your contributions Hannah!

 Luke, on behalf of the Apache Beam PMC.

 [1]
 https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer

>>>
>
>


Re: [ANNOUNCE] Beam 2.18.0 Released

2020-01-28 Thread Ankur Goenka
Thanks Udi!

On Tue, Jan 28, 2020 at 11:30 AM Yichi Zhang  wrote:

> Thanks Udi!
>
> On Tue, Jan 28, 2020 at 11:28 AM Hannah Jiang 
> wrote:
>
>> Thanks Udi!
>>
>>
>> On Tue, Jan 28, 2020 at 11:09 AM Pablo Estrada 
>> wrote:
>>
>>> Thanks Udi!
>>>
>>> On Tue, Jan 28, 2020 at 11:08 AM Rui Wang  wrote:
>>>
 Thank you Udi for taking care of Beam 2.18.0 release!



 -Rui

 On Tue, Jan 28, 2020 at 10:59 AM Udi Meiri  wrote:

> The Apache Beam team is pleased to announce the release of
> version 2.18.0.
>
> Apache Beam is an open source unified programming model to define and
> execute data processing pipelines, including ETL, batch and stream
> (continuous) processing. See https://beam.apache.org
>
> You can download the release here:
>
> https://beam.apache.org/get-started/downloads/
>
> This release includes bug fixes, features, and improvements detailed on
> the Beam blog:
> https://beam.apache.org/blog/2020/01/13/beam-2.18.0.html
>
> Thanks to everyone who contributed to this release, and we hope you
> enjoy
> using Beam 2.18.0.
> -- Udi Meiri, on behalf of The Apache Beam team
>



Re: Ordering of element timestamp change and window function

2020-01-21 Thread Ankur Goenka
On Thu, Jan 16, 2020 at 9:52 PM Kenneth Knowles  wrote:

>
>
> On Thu, Jan 16, 2020 at 11:38 AM Robert Bradshaw 
> wrote:
>
>> On Thu, Jan 16, 2020 at 11:00 AM Kenneth Knowles  wrote:
>> >
>> > IIRC in Java it is forbidden to output an element with a timestamp
>> outside its current window.
>>
>> I don't think this is checked anywhere. (Not sure how you would check
>> it, as there's not generic window containment function--I suppose you
>> could check if it's past the end of the window (and of course skew
>> limits how far you can go back). I suppose you could try re-windowing
>> and then fail if it didn't agree with what was already there.
>>
>
> I think you are right. This is governed by how a runner invoked utilities
> from runners-core (output ultimately reaches this point without validation:
> https://github.com/apache/beam/blob/master/runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java#L258
> )
>
>
>> > An exception is outputs from @FinishBundle, where the output timestamp
>> is required and the window is applied. TBH it seems more of an artifact of
>> a mismatch between the pre-windowing and post-windowing worlds.
>>
>> Elements are always in some window, even if just the global window.
>>
>
> I mean that the existence of a window-unaware @FinishBundle method is an
> artifact of the method existing prior to windowing as a concept. The idea
> that a user can use a DoFn's local variables to buffer stuff and then
> output in @FinishBundle predates the existence of windowing.
>
> > Most of the time, mixing processing across windows is simply wrong. But
>> there are fears that calling @FinishBundle once per window would be a
>> performance problem. On the other hand, don't most correct implementations
>> have to separate processing for each window anyhow?
>>
>> Processing needs to be done per window iff the result depends on the
>> window or if there are side effects.
>>
>> > Anyhow I think the Java behavior is better, so window assignment
>> happens exactly and only at window transforms.
>>
>> But then one ends up with timestamps that are unrelated to the windows,
>> right?
>>
>
> As far as the model goes, I think windows provide an upper bound but not a
> lower bound. If we take the approach that windows are a "secondary key with
> a max timestamp" then the timestamps should be related to the window in the
> sense that they are <= the window's max timestamp.
>
A window only matters when a trigger or timer fires, and the timestamps of
the elements in a window should fall within the window's time range when a
trigger is set. For consistency, I think an element's timestamp should
remain within the corresponding time range at every stage of the graph.
IIUC based on the discussion, users can easily violate this requirement in
pipeline code, which might give inconsistent behavior across runners.

I think we should stick to a consistent behavior across languages and
runners. We have multiple options here, like:
1. Don't promise any correlation between element timestamp and window.
The window will just behave like a secondary key for the element.
2. Make it explicit that the last window function can be applied out of
order at any time on the elements.
3. Don't let users change the timestamp without applying a windowing
function after the changed timestamp and before a trigger. Though this can
only be validated at runtime in Python.
4. Revalidate the window after changing the timestamp. Also provide
additional methods to explicitly change the timestamp and window in one shot.
5. etc
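
As a rough illustration of option 4, the sketch below revalidates the window
whenever the timestamp changes and also allows the timestamp and window to be
changed together. It is pure Python and not part of any Beam SDK: the names
WindowedElement and retimestamp are hypothetical, and the check assumes the
timestamp must lie inside a half-open [start, end) window.

```python
from typing import NamedTuple, Tuple

Window = Tuple[int, int]  # half-open [start, end) interval (assumption)


class WindowedElement(NamedTuple):
    value: object
    timestamp: int
    window: Window


def retimestamp(elem, new_timestamp, new_window=None):
    """Change an element's timestamp and revalidate its window.

    If new_window is given, timestamp and window are changed in one shot;
    otherwise the new timestamp must still fall in the existing window.
    """
    window = new_window if new_window is not None else elem.window
    start, end = window
    if not (start <= new_timestamp < end):
        raise ValueError(
            "timestamp %r is outside window [%r, %r)"
            % (new_timestamp, start, end))
    return elem._replace(timestamp=new_timestamp, window=window)


elem = WindowedElement(value='a', timestamp=1, window=(0, 5))
moved_within = retimestamp(elem, 4)                      # still in [0, 5)
moved_across = retimestamp(elem, 7, new_window=(5, 10))  # both changed
```

Calling retimestamp(elem, 7) without a new window raises ValueError, which is
the failure mode this option is meant to surface early.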


> Kenn
>
>
>
>> > Kenn
>> >
>> > On Wed, Jan 15, 2020 at 4:59 PM Ankur Goenka  wrote:
>> >>
>> >> The case where a plain vanilla value or a windowed value is emitted
>> seems as expected as the user intent is honored without any surprises.
>> >>
>> >> If I understand correctly in the case when timestamp is changed then
>> applying window function again can have unintended behavior in following
>> cases
>> >> * Custom windows: User code can be executed in unintended order.
>> >> * User emit a windowed value in a previous transform: Timestamping the
>> value in this case would overwrite the user assigned window in earlier step
>> even when the actual timestamp is the same. Semantically, emitting an
>> element or a timestamped value with the same timestamp should have the same
>> behaviour.
>> >>
>> >> What do you think?
>> >>
>> >>
>> >> On Wed, Jan 15, 2020 at 4:04 PM Robert Bradshaw

Re: Ordering of element timestamp change and window function

2020-01-15 Thread Ankur Goenka
The case where a plain vanilla value or a windowed value is emitted seems
as expected, as the user intent is honored without any surprises.

If I understand correctly, when the timestamp is changed, applying the
window function again can have unintended behavior in the following cases:
* Custom windows: User code can be executed in an unintended order.
* The user emits a windowed value in a previous transform: Timestamping the
value in this case would overwrite the window the user assigned in an earlier
step, even when the actual timestamp is the same. Semantically, emitting an
element or a timestamped value with the same timestamp should have the same
behaviour.

What do you think?


On Wed, Jan 15, 2020 at 4:04 PM Robert Bradshaw  wrote:

> If an element is emitted with a timestamp, the window assignment is
> re-applied at that time. At least that's how it is in Python. You can
> emit the full windowed value (accepted without checking...), a
> timestamped value (in which case the window will be computed), or a
> plain old element (in which case the window and timestamp will be
> computed (really, propagated)).
>
> On Wed, Jan 15, 2020 at 3:51 PM Ankur Goenka  wrote:
> >
> > Yup, This might result in unintended behavior as timestamp is changed
> after the window assignment as elements in windows do not have timestamp in
> the window time range.
> >
> > Shall we start validating at least one window assignment between
> timestamp assignment and GBK/triggers to avoid unintended behaviors
> mentioned above?
> >
> > On Wed, Jan 15, 2020 at 1:24 PM Luke Cwik  wrote:
> >>
> >> Window assignment happens at the point in the pipeline the WindowInto
> transform was applied. So in this case the window would have been assigned
> using the original timestamp.
> >>
> >> Grouping is by key and window.
> >>
> >> On Tue, Jan 14, 2020 at 7:30 PM Ankur Goenka  wrote:
> >>>
> >>> Hi,
> >>>
> >>> I am not sure about the effect of the order of element timestamp
> change and window association has on a group by key.
> >>> More specifically, what would be the behavior if we apply window ->
> change element timestamp -> Group By key.
> >>> I think we should always apply window function after changing the
> timestamp of elements. Though this is neither checked nor a recommended
> practice in Beam.
> >>>
> >>> Example pipeline would look like this:
> >>>
> >>>   import time
> >>>   import apache_beam as beam
> >>>   from apache_beam.transforms import window
> >>>
> >>>   def applyTimestamp(value):
> >>>     # 'key' was undefined in the original; a constant key is assumed.
> >>>     return window.TimestampedValue(('key', value), int(time.time()))
> >>>
> >>>   with beam.Pipeline() as p:
> >>>     (p
> >>>      | 'Create' >> beam.Create(range(0, 10))
> >>>      | 'Fixed Window' >> beam.WindowInto(window.FixedWindows(5))
> >>>      # Timestamp is changed after windowing and before GBK
> >>>      | 'Apply Timestamp' >> beam.Map(applyTimestamp)
> >>>      | 'Group By Key' >> beam.GroupByKey()
> >>>      | 'Print' >> beam.Map(print))
> >>> Thanks,
> >>> Ankur
>


Re: Ordering of element timestamp change and window function

2020-01-15 Thread Ankur Goenka
Yup, this might result in unintended behavior: because the timestamp is
changed after the window assignment, elements in a window may no longer have
timestamps within the window's time range.

Shall we start validating that there is at least one window assignment between
timestamp assignment and GBK/triggers to avoid the unintended behaviors
mentioned above?

On Wed, Jan 15, 2020 at 1:24 PM Luke Cwik  wrote:

> Window assignment happens at the point in the pipeline the WindowInto
> transform was applied. So in this case the window would have been assigned
> using the original timestamp.
>
> Grouping is by key and window.
>
> On Tue, Jan 14, 2020 at 7:30 PM Ankur Goenka  wrote:
>
>> Hi,
>>
>> I am not sure about the effect of the order of element timestamp change
>> and window association has on a group by key.
>> More specifically, what would be the behavior if we apply window ->
>> change element timestamp -> Group By key.
>> I think we should always apply window function after changing the
>> timestamp of elements. Though this is neither checked nor a recommended
>> practice in Beam.
>>
>> Example pipeline would look like this:
>>
>>   import time
>>   import apache_beam as beam
>>   from apache_beam.transforms import window
>>
>>   def applyTimestamp(value):
>>     # 'key' is a placeholder; the original snippet left it undefined.
>>     return window.TimestampedValue(('key', value), int(time.time()))
>>
>>   # p is an existing beam.Pipeline.
>>   # Timestamp is changed after windowing and before GBK.
>>   (p
>>    | 'Create' >> beam.Create(range(0, 10))
>>    | 'Fixed Window' >> beam.WindowInto(window.FixedWindows(5))
>>    | 'Apply Timestamp' >> beam.Map(applyTimestamp)
>>    | 'Group By Key' >> beam.GroupByKey()
>>    | 'Print' >> beam.Map(print))
>>
>> Thanks,
>> Ankur
>>
>


Ordering of element timestamp change and window function

2020-01-14 Thread Ankur Goenka
Hi,

I am not sure what effect the ordering of an element timestamp change and
window assignment has on a GroupByKey.
More specifically, what would be the behavior if we apply window -> change
element timestamp -> Group By Key?
I think we should always apply the window function after changing the
timestamp of elements, though this is neither checked nor a recommended
practice in Beam.

Example pipeline would look like this:

  import time
  import apache_beam as beam
  from apache_beam.transforms import window

  def applyTimestamp(value):
    # 'key' is a placeholder; the original snippet left it undefined.
    return window.TimestampedValue(('key', value), int(time.time()))

  # p is an existing beam.Pipeline.
  # Timestamp is changed after windowing and before GBK.
  (p
   | 'Create' >> beam.Create(range(0, 10))
   | 'Fixed Window' >> beam.WindowInto(window.FixedWindows(5))
   | 'Apply Timestamp' >> beam.Map(applyTimestamp)
   | 'Group By Key' >> beam.GroupByKey()
   | 'Print' >> beam.Map(print))

Thanks,
Ankur


Re: [PROPOSAL] Transition released containers to the official ASF dockerhub organization

2020-01-10 Thread Ankur Goenka
Also curious to know whether Apache provides any infra support for projects
under the Apache umbrella, and what quota limits they might have.

On Fri, Jan 10, 2020, 2:26 PM Robert Bradshaw  wrote:

> One downside is that, unlike many of these projects, we release a
> dozen or so containers. Is there exactly (and only) one level of
> namespacing/nesting we can leverage here? (This isn't a blocker, but
> something to consider.)
>
> On Fri, Jan 10, 2020 at 2:06 PM Hannah Jiang 
> wrote:
> >
> > Thanks Ahmet for proposing it.
> > I will take it and work towards v2.19.
> >
> > Hannah
> >
> > On Fri, Jan 10, 2020 at 1:50 PM Kyle Weaver  wrote:
> >>
> >> It'd be nice to have the clout/official sheen of apache attached to our
> containers. Although getting the required permissions might add some small
> overhead to the release process. For example, yesterday, when we needed to
> create new repositories (not just update existing ones), since we have
> top-level ownership of the apachebeam organization, it was quick and easy
> to add them. I imagine we'd have had to get approval from someone outside
> the project to do that under the apache org. But this won't need to happen
> very often, so it's probably not that big a deal.
> >>
> >> On Fri, Jan 10, 2020 at 1:40 PM Ahmet Altay  wrote:
> >>>
> >>> Hi all,
> >>>
> >>> I saw recent progress on the containers and wanted to bring this
> question to the attention of the dev list.
> >>>
> >>> Would it be possible to use the official ASF dockerhub organization
> for new Beam container releases? Concretely, starting from 2.19 could we
> release Beam containers to https://hub.docker.com/u/apache instead of
> https://hub.docker.com/u/apachebeam ?
> >>>
> >>> Ahmet
>


Re: [ANNOUNCE] Beam 2.17.0 Released!

2020-01-10 Thread Ankur Goenka
Thanks for being persistent and powering through all the issues.

On Fri, Jan 10, 2020 at 10:23 AM Maximilian Michels  wrote:

> At last :) Thank you for making it happen Mikhail! Also thanks to
> everyone else who tested the release candidate.
>
> Cheers,
> Max
>
> On 10.01.20 19:01, Mikhail Gryzykhin wrote:
> > The Apache Beam team is pleased to announce the release of version
> 2.17.0.
> >
> > Apache Beam is an open source unified programming model to define and
> > execute data processing pipelines, including ETL, batch and stream
> > (continuous) processing. See https://beam.apache.org
> > 
> >
> > You can download the release here:
> >
> > https://beam.apache.org/get-started/downloads/
> >
> > This release includes bug fixes, features, and improvements detailed on
> > the Beam blog: https://beam.apache.org/blog/2020/01/06/beam-2.17.0.html
> > 
> >
> > Thanks to everyone who contributed to this release, and we hope you
> > enjoy using Beam 2.17.0.
>


Re: Cleaning up SDK docker image tagging

2020-01-09 Thread Ankur Goenka
>> For the released version of SDKs, the default tag will be version
number. (ex: 2.17.0)
+1

>> For the unreleased version of SDKs, the default tag will be version
number + '.dev'. (ex: 2.18.0.dev)
Shall we ALSO tag locally built images with the git commit version, to keep
track of obsolete images?
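
As a rough sketch of the tagging rule under discussion: the helper below
is hypothetical (default_image_tags is not an existing Beam function), and
using a 12-character short commit hash as the extra tag is only an
assumption about what the git commit tag could look like.

```python
def default_image_tags(version, released, git_commit=None):
    """Return the tags a locally built SDK image would get.

    Released SDKs use the plain version; unreleased SDKs use version + '.dev'.
    For unreleased builds, an optional short git commit hash is added as an
    extra tag (an assumption, not an agreed format).
    """
    tags = [version if released else version + ".dev"]
    if not released and git_commit:
        tags.append(git_commit[:12])
    return tags


print(default_image_tags("2.17.0", released=True))
print(default_image_tags("2.18.0", released=False, git_commit="abcdef0123456789"))
```

With the commit tag in place, two local images that both carry 2.18.0.dev
can still be told apart by the commit they were built from.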

On Thu, Jan 9, 2020 at 4:54 PM Kyle Weaver  wrote:

> > This has a minor downside for the users who are using unreleased
> versions. They need to build a local image first before using docker to run.
>
> Isn't that the current behavior?
>
> On Thu, Jan 9, 2020 at 4:48 PM Hannah Jiang 
> wrote:
>
>> Hi Community
>>
>> Now we are using different default tags for Python(version or version.dev),
>> Java(version-SNAPSHOT) and Go(latest). I would like to clean it up and make
>> it consistent for all languages and here is my proposal.
>>
>> For the released version of SDKs, the default tag will be version number.
>> (ex: 2.17.0)
>> For the unreleased version of SDKs, the default tag will be version
>> number + '.dev'. (ex: 2.18.0.dev)
>>
>> The default tag is used 1). when we build docker images without
>> specifying a tag. 2) when we run a job with runners running on dockers with
>> default docker images.
>>
>> Additionally, Beam will always lookup images locally before pulling one
>> from remote, so the images built locally will not be overwritten by remote
>> ones.
>>
>> This has a minor downside for the users who are using unreleased
>> versions. They need to build a local image first before using docker to
>> run. I will add a clear error message to show the problem and add a link to
>> a documentation of how to create images.
>>
>> I would like to collect feedback from whoever uses dockers. Does this
>> sound good? Is there anything I am missing?
>>
>> Thanks,
>> Hannah
>>
>>
>>
>>
>>
>>
>>
>>


Re: Java PortableRunner GBK load test fails

2019-12-17 Thread Ankur Goenka
The connection closing is a red herring, as that error gets printed when
the SDK harness dies.
More logs or a Jenkins link would be useful to understand what's going on
with the pipeline.

On Tue, Dec 17, 2019 at 5:53 AM Michał Walenia 
wrote:

> Hi there,
> I'm trying to add a Jenkins job for a load test of GBK on portable Flink
> in Java. I encountered a problem - the test fails with an exception that
> doesn't say much (Exception in thread "main" java.lang.RuntimeException:
> Invalid job state: FAILED.)
>
> After some investigation, I found where the exception originates: 
> SdkHarnessClient.java:320.
> The code there throws an exception related to aborting the processed bundle
> and to BEAM-3962 issue.
>
> I have no idea how to debug the problem (it appears when the test size is
> about 2GB) or how to fix the issue.
> Can anyone assist me? I created a JIRA for this:
> BEAM-8980 
>
> Thanks!
>
> Michal
> --
>
> Michał Walenia
> Polidea  | Software Engineer
>
> M: +48 791 432 002 <+48791432002>
> E: michal.wale...@polidea.com
>
> Unique Tech
> Check out our projects! 
>


Re: Pipeline parameters for running jobs in a cluster

2019-12-10 Thread Ankur Goenka
Hi Matthew,

For 1: Beam does not compute the right configuration for the pipeline, so
it's recommended to tune it manually, as is done for regular Spark jobs.

For 2: The recommendation is the same as for a regular Spark job.

Thanks,
Ankur

On Tue, Dec 10, 2019 at 2:46 PM Matthew K.  wrote:

> Hi,
>
> To run a beam job on a spark cluster with some number of nodes running:
>
> 1. Is it recommended to set pipeline parameters --num_workers,
> --max_num_workers, --autoscaling_algorithms, --worker_machine_type, etc, or
> beam (spark) will figure that out?
>
> 2. If that is recommended to set those params, what are the recommended
> values based on the machines and resources in the cluster?
>
> Thanks
>


Re: Version Beam Website Documentation

2019-12-04 Thread Ankur Goenka
I agree: having a single website showcases the latest Beam version and
encourages users to use it, which is very useful.
Calling out version limitations definitely makes users' lives easier.

The use case I have in mind is more along the lines of best practices and
the recommended way of doing things.
One such example is the way we recommend new users try Portable Flink.
We are overhauling and simplifying the user onboarding experience. Though
the old way of doing things is still supported, the easier new
recommendation for onboarding will only apply from Beam 2.18.
We can of course create sections in the documentation for this use case,
but that seems like a poor man's way of versioning :)

You also highlighted a great use case about the LTS release. Should we
simply separate out the documentation for the LTS release and the current
version, to make it easy for users to navigate the website and to reduce
the management overhead of updating specific sections?

A few areas that might benefit from having multiple versions are the
compatibility matrix, common pipeline patterns, the transform catalog, and
the runner pages.


On Wed, Dec 4, 2019 at 6:19 AM Jeff Klukas  wrote:

> The API reference docs (Java and Python at least) are versioned, so we
> have a durable reference there and it's possible to link to particular
> sections of API docs for particular versions.
>
> For the major bits of introductory documentation (like the Beam
> Programming Guide), I think it's a good thing to have only a single
> version, so that people referencing it are always getting the most
> up-to-date wording and explanations, although it may be worth adding
> callouts there about minimum versions anywhere we discuss newer features.
> We should be encouraging the community to stay reasonably current, so I
> think any feature that's present in the latest LTS release should be fine
> to assume is available to users, although perhaps we should also state that
> explicitly on the website.
>
> Are there particular parts of the Beam website that you have in mind that
> would benefit from versioning? Are there specific cases you see where the
> current website would be confusing for someone using a Beam SDK that's a
> few versions old?
>
> On Tue, Dec 3, 2019 at 6:46 PM Ankur Goenka  wrote:
>
>> Hi,
>>
>> We are constantly adding features to Beam which makes each new Beam
>> version more feature rich and compelling.
>> This also means that the old Beam released don't have the new features
>> and might have different ways to do certain things.
>>
>> (I might be wrong here) - Our Beam website only publish a single version
>> which is the latest version of documentation.
>> This means that the users working with older SDK don't really have an
>> easy way to lookup documentation for old versions of Beam.
>>
>> Proposal: Shall we consider publishing versioned Beam website to help
>> users with old Beam version find the relevant information?
>>
>> Thanks,
>> Ankur
>>
>


Version Beam Website Documentation

2019-12-03 Thread Ankur Goenka
Hi,

We are constantly adding features to Beam, which makes each new Beam
version more feature rich and compelling.
This also means that old Beam releases don't have the new features and
might have different ways to do certain things.

(I might be wrong here) - Our Beam website only publishes a single version,
which is the latest version of the documentation.
This means that users working with an older SDK don't really have an easy
way to look up documentation for old versions of Beam.

Proposal: Shall we consider publishing a versioned Beam website to help
users of old Beam versions find the relevant information?

Thanks,
Ankur


Re: [ANNOUNCE] New committer: Daniel Oliveira

2019-11-27 Thread Ankur Goenka
Congrats Daniel!

On Mon, Nov 25, 2019 at 10:02 PM Tanay Tummalapalli 
wrote:

> Congratulations!
>
> On Mon, Nov 25, 2019 at 11:12 PM Mark Liu  wrote:
>
>> Congratulations, Daniel!
>>
>> On Mon, Nov 25, 2019 at 9:31 AM Ahmet Altay  wrote:
>>
>>> Congratulations, Daniel!
>>>
>>> On Sat, Nov 23, 2019 at 3:47 AM jincheng sun 
>>> wrote:
>>>

 Congrats, Daniel!
 Best,
 Jincheng

 Alexey Romanenko  于2019年11月22日周五 下午5:47写道:

> Congratulations, Daniel!
>
> On 22 Nov 2019, at 09:18, Jan Lukavský  wrote:
>
> Congrats Daniel!
> On 11/21/19 10:11 AM, Gleb Kanterov wrote:
>
> Congratulations!
>
> On Thu, Nov 21, 2019 at 6:24 AM Thomas Weise  wrote:
>
>> Congratulations!
>>
>>
>> On Wed, Nov 20, 2019, 7:56 PM Chamikara Jayalath <
>> chamik...@google.com> wrote:
>>
>>> Congrats!!
>>>
>>> On Wed, Nov 20, 2019 at 5:21 PM Daniel Oliveira <
>>> danolive...@google.com> wrote:
>>>
 Thank you everyone! I won't let you down. o7

 On Wed, Nov 20, 2019 at 2:12 PM Ruoyun Huang 
 wrote:

> Congrats Daniel!
>
> On Wed, Nov 20, 2019 at 1:58 PM Robert Burke 
> wrote:
>
>> Congrats Daniel! Much deserved.
>>
>> On Wed, Nov 20, 2019, 12:49 PM Udi Meiri 
>> wrote:
>>
>>> Congrats Daniel!
>>>
>>> On Wed, Nov 20, 2019 at 12:42 PM Kyle Weaver <
>>> kcwea...@google.com> wrote:
>>>
 Congrats Dan! Keep up the good work :)

 On Wed, Nov 20, 2019 at 12:41 PM Cyrus Maden 
 wrote:

> Congratulations! This is great news.
>
> On Wed, Nov 20, 2019 at 3:24 PM Rui Wang 
> wrote:
>
>> Congrats!
>>
>>
>> -Rui
>>
>> On Wed, Nov 20, 2019 at 11:48 AM Valentyn Tymofieiev <
>> valen...@google.com> wrote:
>>
>>> Congrats, Daniel!
>>>
>>> On Wed, Nov 20, 2019 at 11:47 AM Kenneth Knowles <
>>> k...@apache.org> wrote:
>>>
 Hi all,

 Please join me and the rest of the Beam PMC in welcoming a
 new committer: Daniel Oliveira

 Daniel introduced himself to dev@ over two years ago and
 has contributed in many ways since then. Daniel has 
 contributed to general
 project health, the portability framework, and all three 
 languages: Java,
 Python SDK, and Go. I would like to particularly highlight how 
 he deleted
 12k lines of dead reference runner code [1].

 In consideration of Daniel's contributions, the Beam PMC
 trusts him with the responsibilities of a Beam committer [2].

 Thank you, Daniel, for your contributions and looking
 forward to many more!

 Kenn, on behalf of the Apache Beam PMC

 [1] https://github.com/apache/beam/pull/8380
 [2]
 https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer

>>>
>
> --
> 
> Ruoyun  Huang
>
>
>


Re: [ANNOUNCE] New committer: Brian Hulette

2019-11-15 Thread Ankur Goenka
Congrats Brian!

On Fri, Nov 15, 2019, 2:42 PM Jan Lukavský  wrote:

> Congrats Brian!
> On 11/15/19 9:58 AM, Reza Rokni wrote:
>
> Great news!
>
> On Fri, 15 Nov 2019 at 15:09, Gleb Kanterov  wrote:
>
>> Congratulations!
>>
>> On Fri, Nov 15, 2019 at 5:44 AM Valentyn Tymofieiev 
>> wrote:
>>
>>> Congratulations, Brian!
>>>
>>> On Thu, Nov 14, 2019 at 6:25 PM jincheng sun 
>>> wrote:
>>>
 Congratulation Brian!

 Best,
 Jincheng

 Kyle Weaver  于2019年11月15日周五 上午7:19写道:

> Thanks for your contributions and congrats Brian!
>
> On Thu, Nov 14, 2019 at 3:14 PM Kenneth Knowles 
> wrote:
>
>> Hi all,
>>
>> Please join me and the rest of the Beam PMC in welcoming a new
>> committer: Brian Hulette
>>
>> Brian introduced himself to dev@ earlier this year and has been
>> contributing since then. His contributions to Beam include explorations 
>> of
>> integration with Arrow, standardizing coders, portability for schemas, 
>> and
>> presentations at Beam events.
>>
>> In consideration of Brian's contributions, the Beam PMC trusts him
>> with the responsibilities of a Beam committer [1].
>>
>> Thank you, Brian, for your contributions and looking forward to many
>> more!
>>
>> Kenn, on behalf of the Apache Beam PMC
>>
>> [1]
>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>
>
>
> --
>
> This email may be confidential and privileged. If you received this
> communication by mistake, please don't forward it to anyone else, please
> erase all copies and attachments, and please let me know that it has gone
> to the wrong person.
>
> The above terms reflect a potential business arrangement, are provided
> solely as a basis for further discussion, and are not intended to be and do
> not constitute a legally binding obligation. No legally binding obligations
> will be created, implied, or inferred until an agreement in final form is
> executed in writing by all parties involved.
>
>


Re: [PROPOSAL] Preparing for Beam 2.17.0 release

2019-10-15 Thread Ankur Goenka
Thanks Mikhail!

+1

On Tue, Oct 15, 2019 at 11:38 AM Mikhail Gryzykhin 
wrote:

> Hi all,
>
> Beam 2.17 release branch cut is scheduled on Oct 23 according to the
> release calendar [1]. I would like to volunteer myself to do this release.
> The plan is to cut the branch on that date, and cherrypick release-blocking
> fixes afterwards if any.
>
> If you have release blocking issues for 2.17 please mark their "Fix
> Version" as 2.17.0 [2]. This tag is already created in JIRA in case you
> would like to move any non-blocking issues to that version.
>
> Any thoughts, comments, objections?
>
> Regards.
> Mikhail Gryzykhin
>
> [1]
> https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com
> [2]
> https://issues.apache.org/jira/browse/BEAM-8403?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Reopened%2C%20Open%2C%20%22In%20Progress%22%2C%20%22Under%20Discussion%22%2C%20%22In%20Implementation%22%2C%20%22Triage%20Needed%22)%20AND%20fixVersion%20%3D%202.17.0
>


Re: [VOTE] Sign a pledge to discontinue support of Python 2 in 2020.

2019-10-01 Thread Ankur Goenka
+1

On Tue, Oct 1, 2019 at 4:27 PM Ruoyun Huang  wrote:

> +1
>
> On Tue, Oct 1, 2019 at 3:52 PM Rui Wang  wrote:
>
>> +1
>>
>> I needed to use https://python3statement.org to access the website BTW
>> (https, not http).
>>
>>
>> -Rui
>>
>> On Tue, Oct 1, 2019 at 3:29 PM Cam Mach  wrote:
>>
>>> +1
>>>
>>>
>>>
>>> On Tue, Oct 1, 2019 at 9:44 AM Udi Meiri  wrote:
>>>
 +1

 On Tue, Oct 1, 2019 at 3:22 AM Łukasz Gajowy 
 wrote:

> +1
>
> wt., 1 paź 2019 o 11:29 Maximilian Michels 
> napisał(a):
>
>> +1
>>
>> On 30.09.19 23:03, Reza Rokni wrote:
>> > +1
>> >
>> > On Tue, 1 Oct 2019 at 13:54, Tanay Tummalapalli <
>> ttanay...@gmail.com
>> > > wrote:
>> >
>> > +1
>> >
>> > On Tue, Oct 1, 2019 at 8:19 AM Suneel Marthi <
>> smar...@apache.org
>> > > wrote:
>> >
>> > +1
>> >
>> > On Mon, Sep 30, 2019 at 10:33 PM Manu Zhang
>> > mailto:owenzhang1...@gmail.com>>
>> wrote:
>> >
>> > +1
>> >
>> > On Tue, Oct 1, 2019 at 9:44 AM Austin Bennett
>> > > > > wrote:
>> >
>> > +1
>> >
>> > On Mon, Sep 30, 2019 at 5:22 PM Valentyn Tymofieiev
>> > mailto:valen...@google.com>>
>> wrote:
>> >
>> > Hi everyone,
>> >
>> > Please vote whether to sign a pledge on behalf
>> of
>> > Apache Beam to sunset Beam Python 2 offering
>> (in new
>> > releases) in 2020 on http://python3stament.org
>>  as
>> > follows:
>> >
>> > [ ] +1: Sign a pledge to discontinue support of
>> > Python 2 in Beam in 2020.
>> > [ ] -1: Do not sign a pledge to discontinue
>> support
>> > of Python 2 in Beam in 2020.
>> >
>> > The motivation and details for this vote were
>> > discussed in [1, 2]. Please follow up in [2] if
>> you
>> > have any questions.
>> >
>> > This is a procedural vote [3] that will follow
>> the
>> > majority approval rules and will be open for at
>> > least 72 hours.
>> >
>> > Thanks,
>> > Valentyn
>> >
>> > [1]
>> >
>> https://lists.apache.org/thread.html/eba6caa58ea79a7ecbc8560d1c680a366b44c531d96ce5c699d41535@%3Cdev.beam.apache.org%3E
>> > [2]
>> >
>> https://lists.apache.org/thread.html/456631fe1a696c537ef8ebfee42cd3ea8121bf7c639c52da5f7032e7@%3Cdev.beam.apache.org%3E
>> > [3]
>> https://www.apache.org/foundation/voting.html
>> >
>> >
>> >
>> > --
>> >
>> > This email may be confidential and privileged. If you received this
>> > communication by mistake, please don't forward it to anyone else,
>> please
>> > erase all copies and attachments, and please let me know that it
>> has
>> > gone to the wrong person.
>> >
>> > The above terms reflect a potential business arrangement, are
>> provided
>> > solely as a basis for further discussion, and are not intended to
>> be and
>> > do not constitute a legally binding obligation. No legally binding
>> > obligations will be created, implied, or inferred until an
>> agreement in
>> > final form is executed in writing by all parties involved.
>> >
>>
>
>
> --
> 
> Ruoyun  Huang
>
>


Re: [ANNOUNCE] New committer: Alan Myrvold

2019-09-27 Thread Ankur Goenka
Congratulations Alan!

On Fri, Sep 27, 2019 at 11:17 AM Yichi Zhang  wrote:

> Congrats, Alan!
>
> On Fri, Sep 27, 2019 at 10:26 AM Robin Qiu  wrote:
>
>> Congrats, Alan!
>>
>> On Fri, Sep 27, 2019 at 10:15 AM Hannah Jiang 
>> wrote:
>>
>>> Congrats Alan!
>>>
>>> On Fri, Sep 27, 2019 at 9:57 AM Ruoyun Huang  wrote:
>>>
 Congratulations, Alan!


 On Fri, Sep 27, 2019 at 9:55 AM Rui Wang  wrote:

> Congrats!
>
> -Rui
>
> On Fri, Sep 27, 2019 at 9:54 AM Pablo Estrada 
> wrote:
>
>> Yooh! : D
>>
>> On Fri, Sep 27, 2019 at 9:53 AM Yifan Zou 
>> wrote:
>>
>>> Congratulations, Alan!
>>>
>>> On Fri, Sep 27, 2019 at 9:18 AM Ahmet Altay 
>>> wrote:
>>>
 Hi,

 Please join me and the rest of the Beam PMC in welcoming a new
 committer: Alan Myrvold

 Alan has been a long time Beam contributor. His contributions made
 Beam more productive and friendlier [1] for all contributors with
 significant improvements to Beam release process, automation, and
 infrastructure.

 In consideration of Alan's contributions, the Beam PMC trusts him
 with the responsibilities of a Beam committer [2].

 Thank you, Alan, for your contributions and looking forward to many
 more!

 Ahmet, on behalf of the Apache Beam PMC

 [1]
 https://beam-summit-na-2019.firebaseapp.com/schedule/2019-09-11?sessionId=1126
 [2] https://beam.apache.org/contribute/become-a-committer
 /#an-apache-beam-committer

>>>

 --
 
 Ruoyun  Huang




Re: Collecting feedback for Beam usage

2019-09-23 Thread Ankur Goenka
I agree, these are the questions that need to be answered.
The data can be anonymized and stored as public data in BigQuery or some
other place.

The intent is to collect usage statistics, so that we get to know how many
people are using Flink or Spark etc. It is not intended as a discussion or
help channel.
I also think that we don't need to monitor this actively, as it's more like
a survey than an active channel for getting issues resolved.

If we think it's useful for the community, then we can come up with a
solution for how to do this (similar to how we released the container
images).



On Fri, Sep 20, 2019 at 4:38 PM Kyle Weaver  wrote:

> There are some logistics that would need worked out. For example, Where
> would the data go? Who would own it?
>
> Also, I'm not convinced we need yet another place to discuss Beam when we
> already have discussed the challenge of simultaneously monitoring mailing
> lists, Stack Overflow, Slack, etc. While "how do you use Beam" is certainly
> an interesting question, and I'd be curious to know that >= X many people
> use a certain runner, I'm not sure answers to these questions are as useful
> for guiding the future of Beam as discussions on the dev/users lists, etc.
> as the latter likely result in more depth/specific feedback.
>
> However, I do think it could be useful in general to include links
> directly in the console output. For example, maybe something along the
> lines of "Oh no, your Flink pipeline crashed! Check Jira/file a bug/ask the
> mailing list."
>
> Kyle Weaver | Software Engineer | github.com/ibzib | kcwea...@google.com
>
>
> On Fri, Sep 20, 2019 at 4:14 PM Ankur Goenka  wrote:
>
>> Hi,
>>
>> At the moment we don't really have a good way to collect any usage
>> statistics for Apache Beam. Like runner used etc. As many of the users
>> don't really have a way to report their usecase.
>> How about if we create a feedback page where users can add their pipeline
>> details and usecase.
>> Also, we can start printing the link to this page when user launch the
>> pipeline in the command line.
>> Example:
>> $ python my_pipeline.py --runner DirectRunner --input /tmp/abc
>>
>> Starting pipeline
>> Please use
>> http://feedback.beam.org?args=runner=DirectRunner,input=/tmp/abc
>> Pipeline started
>> ..
>>
>> Using a link and not publishing the data automatically will give user
>> control over what they publish and what they don't. We can enhance the text
>> and usage further but the basic idea is to ask for user feeback at each run
>> of the pipeline.
>> Let me know what you think.
>>
>>
>> Thanks,
>> Ankur
>>
>


Collecting feedback for Beam usage

2019-09-20 Thread Ankur Goenka
Hi,

At the moment we don't really have a good way to collect usage statistics
for Apache Beam, such as which runner is used, and many users don't really
have a way to report their use case.
How about creating a feedback page where users can add their pipeline
details and use case?
Also, we can start printing the link to this page when a user launches the
pipeline from the command line.
Example:
$ python my_pipeline.py --runner DirectRunner --input /tmp/abc

Starting pipeline
Please use http://feedback.beam.org?args=runner=DirectRunner,input=/tmp/abc
Pipeline started
..

Using a link and not publishing the data automatically gives users control
over what they publish and what they don't. We can enhance the text and
usage further, but the basic idea is to ask for user feedback at each run
of the pipeline.
Let me know what you think.
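
A minimal sketch of how such a link could be built, assuming the
placeholder URL from the example above and a hypothetical helper
build_feedback_url (neither exists in Beam today). Only options that are
explicitly allowed are put into the link, so paths and other details stay
private unless the user opts in.

```python
from urllib.parse import urlencode

# Placeholder taken from the example above; not a real endpoint.
FEEDBACK_BASE_URL = "http://feedback.beam.org"


def build_feedback_url(options, include=("runner",)):
    """Build a feedback link containing only the allowed pipeline options."""
    shared = {k: v for k, v in options.items() if k in include}
    args = ",".join(f"{k}={v}" for k, v in sorted(shared.items()))
    # urlencode escapes '=' and ',' so the whole list is a single parameter.
    return FEEDBACK_BASE_URL + "?" + urlencode({"args": args})


url = build_feedback_url({"runner": "DirectRunner", "input": "/tmp/abc"})
print("Please use", url)
```

By default only the runner name is included here; passing
include=("runner", "input") would add the input path as well.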


Thanks,
Ankur


Re: Improve container support

2019-09-05 Thread Ankur Goenka
Please ignore the previous email. I was looking at the older document in
the mail thread.

On Thu, Sep 5, 2019 at 4:58 PM Ankur Goenka  wrote:

> I think sdk in the name is obsolete as they are all under sdks name space.
>
> On Thu, Sep 5, 2019 at 3:26 PM Hannah Jiang 
> wrote:
>
>> Hi Team
>>
>> Thanks for all the comments about beam containers.
>> After considering various opinions and investigating gcr and docker hub,
>> we decided to push images to docker hub.
>>
>> Each image will have two tags, {version}_rc and {version}. {version} tag
>> will be added after the release candidate image is verified.
>> Meanwhile, we will have* latest* tag for each repository, which always
>> points to the most recent verified release image, so users can pull it by
>> default.
>>
>> Docker hub doesn't support leveled repository, which means we should
>> follow *repository:tag* format.
>> it's too general if we use {language_version} as repository for SDK
>> images. (version is added when we support multiple versions.)
>> So I would like to include *sdk* to repository. Images generated at
>> local will also have the same name.
>> Here are some examples:
>>
>>- python2.7_sdk:2.15.0
>>- java_sdk:2.15.0_rc
>>- go_sdk:latest
>>
>> I will proceed with this format if there is no strong opposition by
>> tomorrow noon(PST).
>>
>> *To PMC members*:
>> Permission control will follow the pypi model. All interested PMC members
>> will be added as admins and release managers will be granted push
>> permission.
>> Please let me know your *docker id* if you want to be added as an admin.
>>
>> Thanks,
>> Hannah
>>
>>
>>
>>
>>
>>
>>
>>
>> On Wed, Sep 4, 2019 at 3:47 PM Thomas Weise  wrote:
>>
>>> This will greatly simplify trying out portable runners:
>>> https://beam.apache.org/documentation/runners/flink/#executing-a-beam-pipeline-on-a-flink-cluster
>>>
>>> Can't wait for following to disappear from the instructions page: ./gradlew
>>> :sdks:python:container:docker
>>>
>>> On Wed, Sep 4, 2019 at 3:35 PM Thomas Weise  wrote:
>>>
>>>> Awesome, thank you!
>>>>
>>>>
>>>> On Wed, Sep 4, 2019 at 3:22 PM Hannah Jiang 
>>>> wrote:
>>>>
>>>>> Hi Thomas
>>>>>
>>>>> I created snapshot images from head as of around 2PM today.
>>>>> You can pull images from gcr.io/apache-beam-testing/beam/sdks/snapshot
>>>>> .
>>>>>
>>>>> Thanks,
>>>>> Hannah
>>>>>
>>>>> On Wed, Sep 4, 2019 at 1:41 PM Thomas Weise  wrote:
>>>>>
>>>>>> Hi Hannah,
>>>>>>
>>>>>> Thank you, I know how to build the containers locally, but not how to
>>>>>> publish them!
>>>>>>
>>>>>> The cwiki says "Publishing images to gcr.io/beam requires
>>>>>> permissions in apache-beam-testing project."
>>>>>>
>>>>>> Can I get access to the testing project (at least temporarily) and
>>>>>> what would I need to setup to run the publish target that is shown on 
>>>>>> cwiki?
>>>>>>
>>>>>> Thanks,
>>>>>> Thomas
>>>>>>
>>>>>>
>>>>>> On Wed, Sep 4, 2019 at 11:06 AM Hannah Jiang 
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Thomas
>>>>>>>
>>>>>>> I haven't uploaded any snapshot images yet. Here is how you can
>>>>>>> create one from head.
>>>>>>> > cd [...]/beam/
>>>>>>> # For Python
>>>>>>> > ./gradlew :sdks:python:container:py{version}:docker *where
>>>>>>> version is {2,35,36,37}*
>>>>>>> # For Java
>>>>>>> > ./gradlew -p sdks/java/container docker
>>>>>>> # For Go
>>>>>>> > ./gradlew -p sdks/go/container docker
>>>>>>>
>>>>>>> The 2.15 one is just for testing, not a real 2.15.0, nor a snapshot
>>>>>>> from head.
>>>>>>>
>>>>>>> Please let me know if you have any questions.
>>>>>>> Hannah
>>>>>>>
>>>>>>> On Wed, Sep 4, 2019 at 10:57 AM Th

Re: Improve container support

2019-09-05 Thread Ankur Goenka
I think "sdk" in the name is redundant, as the images are all under the
sdks namespace.
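
For reference, the repository:tag format proposed in the message quoted
below can be sketched as follows. sdk_image_name is a hypothetical helper
written only to reproduce the three examples in that message; it is not
part of the Beam build.

```python
def sdk_image_name(language, version, release_candidate=False, language_version=""):
    """Return '<language><language_version>_sdk:<tag>' for an SDK image."""
    repository = f"{language}{language_version}_sdk"
    # Release candidates get an '_rc' suffix until the image is verified.
    tag = f"{version}_rc" if release_candidate else version
    return f"{repository}:{tag}"


print(sdk_image_name("python", "2.15.0", language_version="2.7"))
print(sdk_image_name("java", "2.15.0", release_candidate=True))
print(sdk_image_name("go", "latest"))
```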

On Thu, Sep 5, 2019 at 3:26 PM Hannah Jiang  wrote:

> Hi Team
>
> Thanks for all the comments about beam containers.
> After considering various opinions and investigating gcr and docker hub,
> we decided to push images to docker hub.
>
> Each image will have two tags, {version}_rc and {version}. {version} tag
> will be added after the release candidate image is verified.
> Meanwhile, we will have* latest* tag for each repository, which always
> points to the most recent verified release image, so users can pull it by
> default.
>
> Docker hub doesn't support leveled repository, which means we should
> follow *repository:tag* format.
> it's too general if we use {language_version} as repository for SDK
> images. (version is added when we support multiple versions.)
> So I would like to include *sdk* to repository. Images generated at local
> will also have the same name.
> Here are some examples:
>
>- python2.7_sdk:2.15.0
>- java_sdk:2.15.0_rc
>- go_sdk:latest
>
> I will proceed with this format if there is no strong opposition by
> tomorrow noon(PST).
>
> *To PMC members*:
> Permission control will follow the pypi model. All interested PMC members
> will be added as admins and release managers will be granted push
> permission.
> Please let me know your *docker id* if you want to be added as an admin.
>
> Thanks,
> Hannah
>
>
>
>
>
>
>
>
> On Wed, Sep 4, 2019 at 3:47 PM Thomas Weise  wrote:
>
>> This will greatly simplify trying out portable runners:
>> https://beam.apache.org/documentation/runners/flink/#executing-a-beam-pipeline-on-a-flink-cluster
>>
>> Can't wait for following to disappear from the instructions page: ./gradlew
>> :sdks:python:container:docker
>>
>> On Wed, Sep 4, 2019 at 3:35 PM Thomas Weise  wrote:
>>
>>> Awesome, thank you!
>>>
>>>
>>> On Wed, Sep 4, 2019 at 3:22 PM Hannah Jiang 
>>> wrote:
>>>
 Hi Thomas

 I created snapshot images from head as of around 2PM today.
 You can pull images from gcr.io/apache-beam-testing/beam/sdks/snapshot.

 Thanks,
 Hannah

 On Wed, Sep 4, 2019 at 1:41 PM Thomas Weise  wrote:

> Hi Hannah,
>
> Thank you, I know how to build the containers locally, but not how to
> publish them!
>
> The cwiki says "Publishing images to gcr.io/beam requires permissions
> in apache-beam-testing project."
>
> Can I get access to the testing project (at least temporarily) and
> what would I need to set up to run the publish target that is shown on the
> cwiki?
>
> Thanks,
> Thomas
>
>
> On Wed, Sep 4, 2019 at 11:06 AM Hannah Jiang 
> wrote:
>
>> Hi Thomas
>>
>> I haven't uploaded any snapshot images yet. Here is how you can
>> create one from head.
>> > cd [...]/beam/
>> # For Python, where {version} is one of 2, 35, 36, 37
>> > ./gradlew :sdks:python:container:py{version}:docker
>> # For Java
>> > ./gradlew -p sdks/java/container docker
>> # For Go
>> > ./gradlew -p sdks/go/container docker
>>
>> The 2.15 one is just for testing, not a real 2.15.0, nor a snapshot
>> from head.
>>
>> Please let me know if you have any questions.
>> Hannah
>>
>> On Wed, Sep 4, 2019 at 10:57 AM Thomas Weise  wrote:
>>
>>> I actually found something in [1], but it is 2.15 unfortunately.
>>>
>>> [1]
>>> https://console.cloud.google.com/gcr/images/apache-beam-testing/GLOBAL/beam/sdks/release/python2.7?gcrImageListsize=30
>>>
>>> On Wed, Sep 4, 2019 at 10:35 AM Thomas Weise  wrote:
>>>
 Thanks for working on this. Do you happen to have publicly
 accessible snapshots published for your testing currently (even when the
 final location isn't sorted out)?

 I would like to use a 2.16 based Python SDK image for working on my
 downstream project, but could not find anything in
 gcr.io/apache-beam-testing/beam/sdks/rc/snapshot

 Thanks,
 Thomas

 On Fri, Aug 30, 2019 at 10:56 AM Valentyn Tymofieiev <
 valen...@google.com> wrote:

> On Tue, Aug 27, 2019 at 3:35 PM Hannah Jiang <
> hannahji...@google.com> wrote:
>
>> Hi team
>>
>> I am working on improving docker container support for Beam. We
>> would like to publish prebuilt containers for each release version and
>> daily snapshot. Current work focuses on release images only and it would
>> be part of the release process.
>>
>> The release images will be pushed to GCR which is publicly
>> accessible(pullable). We will use the following locations.
>> *Repository*: gcr.io/beam
>> *Project*: apache-beam-testing
>> More details, including naming and tagging scheme, can be found
>> at 

Re: [ANNOUNCE] New committer: Valentyn Tymofieiev

2019-08-26 Thread Ankur Goenka
Congratulations Valentyn!

On Mon, Aug 26, 2019, 5:02 PM Yifan Zou  wrote:

> Congratulations, Valentyn! Well deserved!
>
> On Mon, Aug 26, 2019 at 3:31 PM Aizhamal Nurmamat kyzy <
> aizha...@google.com> wrote:
>
>> Congratulations! and thank you for your contributions, Valentyn!
>>
>> On Mon, Aug 26, 2019 at 3:26 PM Thomas Weise  wrote:
>>
>>> Congrats!
>>>
>>>
>>> On Mon, Aug 26, 2019 at 3:22 PM Heejong Lee  wrote:
>>>
 Congratulations! :)

 On Mon, Aug 26, 2019 at 2:44 PM Rui Wang  wrote:

> Congratulations!
>
>
> -Rui
>
> On Mon, Aug 26, 2019 at 2:36 PM Hannah Jiang 
> wrote:
>
>> Congratulations Valentyn, well deserved!
>>
>> On Mon, Aug 26, 2019 at 2:34 PM Chamikara Jayalath <
>> chamik...@google.com> wrote:
>>
>>> Congrats Valentyn!
>>>
>>> On Mon, Aug 26, 2019 at 2:32 PM Pablo Estrada 
>>> wrote:
>>>
 Thanks Valentyn!

 On Mon, Aug 26, 2019 at 2:29 PM Robin Qiu 
 wrote:

> Thank you Valentyn! Congratulations!
>
> On Mon, Aug 26, 2019 at 2:28 PM Robert Bradshaw <
> rober...@google.com> wrote:
>
>> Hi,
>>
>> Please join me and the rest of the Beam PMC in welcoming a new
>> committer: Valentyn Tymofieiev
>>
>> Valentyn has made numerous contributions to Beam over the last several
>> years (including 100+ pull requests), most recently pushing through
>> the effort to make Beam compatible with Python 3. He is also an active
>> participant in design discussions on the list, participates in release
>> candidate validation, and proactively helps keep our tests green.
>>
>> In consideration of Valentyn's contributions, the Beam PMC trusts him
>> with the responsibilities of a Beam committer [1].
>>
>> Thank you, Valentyn, for your contributions and looking forward
>> to many more!
>>
>> Robert, on behalf of the Apache Beam PMC
>>
>> [1]
>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>
>


Re: Dataflow worker overview graphs

2019-08-08 Thread Ankur Goenka
Thanks Mikhail. This is really useful.
Do you also have something similar for the streaming use case, more
specifically for portable (fn_api) based streaming pipelines?


On Thu, Aug 8, 2019 at 2:08 PM Mikhail Gryzykhin  wrote:

> Hello everybody,
>
> Just wanted to share that I have found some graphs for dataflow worker I
> created while starting working on it. They cover specific scenarios, but
> may be useful for newcomers, so I put them into this wiki page
> 
> .
>
> If you feel they belong to some other location, please let me know.
>
> Regards,
> Mikhail.
>


Re: [ANNOUNCE] New committer: Kyle Weaver

2019-08-06 Thread Ankur Goenka
Congratulations Kyle!

On Tue, Aug 6, 2019 at 9:35 AM Ahmet Altay  wrote:

> Hi,
>
> Please join me and the rest of the Beam PMC in welcoming a new committer: Kyle
> Weaver.
>
> Kyle has been contributing to Beam for a while now. And in that time
> period Kyle got the portable spark runner feature complete for batch
> processing. [1]
>
> In consideration of Kyle's contributions, the Beam PMC trusts him with the
> responsibilities of a Beam committer [2].
>
> Thank you, Kyle, for your contributions and looking forward to many more!
>
> Ahmet, on behalf of the Apache Beam PMC
>
> [1]
> https://lists.apache.org/thread.html/c43678fc24c9a1dc9f48c51c51950aedcb9bc0fd3b633df16c3d595a@%3Cuser.beam.apache.org%3E
> [2] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>


Re: [ANNOUNCE] New committer: Rui Wang

2019-08-06 Thread Ankur Goenka
Congratulations Rui!
Well deserved 

On Tue, Aug 6, 2019 at 9:35 AM Ahmet Altay  wrote:

> Hi,
>
> Please join me and the rest of the Beam PMC in welcoming a new committer: Rui
> Wang.
>
> Rui has been an active contributor since May 2018. Rui has been very
> active in Beam SQL [1] and continues to help out on user@ and
> StackOverflow. Rui is one of the top answerers for apache-beam tag [2].
>
> In consideration of Rui's contributions, the Beam PMC trusts him with the
> responsibilities of a Beam committer [3].
>
> Thank you, Rui, for your contributions and looking forward to many more!
>
> Ahmet, on behalf of the Apache Beam PMC
>
> [1] https://github.com/apache/beam/pulls?q=is%3Apr+author%3Aamaliujia
> [2] https://stackoverflow.com/tags/apache-beam/topusers
> [3] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>


Re: [ANNOUNCE] New committer: Jan Lukavský

2019-07-31 Thread Ankur Goenka
Congratulations Jan!

On Wed, Jul 31, 2019, 1:23 AM David Morávek  wrote:

> Congratulations Jan, well deserved! ;)
>
> D.
>
> On Wed, Jul 31, 2019 at 10:17 AM Ryan Skraba  wrote:
>
>> Congratulations Jan!
>>
>> On Wed, Jul 31, 2019 at 10:10 AM Ismaël Mejía  wrote:
>> >
>> > Hi,
>> >
>> > Please join me and the rest of the Beam PMC in welcoming a new
>> > committer: Jan Lukavský.
>> >
>> > Jan has been contributing to Beam for a while, he was part of the team
>> > that contributed the Euphoria DSL extension, and he has done
>> > interesting improvements for the Spark and Direct runner. He has also
>> > been active in the community discussions around the Beam model and
>> > other subjects.
>> >
>> > In consideration of Jan's contributions, the Beam PMC trusts him with
>> > the responsibilities of a Beam committer [1].
>> >
>> > Thank you, Jan, for your contributions and looking forward to many more!
>> >
>> > Ismaël, on behalf of the Apache Beam PMC
>> >
>> > [1] https://beam.apache.org/committer/committer
>>
>


Re: Docker Run Options in SDK Container

2019-07-16 Thread Ankur Goenka
Thanks for summarizing the discussion.

A few comments inline below:


On Mon, Jul 15, 2019 at 5:28 PM Sam Bourne  wrote:

> Hello Beam devs,
>
> I’ve opened a PR (https://github.com/apache/beam/pull/8982) to support
> passing options/flags to the docker run command executed as part of the
> portable environment workflow. I’m in need of providing specific volumes
> and possibly other docker run options as I refine our custom container and
> workflow.
>
> There were requests to bring this up in the mailing list to discuss
> possible ways to achieve this. There’s an existing PR #8828, but we took
> quite different approaches. #8828 is limited to only mounting /tmp/
> directories with no support for other docker run options/flags, so it
> wouldn’t solve my needs.
>
> I chose to expand upon the existing flag environment_config and provide
> the additional docker run options there. This requires the SDK parse these
> out when building the DockerPayload protobuf. It’s worth noting that what
> is provided to environment_config changes depending on the
> environment_type. e.g. if environment_type is docker, environment_config
> is currently expected to be the docker container name, but other
> environment types have completely different expectations, and each uses its
> own protobuf message type.
>
> The current method (using python SDK) looks like this:
>
> python -m mymodule --runner PortableRunner --job_endpoint localhost:8099
> --environment_type DOCKER --environment_config MY_CONTAINER_NAME
>
> My PR expects other run options to be provided before the container name -
> similar to how you would start the container locally:
>
> python -m mymodule --runner PortableRunner --job_endpoint localhost:8099
> --environment_type DOCKER --environment_config "-v
> /Volumes/mnt/foo:/Volumes/mnt/foo -v /Volumes/mnt/bar:/Volumes/mnt/bar --user
> sambvfx MY_CONTAINER_NAME"
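
To make the proposed format concrete, here is a minimal sketch of how an SDK could split such an environment_config value into run options and a container name. It is not the code from the PR; the function name is hypothetical, and it assumes the container name is always the last shell token.

```python
import shlex


def split_environment_config(environment_config):
    """Split "<docker run options> <container name>" into its two parts.

    Assumption for illustration: the container name is the final token and
    everything before it is passed through as docker run options.
    """
    tokens = shlex.split(environment_config)
    if not tokens:
        raise ValueError("environment_config must name a container")
    *run_options, image = tokens
    return image, run_options


# A bare container name still works, so existing invocations keep their meaning.
print(split_environment_config("MY_CONTAINER_NAME"))
print(split_environment_config(
    "-v /Volumes/mnt/foo:/Volumes/mnt/foo --user sambvfx MY_CONTAINER_NAME"))
```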
>
> The PR’s feedback raises some questions that some of you may have opinions
> about. A hopefully faithful summary of them and my commentary below:
>
> Should we require the environment_config be a json encoded string that
> mirrors the protobuf?
>
> e.g.
>
> --environment_config '{"image_name": "MY_CONTAINER_NAME", "run_options": "-v
> /Volumes/mnt/foo:/Volumes/mnt/foo -v /Volumes/mnt/bar:/Volumes/mnt/bar --user
> sambvfx"}'
>
> I’m not a fan due to it not being backwards compatible and difficult to
> provide to CLI. Users don’t want to type json into the shell.
>
I agree, typing JSON on the command line is really messy. But I think having
meaningful parts in the config will be easier to maintain and compare.
Can we provide a config file that is read, parsed and delivered as
options to the docker environment?
Something like "--environment_config '~/my_docker_config.json/yaml'"

I think passing a user-provided command to start docker might have security
issues, as users might mount an otherwise inaccessible drive or
access a prohibited port, etc.
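
A rough sketch of the config-file idea, assuming the JSON keys from Sam's example ("image_name" and "run_options"). The allow-list and the function name are hypothetical and only illustrate one way to limit which docker run flags a user may pass; nothing here exists in Beam.

```python
import json
import shlex

# Hypothetical allow-list of docker run flags; anything else is rejected.
ALLOWED_OPTIONS = {"-v", "--volume", "--user"}


def load_docker_config(config_text):
    """Parse a JSON docker config and return (image_name, run_options)."""
    config = json.loads(config_text)
    image = config["image_name"]
    options = shlex.split(config.get("run_options", ""))
    for token in options:
        # Conservative check: only exact matches of allowed flags pass,
        # so forms such as "--volume=/a:/b" are rejected as well.
        if token.startswith("-") and token not in ALLOWED_OPTIONS:
            raise ValueError(f"docker run option not allowed: {token}")
    return image, options


text = '{"image_name": "MY_CONTAINER_NAME", "run_options": "-v /mnt/foo:/mnt/foo --user sambvfx"}'
print(load_docker_config(text))
```

In practice the text would come from the file named by --environment_config; reading the file is left out to keep the sketch self-contained.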

> Should we not assume docker run ... is the only way to start the
> container?
>
> I think any other method would likely require further changes to the
> protobuf or a completely new one.
>
Yes, I think that makes sense. However, if we add more parameters to the
docker startup then the DockerPayload protobuf can be updated to have
those.

> Should we provide different args for mounting volume(s) and map that to
> the appropriate docker command within the beam code?
>
> This requires a lot of docker specific code to be included within beam.
>
> Any input would be appreciated.
>
> Cheers,
> Sam
>


Re: [ANNOUNCE] New committer: Robert Burke

2019-07-16 Thread Ankur Goenka
Congratulations Robert!

Go GO!

On Tue, Jul 16, 2019 at 10:34 AM Rui Wang  wrote:

> Congrats!
>
>
> -Rui
>
> On Tue, Jul 16, 2019 at 10:32 AM Udi Meiri  wrote:
>
>> Congrats Robert B.!
>>
>> On Tue, Jul 16, 2019 at 10:23 AM Ahmet Altay  wrote:
>>
>>> Hi,
>>>
>>> Please join me and the rest of the Beam PMC in welcoming a new committer:
>>> Robert Burke.
>>>
>>> Robert has been contributing to Beam and actively involved in the
>>> community for over a year. He has been actively working on Go SDK, helping
>>> users, and making it easier for others to contribute [1].
>>>
>>> In consideration of Robert's contributions, the Beam PMC trusts him with
>>> the responsibilities of a Beam committer [2].
>>>
>>> Thank you, Robert, for your contributions and looking forward to many
>>> more!
>>>
>>> Ahmet, on behalf of the Apache Beam PMC
>>>
>>> [1]
>>> https://lists.apache.org/thread.html/8f729da2d3009059d7a8b2d8624446be161700dcfa953939dd3530c6@%3Cdev.beam.apache.org%3E
>>> [2] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>
>>


Re: [ANNOUNCE] New committer: Mikhail Gryzykhin

2019-06-21 Thread Ankur Goenka
Congrats Mikhail!

On Fri, Jun 21, 2019 at 11:55 AM Tanay Tummalapalli 
wrote:

> Congratulations!
>
> On Fri, Jun 21, 2019 at 10:35 PM Rui Wang  wrote:
>
>> Congrats!
>>
>>
>> -Rui
>>
>> On Fri, Jun 21, 2019 at 9:58 AM Robin Qiu  wrote:
>>
>>> Congrats, Mikhail!
>>>
>>> On Fri, Jun 21, 2019 at 9:12 AM Alexey Romanenko <
>>> aromanenko@gmail.com> wrote:
>>>
 Congrats, Mikhail!

 On 21 Jun 2019, at 18:01, Anton Kedin  wrote:

 Congrats!

 On Fri, Jun 21, 2019 at 3:55 AM Reza Rokni  wrote:

> Congratulations!
>
> On Fri, 21 Jun 2019, 12:37 Robert Burke,  wrote:
>
>> Congrats
>>
>> On Fri, Jun 21, 2019, 12:29 PM Thomas Weise  wrote:
>>
>>> Hi,
>>>
>>> Please join me and the rest of the Beam PMC in welcoming a new
>>> committer: Mikhail Gryzykhin.
>>>
>>> Mikhail has been contributing to Beam and actively involved in the
>>> community for over a year. He developed the community build dashboard [1]
>>> and added substantial improvements to our build infrastructure. Mikhail's
>>> work also covers metrics, contributor documentation, development process
>>> improvements and other areas.
>>>
>>> In consideration of Mikhail's contributions, the Beam PMC trusts him
>>> with the responsibilities of a Beam committer [2].
>>>
>>> Thank you, Mikhail, for your contributions and looking forward to
>>> many more!
>>>
>>> Thomas, on behalf of the Apache Beam PMC
>>>
>>> [1] https://s.apache.org/beam-community-metrics
>>> [2]
>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>
>>>



Re: Contributor Registration

2019-06-20 Thread Ankur Goenka
Welcome Matt!

On Thu, Jun 20, 2019 at 12:25 PM Gleb Kanterov  wrote:

> Welcome Matt!
>
> On Thu, Jun 20, 2019 at 11:09 AM Aizhamal Nurmamat kyzy <
> aizha...@google.com> wrote:
>
>> Welcome Matt!
>>
>> On Thu, Jun 20, 2019 at 11:06 AM Robert Bradshaw 
>> wrote:
>>
>>> Welcome! I added you to the contributors group.
>>>
>>> On Thu, Jun 20, 2019 at 11:03 AM Matt Helm  wrote:
>>> >
>>> > Hi Beam community,
>>> >
>>> > I'm Matt Helm, a Data Engineer at Shopify. I'm based in Vancouver,
>>> > Canada. As part of Beam Summit, I'd like to start taking issues
>>> > from Jira. My username is matthelm.
>>> >
>>> > Thanks,
>>> > Matt
>>>
>>
>
> --
> Cheers,
> Gleb
>


Re: [PROPOSAL] Preparing for Beam 2.14.0 release

2019-06-06 Thread Ankur Goenka
+1

On Thu, Jun 6, 2019, 9:13 AM Ahmet Altay  wrote:

> +1, thank you for keeping the cadence.
>
> On Thu, Jun 6, 2019 at 9:04 AM Anton Kedin  wrote:
>
>> Hello Beam community!
>>
>> Beam 2.14 release branch cut date is June 19 according to the release
>> calendar [1]. I would like to volunteer myself to do this release. The plan
>> is to cut the branch on that date, and cherrypick fixes if needed.
>>
>> If you have release blocking issues for 2.14 please mark their "Fix
>> Version" as 2.14.0 [2]. Please use 2.15.0 release in JIRA in case you
>> would like to move any non-blocking issues to that version.
>>
>> And if we're doing a 2.7.1 release it should probably happen
>> independently and in parallel if we want to maintain the release cadence.
>>
>> Thoughts, comments, objections?
>>
>> Thanks,
>> Anton
>>
>> [1]
>> https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com
>> [2]
>> https://issues.apache.org/jira/browse/BEAM-7478?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20fixVersion%20%3D%202.14.0
>>
>


Re: [VOTE] Release 2.13.0, release candidate #2

2019-06-05 Thread Ankur Goenka
Thanks Kenn, for identifying the issue.
If no affected artifacts are published then we should be good.
Let me know if we need to make any changes in 2.13.0.

On Wed, Jun 5, 2019 at 10:53 AM Kenneth Knowles  wrote:

> Just discovered a potentially serious issue that was present during this
> RC: https://issues.apache.org/jira/browse/BEAM-7493. So far I have not
> discovered a truly user-facing impact, and example validation succeeded,
> but I wanted to alert the list.
>
> Summary: When rendering a published pom.xml, each dependency's artifactId
> is always the module path concatenated with dashes, even when that is not
> the correct artifactId. For example, sdks/java/testing/test-utils is
> resolved in the pom to beam-sdks-java-testing-test-utils. This artifact
> does not exist; the module manually sets its name to
> beam-sdks-java-test-utils, omitting the extraneous `testing` directory
> that exists only for taxonomy.
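
The mismatch described above can be illustrated with a short sketch. This is not the Gradle build logic itself; the function names and the override table are hypothetical, and the only data used is the test-utils example from the summary.

```python
def default_artifact_id(module_path, prefix="beam"):
    """Join the module path with dashes, as the pom rendering does."""
    return "-".join([prefix] + module_path.strip("/").split("/"))


# Hypothetical table of modules that set their archive name manually.
ARCHIVE_NAME_OVERRIDES = {
    "sdks/java/testing/test-utils": "beam-sdks-java-test-utils",
}


def published_artifact_id(module_path):
    """Name the module is actually published under."""
    return ARCHIVE_NAME_OVERRIDES.get(module_path, default_artifact_id(module_path))


path = "sdks/java/testing/test-utils"
# The pom refers to the first name, while only the second one is published.
print(default_artifact_id(path))
print(published_artifact_id(path))
```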
>
> There are a few other modules that manually set their archive name. From
> what I can tell, each of these is either (a) not published or (b) not
> depended upon. I am still checking.
>
> Kenn
>
>
> On Wed, Jun 5, 2019 at 9:22 AM Thomas Weise  wrote:
>
>> +1 and I think all of that can be covered with JIRA.
>>
>> Irrespective, the release manager still needs to pay attention to the
>> communication on the VOTE thread.
>>
>> On Wed, Jun 5, 2019 at 9:19 AM Ahmet Altay  wrote:
>>
>>> Checking that JIRA link sounds reasonable as long as we can agree that
>>> it is single source of truth for cherry pick requests. I also agree with
>>> Cham, requests need to come with a reason.
>>>
>>> On Wed, Jun 5, 2019 at 7:38 AM Ismaël Mejía  wrote:
>>>
>>>> I don't think we need anything fancier, or even marking some of this
>>>> stuff as Blocker. Would it not be enough just to monitor that [1] has no
>>>> issues? (Of course, if the interested party has not set the fix version
>>>> to the one under the current vote, that is a mistake.)
>>>>
>>>> [1]
>>>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20fixVersion%20%3D%202.13.0
>>>>
>>>> On Wed, Jun 5, 2019 at 4:23 PM Chamikara Jayalath 
>>>> wrote:
>>>> >
>>>> >
>>>> >
>>>> > On Tue, Jun 4, 2019 at 5:02 PM Ahmet Altay  wrote:
>>>> >>
>>>> >> I would suggest having a single way of tracking cherry-pick requests
>>>> >> to an RC. Currently we use emails on the RC thread, open PRs, and Jiras
>>>> >> tagged for the release. This is confusing the person doing the release
>>>> >> while they are juggling multiple things. How about we ask all
>>>> >> cherry-pick requests to have a JIRA filed against that release and
>>>> >> marked as blockers?
>>>> >
>>>> >
>>>> > I agree with this and with what Ankur said. Release blockers should
>>>> be more explicit and should come with a reason. After voting thread start,
>>>> I would say this should include a mail to the voting thread as well as a
>>>> blocking JIRA. Other PRs opened against the branch may or may not get
>>>> merged at the discretion of the release manager.
>>>> >
>>>> > Thanks,
>>>> > Cham
>>>> >
>>>> >>
>>>> >>
>>>> >> On Tue, Jun 4, 2019 at 1:05 PM Ankur Goenka 
>>>> wrote:
>>>> >>>
>>>> >>> That makes sense.
>>>> >>> I would also like to add that the corresponding PR should be added
>>>> to an open blocking Jira for the release to keep a single source to check.
>>>> >>>
>>>> >>> On Tue, Jun 4, 2019 at 12:15 PM Kenneth Knowles 
>>>> wrote:
>>>> >>>>
>>>> >>>> I would actually suggest that the following search needs to be
>>>> triaged to zero before cutting an RC:
>>>> https://github.com/apache/beam/pulls?utf8=%E2%9C%93=is%3Apr+is%3Aopen+base%3Arelease-2.13.0
>>>> .
>>>> >>>>
>>>> >>>> On Tue, Jun 4, 2019 at 11:17 AM Ankur Goenka 
>>>> wrote:
>>>> >>>>>
>>>> >>>>> Sorry, I missed the comment about not including the weekend in
>>>> >>>>> the 72-hour voting period.
>>>> >>>>>
>>>> >>>>> I meant to update the blog post
>>>> https://github.com/apache/beam/pull

Re: [VOTE] Release 2.13.0, release candidate #2

2019-06-04 Thread Ankur Goenka
Final few things remaining for the release
* Please review https://github.com/apache/beam/pull/8667

After which we can
* Release version finalized in JIRA (PMC help needed)
* Release version is listed at reporter.apache.org (PMC help needed)
* Promote the release.

On Tue, Jun 4, 2019 at 5:02 PM Ahmet Altay  wrote:

> I would suggest having a single way of tracking cherry-pick requests to an
> RC. Currently we use emails on the RC thread, open PRs, and Jiras tagged
> for the release. This is confusing the person doing the release while they
> are juggling multiple things. How about we ask all cherry pick requests to
> have a JIRA filed against that release and marked as blockers?
>
> On Tue, Jun 4, 2019 at 1:05 PM Ankur Goenka  wrote:
>
>> That makes sense.
>> I would also like to add that the corresponding PR should be added to an
>> open blocking Jira
>> <https://issues.apache.org/jira/projects/BEAM/versions/12345166> for the
>> release to keep a single source to check.
>>
>> On Tue, Jun 4, 2019 at 12:15 PM Kenneth Knowles  wrote:
>>
>>> I would actually suggest that the following search needs to be triaged
>>> to zero before cutting an RC:
>>> https://github.com/apache/beam/pulls?utf8=%E2%9C%93=is%3Apr+is%3Aopen+base%3Arelease-2.13.0
>>> .
>>>
>>> On Tue, Jun 4, 2019 at 11:17 AM Ankur Goenka  wrote:
>>>
>>>> Sorry, I missed the comment about not including the weekend in the
>>>> 72-hour voting period.
>>>>
>>>> I meant to update the blog post
>>>> https://github.com/apache/beam/pull/8667/files once we have finalized
>>>> the RC so that it can be consistent. Please add any comments to PR and I
>>>> can incorporate them.
>>>>
>>>> As we did not go for 3rd RC and
>>>> https://github.com/apache/beam/pull/8714 was not blocking the 2.13
>>>> release, I went with the release.
>>>>
>>>> I have released the maven artifacts for beam. So I suppose, we can not
>>>> do another RC for 2.13.0.
>>>> If we need anything urgently in 2.13 then we can do a bug fix release
>>>> 2.13.1.
>>>>
>>>>
>>>> On Tue, Jun 4, 2019 at 8:59 AM Thomas Weise  wrote:
>>>>
>>>>> This seems rushed, and things fall through the cracks.
>>>>>
>>>>> Max had requested to not include the weekend into the voting period.
>>>>>
>>>>> Valentyn: I had the same question on the first RC. The PR should be
>>>>> included into the vote for review. You can find it here:
>>>>> https://github.com/apache/beam/pull/8667/files
>>>>>
>>>>> I had requested to include following backport PR before the RC:
>>>>> https://github.com/apache/beam/pull/8714  - It's not blocking but
>>>>> would be nice if someone can merge it for any future release from this
>>>>> branch.
>>>>>
>>>>> Thanks,
>>>>> Thomas
>>>>>
>>>>>
>>>>> On Tue, Jun 4, 2019 at 1:59 AM Maximilian Michels 
>>>>> wrote:
>>>>>
>>>>>> The summary is not correct. Binding votes (in order):
>>>>>>
>>>>>> Ahmet Altay
>>>>>> Robert Bradshaw
>>>>>> Maximilian Michels
>>>>>> Jean-Baptiste Onofré
>>>>>> Lukasz Cwik
>>>>>>
>>>>>> A total of 5 binding votes.
>>>>>>
>>>>>> On 04.06.19 02:37, Ankur Goenka wrote:
>>>>>> > +1
>>>>>> > Thanks for validating the release and voting.
>>>>>> > With 0(-1), 6(+1) and 3(+1 binding) votes, I am concluding the
>>>>>> voting
>>>>>> > process.
>>>>>> > I am going ahead with the release and will keep the community
>>>>>> posted
>>>>>> > with the updates.
>>>>>> >
>>>>>> > On Mon, Jun 3, 2019 at 1:57 PM Andrew Pilloud wrote:
>>>>>> >
>>>>>> > +1 Reviewed the Nexmark java and SQL perfkit graphs, no obvious
>>>>>> > regressions over the previous release.
>>>>>> >
>>>>>> > On Mon, Jun 3, 2019 at 1:15 PM Lukasz Cwik wrote:
>>>>>> >
>>>>>> >

Re: [VOTE] Release 2.13.0, release candidate #2

2019-06-04 Thread Ankur Goenka
That makes sense.
I would also like to add that the corresponding PR should be added to an
open blocking Jira
<https://issues.apache.org/jira/projects/BEAM/versions/12345166> for the
release to keep a single source to check.

On Tue, Jun 4, 2019 at 12:15 PM Kenneth Knowles  wrote:

> I would actually suggest that the following search needs to be triaged to
> zero before cutting an RC:
> https://github.com/apache/beam/pulls?utf8=%E2%9C%93=is%3Apr+is%3Aopen+base%3Arelease-2.13.0
> .
>
> On Tue, Jun 4, 2019 at 11:17 AM Ankur Goenka  wrote:
>
>> Sorry, I missed the comment about not including the weekend in the
>> 72-hour voting period.
>>
>> I meant to update the blog post
>> https://github.com/apache/beam/pull/8667/files once we have finalized
>> the RC so that it can be consistent. Please add any comments to PR and I
>> can incorporate them.
>>
>> As we did not go for 3rd RC and https://github.com/apache/beam/pull/8714 was
>> not blocking the 2.13 release, I went with the release.
>>
>> I have released the maven artifacts for beam. So I suppose, we can not do
>> another RC for 2.13.0.
>> If we need anything urgently in 2.13 then we can do a bug fix release
>> 2.13.1.
>>
>>
>> On Tue, Jun 4, 2019 at 8:59 AM Thomas Weise  wrote:
>>
>>> This seems rushed, and things fall through the cracks.
>>>
>>> Max had requested to not include the weekend into the voting period.
>>>
>>> Valentyn: I had the same question on the first RC. The PR should be
>>> included into the vote for review. You can find it here:
>>> https://github.com/apache/beam/pull/8667/files
>>>
>>> I had requested to include following backport PR before the RC:
>>> https://github.com/apache/beam/pull/8714  - It's not blocking but would
>>> be nice if someone can merge it for any future release from this branch.
>>>
>>> Thanks,
>>> Thomas
>>>
>>>
>>> On Tue, Jun 4, 2019 at 1:59 AM Maximilian Michels 
>>> wrote:
>>>
>>>> The summary is not correct. Binding votes (in order):
>>>>
>>>> Ahmet Altay
>>>> Robert Bradshaw
>>>> Maximilian Michels
>>>> Jean-Baptiste Onofré
>>>> Lukasz Cwik
>>>>
>>>> A total of 5 binding votes.
>>>>
>>>> On 04.06.19 02:37, Ankur Goenka wrote:
>>>> > +1
>>>> > Thanks for validating the release and voting.
>>>> > With 0(-1), 6(+1) and 3(+1 binding) votes, I am concluding the voting
>>>> > process.
>>>> > I am going ahead with the release and will keep the community posted
>>>> > with the updates.
>>>> >
>>>> > On Mon, Jun 3, 2019 at 1:57 PM Andrew Pilloud wrote:
>>>> >
>>>> > +1 Reviewed the Nexmark java and SQL perfkit graphs, no obvious
>>>> > regressions over the previous release.
>>>> >
>>>> > On Mon, Jun 3, 2019 at 1:15 PM Lukasz Cwik wrote:
>>>> >
>>>> > Thanks for the clarification.
>>>> >
>>>> > On Mon, Jun 3, 2019 at 11:40 AM Ankur Goenka wrote:
>>>> >
>>>> > Yes, I meant I will close the voting at 5pm and start the
>>>> > release process.
>>>> >
>>>> > On Mon, Jun 3, 2019, 10:59 AM Lukasz Cwik wrote:
>>>> >
>>>> > Ankur, did you mean to say you're going to close the vote
>>>> > today at 5pm? (and then complete the release afterwards)
>>>> >
>>>> > On Mon, Jun 3, 2019 at 10:54 AM Ankur Goenka wrote:
>>>> >
>>>> > Thanks for validating and voting.
>>>> >
>>>> > We have 4 binding votes.
>>>> > I will complete the release today 5PM. Please raise
>>>> > any concerns before that.
>>>> >
>>>> > Thanks,
>>>> >   

Re: [VOTE] Release 2.13.0, release candidate #2

2019-06-04 Thread Ankur Goenka
Sorry, I missed the comment about not including the weekend in the 72-hour
voting period.

I meant to update the blog post
https://github.com/apache/beam/pull/8667/files once we have finalized the
RC so that it can be consistent. Please add any comments to the PR and I can
incorporate them.

As we did not go for a 3rd RC and https://github.com/apache/beam/pull/8714 was
not blocking the 2.13 release, I went with the release.

I have released the maven artifacts for Beam, so I suppose we cannot do
another RC for 2.13.0.
If we need anything urgently in 2.13 then we can do a bug fix release
2.13.1.


On Tue, Jun 4, 2019 at 8:59 AM Thomas Weise  wrote:

> This seems rushed, and things fall through the cracks.
>
> Max had requested to not include the weekend into the voting period.
>
> Valentyn: I had the same question on the first RC. The PR should be
> included into the vote for review. You can find it here:
> https://github.com/apache/beam/pull/8667/files
>
> I had requested to include following backport PR before the RC:
> https://github.com/apache/beam/pull/8714  - It's not blocking but would
> be nice if someone can merge it for any future release from this branch.
>
> Thanks,
> Thomas
>
>
> On Tue, Jun 4, 2019 at 1:59 AM Maximilian Michels  wrote:
>
>> The summary is not correct. Binding votes (in order):
>>
>> Ahmet Altay
>> Robert Bradshaw
>> Maximilian Michels
>> Jean-Baptiste Onofré
>> Lukasz Cwik
>>
>> A total of 5 binding votes.
>>
>> On 04.06.19 02:37, Ankur Goenka wrote:
>> > +1
>> > Thanks for validating the release and voting.
>> > With 0(-1), 6(+1) and 3(+1 binding) votes, I am concluding the voting
>> > process.
>> > I am going ahead with the release and will keep the community posted
>> > with the updates.
>> >
>> > On Mon, Jun 3, 2019 at 1:57 PM Andrew Pilloud wrote:
>> >
>> > +1 Reviewed the Nexmark java and SQL perfkit graphs, no obvious
>> > regressions over the previous release.
>> >
>> > On Mon, Jun 3, 2019 at 1:15 PM Lukasz Cwik wrote:
>> >
>> > Thanks for the clarification.
>> >
>> > On Mon, Jun 3, 2019 at 11:40 AM Ankur Goenka wrote:
>> >
>> > Yes, I meant I will close the voting at 5pm and start the
>> > release process.
>> >
>> > On Mon, Jun 3, 2019, 10:59 AM Lukasz Cwik wrote:
>> >
>> > Ankur, did you mean to say you're going to close the vote
>> > today at 5pm? (and then complete the release afterwards)
>> >
>> > On Mon, Jun 3, 2019 at 10:54 AM Ankur Goenka wrote:
>> >
>> > Thanks for validating and voting.
>> >
>> > We have 4 binding votes.
>> > I will complete the release today 5PM. Please raise
>> > any concerns before that.
>> >
>> > Thanks,
>> > Ankur
>> >
>> > On Mon, Jun 3, 2019 at 8:36 AM Lukasz Cwik wrote:
>> >
>> > Since the gearpump issue has been ongoing since
>> > 2.10, I can't consider it a blocker for this
>> > release and am voting +1.
>> >
>> > On Mon, Jun 3, 2019 at 7:13 AM Jean-Baptiste Onofré wrote:
>> >
>> > +1 (binding)
>> >
>> > Quickly tested on beam-samples.
>> >
>> > Regards
>> > JB
>> >
>> > On 31/05/2019 04:52, Ankur Goenka wrote:
>> >  > Hi everyone,
>> >  >
>> >  > Please review and vote on the release
>> > candidate #2 for the version
>> >  > 2.13.0, as follows:
>> >  >
>> >   

Re: [VOTE] Release 2.13.0, release candidate #2

2019-06-03 Thread Ankur Goenka
+1
Thanks for validating the release and voting.
With 0(-1), 6(+1) and 3(+1 binding) votes, I am concluding the voting
process.
I am going ahead with the release and will keep the community posted with
the updates.

On Mon, Jun 3, 2019 at 1:57 PM Andrew Pilloud  wrote:

> +1 Reviewed the Nexmark java and SQL perfkit graphs, no obvious
> regressions over the previous release.
>
> On Mon, Jun 3, 2019 at 1:15 PM Lukasz Cwik  wrote:
>
>> Thanks for the clarification.
>>
>> On Mon, Jun 3, 2019 at 11:40 AM Ankur Goenka  wrote:
>>
>>> Yes, I meant I will close the voting at 5pm and start the release
>>> process.
>>>
>>> On Mon, Jun 3, 2019, 10:59 AM Lukasz Cwik  wrote:
>>>
>>>> Ankur, did you mean to say you're going to close the vote today at 5pm?
>>>> (and then complete the release afterwards)
>>>>
>>>> On Mon, Jun 3, 2019 at 10:54 AM Ankur Goenka  wrote:
>>>>
>>>>> Thanks for validating and voting.
>>>>>
>>>>> We have 4 binding votes.
>>>>> I will complete the release today 5PM. Please raise any concerns
>>>>> before that.
>>>>>
>>>>> Thanks,
>>>>> Ankur
>>>>>
>>>>> On Mon, Jun 3, 2019 at 8:36 AM Lukasz Cwik  wrote:
>>>>>
>>>>>> Since the gearpump issue has been ongoing since 2.10, I can't
>>>>>> consider it a blocker for this release and am voting +1.
>>>>>>
>>>>>> On Mon, Jun 3, 2019 at 7:13 AM Jean-Baptiste Onofré 
>>>>>> wrote:
>>>>>>
>>>>>>> +1 (binding)
>>>>>>>
>>>>>>> Quickly tested on beam-samples.
>>>>>>>
>>>>>>> Regards
>>>>>>> JB
>>>>>>>
>>>>>>> On 31/05/2019 04:52, Ankur Goenka wrote:
>>>>>>> > Hi everyone,
>>>>>>> >
>>>>>>> > Please review and vote on the release candidate #2 for the version
>>>>>>> > 2.13.0, as follows:
>>>>>>> >
>>>>>>> > [ ] +1, Approve the release
>>>>>>> > [ ] -1, Do not approve the release (please provide specific
>>>>>>> comments)
>>>>>>> >
>>>>>>> > The complete staging area is available for your review, which
>>>>>>> includes:
>>>>>>> > * JIRA release notes [1],
>>>>>>> > * the official Apache source release to be deployed to
>>>>>>> dist.apache.org
>>>>>>> > <http://dist.apache.org> [2], which is signed with the key with
>>>>>>> > fingerprint 6356C1A9F089B0FA3DE8753688934A6699985948 [3],
>>>>>>> > * all artifacts to be deployed to the Maven Central Repository [4],
>>>>>>> > * source code tag "v2.13.0-RC2" [5],
>>>>>>> > * website pull request listing the release [6] and publishing the
>>>>>>> API
>>>>>>> > reference manual [7].
>>>>>>> > * Python artifacts are deployed along with the source release to
>>>>>>> the
>>>>>>> > dist.apache.org <http://dist.apache.org> [2].
>>>>>>> > * Validation sheet with a tab for 2.13.0 release to help with
>>>>>>> validation
>>>>>>> > [8].
>>>>>>> >
>>>>>>> > The vote will be open for at least 72 hours. It is adopted by
>>>>>>> majority
>>>>>>> > approval, with at least 3 PMC affirmative votes.
>>>>>>> >
>>>>>>> > Thanks,
>>>>>>> > Ankur
>>>>>>> >
>>>>>>> > [1]
>>>>>>> >
>>>>>>> https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12345166
>>>>>>> > [2] https://dist.apache.org/repos/dist/dev/beam/2.13.0/
>>>>>>> > [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>>>>>>> > [4]
>>>>>>> https://repository.apache.org/content/repositories/orgapachebeam-1070/
>>>>>>> > [5] https://github.com/apache/beam/tree/v2.13.0-RC2
>>>>>>> > [6] https://github.com/apache/beam/pull/8645
>>>>>>> > [7] https://github.com/apache/beam-site/pull/589
>>>>>>> > [8]
>>>>>>> >
>>>>>>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1031196952
>>>>>>>
>>>>>>> --
>>>>>>> Jean-Baptiste Onofré
>>>>>>> jbono...@apache.org
>>>>>>> http://blog.nanthrax.net
>>>>>>> Talend - http://www.talend.com
>>>>>>>
>>>>>>


Re: [VOTE] Release 2.13.0, release candidate #2

2019-06-03 Thread Ankur Goenka
Yes, I meant I will close the voting at 5pm and start the release process.

On Mon, Jun 3, 2019, 10:59 AM Lukasz Cwik  wrote:

> Ankur, did you mean to say you're going to close the vote today at 5pm? (and
> then complete the release afterwards)
>
> On Mon, Jun 3, 2019 at 10:54 AM Ankur Goenka  wrote:
>
>> Thanks for validating and voting.
>>
>> We have 4 binding votes.
>> I will complete the release today 5PM. Please raise any concerns before
>> that.
>>
>> Thanks,
>> Ankur
>>
>> On Mon, Jun 3, 2019 at 8:36 AM Lukasz Cwik  wrote:
>>
>>> Since the gearpump issue has been ongoing since 2.10, I can't consider
>>> it a blocker for this release and am voting +1.
>>>
>>> On Mon, Jun 3, 2019 at 7:13 AM Jean-Baptiste Onofré 
>>> wrote:
>>>
>>>> +1 (binding)
>>>>
>>>> Quickly tested on beam-samples.
>>>>
>>>> Regards
>>>> JB
>>>>
>>>> On 31/05/2019 04:52, Ankur Goenka wrote:
>>>> > Hi everyone,
>>>> >
>>>> > Please review and vote on the release candidate #2 for the version
>>>> > 2.13.0, as follows:
>>>> >
>>>> > [ ] +1, Approve the release
>>>> > [ ] -1, Do not approve the release (please provide specific comments)
>>>> >
>>>> > The complete staging area is available for your review, which
>>>> includes:
>>>> > * JIRA release notes [1],
>>>> > * the official Apache source release to be deployed to
>>>> dist.apache.org
>>>> > <http://dist.apache.org> [2], which is signed with the key with
>>>> > fingerprint 6356C1A9F089B0FA3DE8753688934A6699985948 [3],
>>>> > * all artifacts to be deployed to the Maven Central Repository [4],
>>>> > * source code tag "v2.13.0-RC2" [5],
>>>> > * website pull request listing the release [6] and publishing the API
>>>> > reference manual [7].
>>>> > * Python artifacts are deployed along with the source release to the
>>>> > dist.apache.org <http://dist.apache.org> [2].
>>>> > * Validation sheet with a tab for 2.13.0 release to help with
>>>> validation
>>>> > [8].
>>>> >
>>>> > The vote will be open for at least 72 hours. It is adopted by majority
>>>> > approval, with at least 3 PMC affirmative votes.
>>>> >
>>>> > Thanks,
>>>> > Ankur
>>>> >
>>>> > [1]
>>>> >
>>>> https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12345166
>>>> > [2] https://dist.apache.org/repos/dist/dev/beam/2.13.0/
>>>> > [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>>>> > [4]
>>>> https://repository.apache.org/content/repositories/orgapachebeam-1070/
>>>> > [5] https://github.com/apache/beam/tree/v2.13.0-RC2
>>>> > [6] https://github.com/apache/beam/pull/8645
>>>> > [7] https://github.com/apache/beam-site/pull/589
>>>> > [8]
>>>> >
>>>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1031196952
>>>>
>>>> --
>>>> Jean-Baptiste Onofré
>>>> jbono...@apache.org
>>>> http://blog.nanthrax.net
>>>> Talend - http://www.talend.com
>>>>
>>>


Re: [VOTE] Release 2.13.0, release candidate #2

2019-06-03 Thread Ankur Goenka
Thanks for validating and voting.

We have 4 binding votes.
I will complete the release today 5PM. Please raise any concerns before
that.

Thanks,
Ankur

On Mon, Jun 3, 2019 at 8:36 AM Lukasz Cwik  wrote:

> Since the gearpump issue has been ongoing since 2.10, I can't consider it
> a blocker for this release and am voting +1.
>
> On Mon, Jun 3, 2019 at 7:13 AM Jean-Baptiste Onofré 
> wrote:
>
>> +1 (binding)
>>
>> Quickly tested on beam-samples.
>>
>> Regards
>> JB
>>
>> On 31/05/2019 04:52, Ankur Goenka wrote:
>> > Hi everyone,
>> >
>> > Please review and vote on the release candidate #2 for the version
>> > 2.13.0, as follows:
>> >
>> > [ ] +1, Approve the release
>> > [ ] -1, Do not approve the release (please provide specific comments)
>> >
>> > The complete staging area is available for your review, which includes:
>> > * JIRA release notes [1],
>> > * the official Apache source release to be deployed to dist.apache.org
>> > <http://dist.apache.org> [2], which is signed with the key with
>> > fingerprint 6356C1A9F089B0FA3DE8753688934A6699985948 [3],
>> > * all artifacts to be deployed to the Maven Central Repository [4],
>> > * source code tag "v2.13.0-RC2" [5],
>> > * website pull request listing the release [6] and publishing the API
>> > reference manual [7].
>> > * Python artifacts are deployed along with the source release to the
>> > dist.apache.org <http://dist.apache.org> [2].
>> > * Validation sheet with a tab for 2.13.0 release to help with validation
>> > [8].
>> >
>> > The vote will be open for at least 72 hours. It is adopted by majority
>> > approval, with at least 3 PMC affirmative votes.
>> >
>> > Thanks,
>> > Ankur
>> >
>> > [1]
>> >
>> https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12345166
>> > [2] https://dist.apache.org/repos/dist/dev/beam/2.13.0/
>> > [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>> > [4]
>> https://repository.apache.org/content/repositories/orgapachebeam-1070/
>> > [5] https://github.com/apache/beam/tree/v2.13.0-RC2
>> > [6] https://github.com/apache/beam/pull/8645
>> > [7] https://github.com/apache/beam-site/pull/589
>> > [8]
>> >
>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1031196952
>>
>> --
>> Jean-Baptiste Onofré
>> jbono...@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>


Re: [DISCUSS] Cookbooks for users with knowledge in other frameworks

2019-06-01 Thread Ankur Goenka
+1 for the proposal.
The Compatibility Matrix can be a good place to showcase parity between
different runners.
Do you think we should write two-way examples, [Spark, Flink, ..] <=> Beam?



On Sat, Jun 1, 2019 at 4:31 PM Reza Rokni  wrote:

> For layer 1, what about working through this link as a starting point :
> https://spark.apache.org/docs/latest/rdd-programming-guide.html#transformations
> ?
>
> On Sat, 1 Jun 2019 at 09:21, Ahmet Altay  wrote:
>
>> Thank you Reza. That separation makes sense to me.
>>
>> On Wed, May 29, 2019 at 6:26 PM Reza Rokni  wrote:
>>
>>> +1
>>>
>>> I think there will be at least two layers of this;
>>>
>>> Layer 1 - Using primitives : I do join, GBK, Aggregation... with system
>>> x this way, what is the canonical equivalent in Beam.
>>> Layer 2 - Patterns : I read and join Unbounded and Bounded Data in
>>> system x this way, what is the canonical equivalent in Beam.
>>>
>>> I suspect that, as a first pass, Layer 1 is reasonably well-bounded work. There
>>> would need to be agreement on a "canonical" version of how to do something in
>>> Beam, since there are often a multitude of ways of doing x and any single
>>> choice could be seen as opinionated.
>>>
>>
>> Once we identify a set of layer 1 items, we could crowd source the
>> canonical implementations. I believe we can use our usual code review
>> process to settle on a version that is agreeable. (Examples have the same
>> issue, they are probably opinionated today based on the author but it works
>> out.)
>>
>>
>>>
>>>
>>> On Thu, 30 May 2019 at 08:56, Ahmet Altay  wrote:
>>>
>>>> Hi all,
>>>>
>>>> Inspired by the user asking about a Spark feature in Beam [1] in the
>>>> release thread, I searched the user@ list and noticed a few instances
>>>> of people asking questions like "I can do X in Spark, how can I do that
>>>> in Beam?" Would it make sense to add documentation that explains how certain
>>>> tasks can be accomplished in Beam, with side-by-side examples of doing
>>>> the same task in Beam/Spark etc.? It could help with on-boarding because it
>>>> will be easier for people to leverage their existing knowledge. It could
>>>> also help other frameworks, because it will serve as a Rosetta
>>>> stone with two translations.
>>>>
>>>> Questions I have are:
>>>> - Would such a thing be helpful?
>>>> - Is it feasible? Would a few pages' worth of examples cover enough
>>>> use cases?
>>>>
>>>> Thank you!
>>>> Ahmet
>>>>
>>>> [1]
>>>> https://lists.apache.org/thread.html/b73a54aa1e6e9933628f177b04a8f907c26cac854745fa081c478eff@%3Cdev.beam.apache.org%3E
>>>>
>>>
>>>
>>> --
>>>
>>> This email may be confidential and privileged. If you received this
>>> communication by mistake, please don't forward it to anyone else, please
>>> erase all copies and attachments, and please let me know that it has gone
>>> to the wrong person.
>>>
>>> The above terms reflect a potential business arrangement, are provided
>>> solely as a basis for further discussion, and are not intended to be and do
>>> not constitute a legally binding obligation. No legally binding obligations
>>> will be created, implied, or inferred until an agreement in final form is
>>> executed in writing by all parties involved.
>>>
>>
>
> --
>
> This email may be confidential and privileged. If you received this
> communication by mistake, please don't forward it to anyone else, please
> erase all copies and attachments, and please let me know that it has gone
> to the wrong person.
>
> The above terms reflect a potential business arrangement, are provided
> solely as a basis for further discussion, and are not intended to be and do
> not constitute a legally binding obligation. No legally binding obligations
> will be created, implied, or inferred until an agreement in final form is
> executed in writing by all parties involved.
>


Re: [VOTE] Release 2.13.0, release candidate #2

2019-06-01 Thread Ankur Goenka
Thanks Ahmet and Luke for the validation.

If no one objects, I plan to move ahead without Gearpump validation, as it
seems to have been broken for the past several releases.

Reminder: voting closes on 2nd June, so please validate and vote by then.

On Fri, May 31, 2019 at 10:43 AM Ahmet Altay  wrote:

> +1
>
> I validated python 2 quickstarts.
>
> On Fri, May 31, 2019 at 10:22 AM Lukasz Cwik  wrote:
>
>> I did the Java local quickstart for all the runners in the release
>> validation sheet and gearpump failed for me due to a missing dependency.
>> Even after I fixed up the dependency, the pipeline then got stuck. I filed
>> BEAM-7467 with all the details.
>>
>> Note that I tried the quickstart for 2.8.0 through 2.12.0
>> 2.8.0 and 2.9.0 failed due to a timeout (maybe I was using the wrong
>> command but this test[1] suggests that I was using a correct one)
>> 2.10.0 and higher fail due to the missing gs-collections dependency.
>>
>> Manu, could you help figure out what is going on?
>>
>> 1:
>> https://github.com/apache/beam/blob/2d3bcdc542536037c3e657a8b00ebc222487476b/release/src/main/groovy/quickstart-java-gearpump.groovy#L33
>>
>> On Thu, May 30, 2019 at 7:53 PM Ankur Goenka  wrote:
>>
>>> Hi everyone,
>>>
>>> Please review and vote on the release candidate #2 for the version
>>> 2.13.0, as follows:
>>>
>>> [ ] +1, Approve the release
>>> [ ] -1, Do not approve the release (please provide specific comments)
>>>
>>> The complete staging area is available for your review, which includes:
>>> * JIRA release notes [1],
>>> * the official Apache source release to be deployed to dist.apache.org
>>> [2], which is signed with the key with fingerprint
>>> 6356C1A9F089B0FA3DE8753688934A6699985948 [3],
>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>> * source code tag "v2.13.0-RC2" [5],
>>> * website pull request listing the release [6] and publishing the API
>>> reference manual [7].
>>> * Python artifacts are deployed along with the source release to the
>>> dist.apache.org [2].
>>> * Validation sheet with a tab for 2.13.0 release to help with validation
>>> [8].
>>>
>>> The vote will be open for at least 72 hours. It is adopted by majority
>>> approval, with at least 3 PMC affirmative votes.
>>>
>>> Thanks,
>>> Ankur
>>>
>>> [1]
>>> https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12345166
>>> [2] https://dist.apache.org/repos/dist/dev/beam/2.13.0/
>>> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>>> [4]
>>> https://repository.apache.org/content/repositories/orgapachebeam-1070/
>>> [5] https://github.com/apache/beam/tree/v2.13.0-RC2
>>> [6] https://github.com/apache/beam/pull/8645
>>> [7] https://github.com/apache/beam-site/pull/589
>>> [8]
>>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1031196952
>>>
>>


Re: 1 Million Lines of Code (1 MLOC)

2019-05-31 Thread Ankur Goenka
Thanks for sharing.
These are really interesting metrics.
One use I can see is to track LOC vs Comments to make sure that we keep up
with the practice of writing maintainable code.

On Fri, May 31, 2019 at 3:04 PM Ismaël Mejía  wrote:

> I was checking some metrics in our codebase and found by chance that
> we have passed the 1 million lines of code (MLOC). Of course lines of
> code may not matter much but anyway it is interesting to see the size
> of our project at this moment.
>
> This is the detailed information returned by loc [1]:
>
>
> ------------------------------------------------------------
>  Language      Files     Lines     Blank   Comment      Code
> ------------------------------------------------------------
>  Java           3681    673007     78265    140753    453989
>  Python          497    131082     22560     13378     95144
>  Go              333    105775     13681     11073     81021
>  Markdown        205     31989      6526         0     25463
>  Plain Text       11     21979      6359         0     15620
>  Sass             92      9867      1434      1900      6533
>  JavaScript       19      5157      1197       467      3493
>  YAML             14      4601       454      1104      3043
>  Bourne Shell     30      3874       470      1028      2376
>  Protobuf         17      4258       677      1373      2208
>  XML              17      2789       296       559      1934
>  Kotlin           19      3501       347      1370      1784
>  HTML             60      2447       148       914      1385
>  Batch             3       249        57         0       192
>  INI               1       206        21        16       169
>  C++               2        72         4        36        32
>  Autoconf          1        21         1        16         4
> ------------------------------------------------------------
>  Total          5002   1000874    132497    173987    694390
> ------------------------------------------------------------
>
> [1] https://github.com/cgag/loc
>
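
The idea of tracking comments against code can be sketched from the loc
figures quoted above. This is only an illustration: the LOC dictionary, the
comment_ratio helper, and the choice of comment lines divided by code lines as
the metric are assumptions, not something loc itself reports. Only the three
largest languages and the total row are included, and each row is read as
(files, lines, blank, comment, code).

```python
# Rows copied from the loc report: (files, lines, blank, comment, code).
LOC = {
    "Java": (3681, 673007, 78265, 140753, 453989),
    "Python": (497, 131082, 22560, 13378, 95144),
    "Go": (333, 105775, 13681, 11073, 81021),
}
TOTAL = (5002, 1000874, 132497, 173987, 694390)


def comment_ratio(row):
    """Return comment lines per line of code for one report row."""
    _files, _lines, _blank, comment, code = row
    return comment / code if code else 0.0


def is_consistent(row):
    """Check that lines == blank + comment + code for one report row."""
    _files, lines, blank, comment, code = row
    return lines == blank + comment + code


if __name__ == "__main__":
    for language, row in LOC.items():
        print(f"{language:<8} {comment_ratio(row):.3f}")
    print(f"{'Total':<8} {comment_ratio(TOTAL):.3f}")
```

Recording this ratio per release would show whether comments keep pace with
code as the codebase grows.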


[VOTE] Release 2.13.0, release candidate #2

2019-05-30 Thread Ankur Goenka
Hi everyone,

Please review and vote on the release candidate #2 for the version 2.13.0,
as follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org [2],
which is signed with the key with fingerprint
6356C1A9F089B0FA3DE8753688934A6699985948 [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "v2.13.0-RC2" [5],
* website pull request listing the release [6] and publishing the API
reference manual [7].
* Python artifacts are deployed along with the source release to the
dist.apache.org [2].
* Validation sheet with a tab for 2.13.0 release to help with validation
[8].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Ankur

[1]
https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12345166
[2] https://dist.apache.org/repos/dist/dev/beam/2.13.0/
[3] https://dist.apache.org/repos/dist/release/beam/KEYS
[4] https://repository.apache.org/content/repositories/orgapachebeam-1070/
[5] https://github.com/apache/beam/tree/v2.13.0-RC2
[6] https://github.com/apache/beam/pull/8645
[7] https://github.com/apache/beam-site/pull/589
[8]
https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1031196952
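
The approval rule stated above (majority approval, with at least 3 PMC
affirmative votes) can be sketched as a small tally function. This is only an
illustration: the Vote class, the release_vote_passes name, and the reading of
"majority" as more binding +1 votes than binding -1 votes are assumptions, not
part of any Beam release tooling. The 72-hour minimum voting period is not
modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str     # hypothetical label for whoever cast the vote
    value: int     # +1 to approve, -1 to reject
    binding: bool  # True when the vote comes from a PMC member


def release_vote_passes(votes, min_binding_approvals=3):
    """Return True when the binding votes satisfy the stated rule.

    Assumption: only binding votes count toward the majority, and the
    release needs at least `min_binding_approvals` binding +1 votes.
    """
    binding_plus = sum(1 for v in votes if v.binding and v.value > 0)
    binding_minus = sum(1 for v in votes if v.binding and v.value < 0)
    return binding_plus >= min_binding_approvals and binding_plus > binding_minus
```

With the tally later reported for this vote (3 binding +1, 6 other +1, and no
-1 votes), the sketch returns True.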


Re: Proposal: Portability SDKHarness Docker Image Release with Beam Version Release.

2019-05-29 Thread Ankur Goenka
I agree. I think there are a few things which have to be thought through as
part of the portable image release.

* Where to host the images. We can of course have an alias for the image
which can point to a different location, but the hosting location has to be
sorted out.
* Validation process for the images.
* Backward compatibility for the images. Though we can just tag them with
the release name.

I might not have immediate bandwidth to do this, so we need to prioritize it
against the other items we have.


On Tue, May 28, 2019 at 5:24 PM Ahmet Altay  wrote:

> Could we first figure out the process (where to push, how to push,
> permissions needed, how to validate etc.) as part of the snapshots and
> update the release guide based on that?
>
> On Tue, May 28, 2019 at 2:43 AM Robert Bradshaw 
> wrote:
>
>> In the future (read, next release) the SDK will likely have reference
>> to the containers, so this will have to be part of the release.
>
>
> Who is working on this change? Could they help with figuring out the
> publishing the containers part?
>
>
>> But I
>> agree for 2.13 it should be more about figuring out the process and
>> not necessarily holding back.
>>
>> On Mon, May 27, 2019 at 7:42 PM Ankur Goenka  wrote:
>> >
>> > +1
>> > We can release the images with 2.13 but we should not block 2.13
>> release for this.
>> >
>> > On Mon, May 27, 2019, 8:39 AM Thomas Weise  wrote:
>> >>
>> >> +1
>> >>
>> >>
>> >> On Mon, May 27, 2019 at 6:56 AM Ismaël Mejía 
>> wrote:
>> >>>
>> >>> +1
>> >>>
>> >>> On Mon, May 27, 2019 at 3:35 PM Maximilian Michels 
>> wrote:
>> >>> >
>> >>> > +1
>> >>> >
>> >>> > On 27.05.19 14:04, Robert Bradshaw wrote:
>> >>> > > Sounds like everyone's onboard with the plan. Any chance we could
>> >>> > > publish these for the upcoming 2.13 release?
>> >>> > >
>> >>> > > On Wed, Feb 6, 2019 at 6:29 PM Łukasz Gajowy 
>> wrote:
>> >>> > >>
>> >>> > >> +1 to have a registry for images accessible to anyone. For
>> snapshot images, I agree that gcr + apache-beam-testing project seems a
>> good and easy way to start with.
>> >>> > >>
>> >>> > >> Łukasz
>> >>> > >>
>> >>> > >> wt., 22 sty 2019 o 19:43 Mark Liu 
>> napisał(a):
>> >>> > >>>
>> >>> > >>> +1 to have an official Beam released container image.
>> >>> > >>>
>> >>> > >>> Also I would propose to add a verification step to (or after)
>> the release process to do smoke check. Python have ValidatesContainer test
>> that runs basic pipeline using newly built container for verification.
>> Other sdk languages can do similar thing or add a common framework.
>> >>> > >>>
>> >>> > >>> Mark
>> >>> > >>>
>> >>> > >>> On Thu, Jan 17, 2019 at 5:56 AM Alan Myrvold <
>> amyrv...@google.com> wrote:
>> >>> > >>>>
>> >>> > >>>> +1 This would be great. gcr.io seems like a good option for
>> snapshots due to the permissions from jenkins to upload and ability to keep
>> snapshots around.
>> >>> > >>>>
>> >>> > >>>> On Wed, Jan 16, 2019 at 6:51 PM Ruoyun Huang <
>> ruo...@google.com> wrote:
>> >>> > >>>>>
>> >>> > >>>>> +1 This would be a great thing to have.
>> >>> > >>>>>
>> >>> > >>>>> On Wed, Jan 16, 2019 at 6:11 PM Ankur Goenka <
>> goe...@google.com> wrote:
>> >>> > >>>>>>
>> >>> > >>>>>> gcr.io seems to be a good option. Given that we don't need
>> the hosting server name in the image name makes it easily changeable later.
>> >>> > >>>>>>
>> >>> > >>>>>> Docker container for Apache Flink is named "flink" and they
>> have different tags for different releases and configurations
>> https://hub.docker.com/_/flink .We can follow a similar model and can
>> name the image as "beam" (beam doesn't seem to be taken on docker hub) and
>> use tags to distinguish Java/Python/Go a

Re: [VOTE] Release 2.13.0, release candidate #1

2019-05-29 Thread Ankur Goenka
Thanks Valentyn for providing the fix for the BQ blocker.
All other cherry-picks mentioned in the mail thread are also done.
I will start RC2 now.

Thanks,
Ankur

On Wed, May 29, 2019 at 2:47 PM Valentyn Tymofieiev 
wrote:

> Hi Vasiullah,
>
> I am not aware of such function. I suggest that you start a new thread on
> u...@beam.apache.org mailing list for this question and describe your
> use-case there.
>
> On Wed, May 29, 2019 at 2:39 PM Vasiullah syed  wrote:
>
>> Hello All,
>>
>>
>>   Can anyone please help me with a built-in function
>> that exists in Apache Spark under the name monotonically increasing id? Is
>> there anything similar in Apache Beam? If so, please reply with more
>> detail. Thanks in advance.
>>
>> On Tue, May 28, 2019 at 9:49 PM Valentyn Tymofieiev 
>> wrote:
>>
>>> -1.
>>> I would like us to fix
>>> https://issues.apache.org/jira/browse/BEAM-7439 for 2.13.0. It is a
>>> regression that happened in 2.12.0, but was not caught by existing tests.
>>>
>>> Thanks,
>>> Valentyn
>>>
>>> On Wed, May 22, 2019, 4:30 PM Ankur Goenka  wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> Please review and vote on the release candidate #1 for the version
>>>> 2.13.0, as follows:
>>>>
>>>> [ ] +1, Approve the release
>>>> [ ] -1, Do not approve the release (please provide specific comments)
>>>>
>>>> The complete staging area is available for your review, which includes:
>>>> * JIRA release notes [1],
>>>> * the official Apache source release to be deployed to dist.apache.org
>>>> [2], which is signed with the key with fingerprint
>>>> 6356C1A9F089B0FA3DE8753688934A6699985948 [3],
>>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>>> * source code tag "v2.13.0-RC1" [5],
>>>> * website pull request listing the release [6] and publishing the API
>>>> reference manual [7].
>>>> * Python artifacts are deployed along with the source release to the
>>>> dist.apache.org [2].
>>>> * Validation sheet with a tab for 2.13.0 release to help with
>>>> validation [8].
>>>>
>>>> The vote will be open for at least 72 hours. It is adopted by majority
>>>> approval, with at least 3 PMC affirmative votes.
>>>>
>>>> Thanks,
>>>> Ankur
>>>>
>>>> [1]
>>>> https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12345166
>>>> [2] https://dist.apache.org/repos/dist/dev/beam/2.13.0/
>>>> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>>>> [4]
>>>> https://repository.apache.org/content/repositories/orgapachebeam-1069/
>>>> [5] https://github.com/apache/beam/tree/v2.13.0-RC1
>>>> [6] https://github.com/apache/beam/pull/8645
>>>> [7] https://github.com/apache/beam-site/pull/589
>>>> [8]
>>>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1031196952
>>>>
>>>


Re: [VOTE] Release 2.13.0, release candidate #1

2019-05-28 Thread Ankur Goenka
Open cherry pick PRs for spark runner
https://github.com/apache/beam/pull/8705
https://github.com/apache/beam/pull/8706

On Tue, May 28, 2019 at 3:42 PM Valentyn Tymofieiev 
wrote:

> Yes, looking into that.
>
> On Tue, May 28, 2019 at 3:37 PM Ankur Goenka  wrote:
>
>> Valentyn, Can you please send the cherry pick PR for
>> https://issues.apache.org/jira/browse/BEAM-7439
>>
>> On Tue, May 28, 2019 at 3:04 PM Ankur Goenka  wrote:
>>
>>> Sure, I will cherry pick those PRs.
>>>
>>> On Tue, May 28, 2019 at 2:19 PM Kyle Weaver  wrote:
>>>
>>>> Hi Ankur,
>>>>
>>>> It's not a blocker, but I'd like to see
>>>> https://github.com/apache/beam/pull/8558 and
>>>> https://github.com/apache/beam/pull/8569 be included so TFX examples
>>>> can be run without errors on the 2.13.0 Spark runner (
>>>> https://github.com/tensorflow/tfx/pull/84).
>>>>
>>>> Kyle Weaver | Software Engineer | github.com/ibzib |
>>>> kcwea...@google.com | +1650203
>>>>
>>>>
>>>> On Tue, May 28, 2019 at 11:53 AM Ankur Goenka 
>>>> wrote:
>>>>
>>>>> Thanks for the validation.
>>>>>
>>>>> I have marked fixed version of
>>>>> https://issues.apache.org/jira/browse/BEAM-7406
>>>>> https://issues.apache.org/jira/browse/BEAM-6380 to be 2.13.0 and will
>>>>> cherry pick the associated commits to the jira.
>>>>>
>>>>>
>>>>> On Tue, May 28, 2019 at 11:19 AM Lukasz Cwik  wrote:
>>>>>
>>>>>> I would also suggest to get https://github.com/apache/beam/pull/8668
>>>>>> in to 2.13.0 since it fixes a logging setup issue on Dataflow 
>>>>>> (BEAM-7406).
>>>>>>
>>>>>> On Tue, May 28, 2019 at 10:22 AM Chamikara Jayalath <
>>>>>> chamik...@google.com> wrote:
>>>>>>
>>>>>>> I would also like to get https://github.com/apache/beam/pull/8661 in
>>>>>>> to 2.13.0 that fixes https://issues.apache.org/jira/browse/BEAM-6380.
>>>>>>> It's not a new issue but has affected a number of users.
>>>>>>>
>>>>>>> - Cham
>>>>>>>
>>>>>>> On Tue, May 28, 2019 at 9:31 AM Valentyn Tymofieiev <
>>>>>>> valen...@google.com> wrote:
>>>>>>>
>>>>>>>> Thanks, Juta Staes, for reporting this issue.
>>>>>>>>
>>>>>>>> On Tue, May 28, 2019, 9:19 AM Valentyn Tymofieiev <
>>>>>>>> valen...@google.com> wrote:
>>>>>>>>
>>>>>>>>> -1.
>>>>>>>>> I would like us to fix
>>>>>>>>> https://issues.apache.org/jira/browse/BEAM-7439 for 2.13.0. It is
>>>>>>>>> a regression that happened in 2.12.0, but was not caught by existing 
>>>>>>>>> tests.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Valentyn
>>>>>>>>>
>>>>>>>>> On Wed, May 22, 2019, 4:30 PM Ankur Goenka 
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi everyone,
>>>>>>>>>>
>>>>>>>>>> Please review and vote on the release candidate #1 for the
>>>>>>>>>> version 2.13.0, as follows:
>>>>>>>>>>
>>>>>>>>>> [ ] +1, Approve the release
>>>>>>>>>> [ ] -1, Do not approve the release (please provide specific
>>>>>>>>>> comments)
>>>>>>>>>>
>>>>>>>>>> The complete staging area is available for your review, which
>>>>>>>>>> includes:
>>>>>>>>>> * JIRA release notes [1],
>>>>>>>>>> * the official Apache source release to be deployed to
>>>>>>>>>> dist.apache.org [2], which is signed with the key with
>>>>>>>>>> fingerprint 6356C1A9F089B0FA3DE8753688934A6699985948 [3],
>>>>>>>>>> * all artifacts to be deployed to the Maven Central Repository
>>>>>>>>>> [4],
>>>>>>>>>> * source code tag "v2.13.0-RC1" [5],
>>>>>>>>>> * website pull request listing the release [6] and publishing the
>>>>>>>>>> API reference manual [7].
>>>>>>>>>> * Python artifacts are deployed along with the source release to
>>>>>>>>>> the dist.apache.org [2].
>>>>>>>>>> * Validation sheet with a tab for 2.13.0 release to help with
>>>>>>>>>> validation [8].
>>>>>>>>>>
>>>>>>>>>> The vote will be open for at least 72 hours. It is adopted by
>>>>>>>>>> majority approval, with at least 3 PMC affirmative votes.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Ankur
>>>>>>>>>>
>>>>>>>>>> [1]
>>>>>>>>>> https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12345166
>>>>>>>>>> [2] https://dist.apache.org/repos/dist/dev/beam/2.13.0/
>>>>>>>>>> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>>>>>>>>>> [4]
>>>>>>>>>> https://repository.apache.org/content/repositories/orgapachebeam-1069/
>>>>>>>>>> [5] https://github.com/apache/beam/tree/v2.13.0-RC1
>>>>>>>>>> [6] https://github.com/apache/beam/pull/8645
>>>>>>>>>> [7] https://github.com/apache/beam-site/pull/589
>>>>>>>>>> [8]
>>>>>>>>>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1031196952
>>>>>>>>>>
>>>>>>>>>


Re: [VOTE] Release 2.13.0, release candidate #1

2019-05-28 Thread Ankur Goenka
Hi All,

In the meantime, please validate RC1 to catch any other issues.

Thanks,
Ankur

On Tue, May 28, 2019 at 3:37 PM Ankur Goenka  wrote:

> Valentyn, Can you please send the cherry pick PR for
> https://issues.apache.org/jira/browse/BEAM-7439
>
> On Tue, May 28, 2019 at 3:04 PM Ankur Goenka  wrote:
>
>> Sure, I will cherry pick those PRs.
>>
>> On Tue, May 28, 2019 at 2:19 PM Kyle Weaver  wrote:
>>
>>> Hi Ankur,
>>>
>>> It's not a blocker, but I'd like to see
>>> https://github.com/apache/beam/pull/8558 and
>>> https://github.com/apache/beam/pull/8569 be included so TFX examples
>>> can be run without errors on the 2.13.0 Spark runner (
>>> https://github.com/tensorflow/tfx/pull/84).
>>>
>>> Kyle Weaver | Software Engineer | github.com/ibzib | kcwea...@google.com
>>> | +1650203
>>>
>>>
>>> On Tue, May 28, 2019 at 11:53 AM Ankur Goenka  wrote:
>>>
>>>> Thanks for the validation.
>>>>
>>>> I have marked fixed version of
>>>> https://issues.apache.org/jira/browse/BEAM-7406
>>>> https://issues.apache.org/jira/browse/BEAM-6380 to be 2.13.0 and will
>>>> cherry pick the associated commits to the jira.
>>>>
>>>>
>>>> On Tue, May 28, 2019 at 11:19 AM Lukasz Cwik  wrote:
>>>>
>>>>> I would also suggest to get https://github.com/apache/beam/pull/8668
>>>>> in to 2.13.0 since it fixes a logging setup issue on Dataflow (BEAM-7406).
>>>>>
>>>>> On Tue, May 28, 2019 at 10:22 AM Chamikara Jayalath <
>>>>> chamik...@google.com> wrote:
>>>>>
>>>>>> I would also like to get https://github.com/apache/beam/pull/8661 in
>>>>>> to 2.13.0 that fixes https://issues.apache.org/jira/browse/BEAM-6380.
>>>>>> It's not a new issue but has affected a number of users.
>>>>>>
>>>>>> - Cham
>>>>>>
>>>>>> On Tue, May 28, 2019 at 9:31 AM Valentyn Tymofieiev <
>>>>>> valen...@google.com> wrote:
>>>>>>
>>>>>>> Thanks, Juta Staes, for reporting this issue.
>>>>>>>
>>>>>>> On Tue, May 28, 2019, 9:19 AM Valentyn Tymofieiev <
>>>>>>> valen...@google.com> wrote:
>>>>>>>
>>>>>>>> -1.
>>>>>>>> I would like us to fix
>>>>>>>> https://issues.apache.org/jira/browse/BEAM-7439 for 2.13.0. It is
>>>>>>>> a regression that happened in 2.12.0, but was not caught by existing 
>>>>>>>> tests.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Valentyn
>>>>>>>>
>>>>>>>> On Wed, May 22, 2019, 4:30 PM Ankur Goenka 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi everyone,
>>>>>>>>>
>>>>>>>>> Please review and vote on the release candidate #1 for the version
>>>>>>>>> 2.13.0, as follows:
>>>>>>>>>
>>>>>>>>> [ ] +1, Approve the release
>>>>>>>>> [ ] -1, Do not approve the release (please provide specific
>>>>>>>>> comments)
>>>>>>>>>
>>>>>>>>> The complete staging area is available for your review, which
>>>>>>>>> includes:
>>>>>>>>> * JIRA release notes [1],
>>>>>>>>> * the official Apache source release to be deployed to
>>>>>>>>> dist.apache.org [2], which is signed with the key with
>>>>>>>>> fingerprint 6356C1A9F089B0FA3DE8753688934A6699985948 [3],
>>>>>>>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>>>>>>>> * source code tag "v2.13.0-RC1" [5],
>>>>>>>>> * website pull request listing the release [6] and publishing the
>>>>>>>>> API reference manual [7].
>>>>>>>>> * Python artifacts are deployed along with the source release to
>>>>>>>>> the dist.apache.org [2].
>>>>>>>>> * Validation sheet with a tab for 2.13.0 release to help with
>>>>>>>>> validation [8].
>>>>>>>>>
>>>>>>>>> The vote will be open for at least 72 hours. It is adopted by
>>>>>>>>> majority approval, with at least 3 PMC affirmative votes.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Ankur
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527=12345166
>>>>>>>>> [2] https://dist.apache.org/repos/dist/dev/beam/2.13.0/
>>>>>>>>> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>>>>>>>>> [4]
>>>>>>>>> https://repository.apache.org/content/repositories/orgapachebeam-1069/
>>>>>>>>> [5] https://github.com/apache/beam/tree/v2.13.0-RC1
>>>>>>>>> [6] https://github.com/apache/beam/pull/8645
>>>>>>>>> [7] https://github.com/apache/beam-site/pull/589
>>>>>>>>> [8]
>>>>>>>>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1031196952
>>>>>>>>>
>>>>>>>>


Re: [VOTE] Release 2.13.0, release candidate #1

2019-05-28 Thread Ankur Goenka
Valentyn, can you please send the cherry-pick PR for
https://issues.apache.org/jira/browse/BEAM-7439

On Tue, May 28, 2019 at 3:04 PM Ankur Goenka  wrote:

> Sure, I will cherry pick those PRs.
>
> On Tue, May 28, 2019 at 2:19 PM Kyle Weaver  wrote:
>
>> Hi Ankur,
>>
>> It's not a blocker, but I'd like to see
>> https://github.com/apache/beam/pull/8558 and
>> https://github.com/apache/beam/pull/8569 be included so TFX examples can
>> be run without errors on the 2.13.0 Spark runner (
>> https://github.com/tensorflow/tfx/pull/84).
>>
>> Kyle Weaver | Software Engineer | github.com/ibzib | kcwea...@google.com
>> | +1650203
>>
>>
>> On Tue, May 28, 2019 at 11:53 AM Ankur Goenka  wrote:
>>
>>> Thanks for the validation.
>>>
>>> I have marked fixed version of
>>> https://issues.apache.org/jira/browse/BEAM-7406
>>> https://issues.apache.org/jira/browse/BEAM-6380 to be 2.13.0 and will
>>> cherry pick the associated commits to the jira.
>>>
>>>
>>> On Tue, May 28, 2019 at 11:19 AM Lukasz Cwik  wrote:
>>>
>>>> I would also suggest to get https://github.com/apache/beam/pull/8668
>>>> in to 2.13.0 since it fixes a logging setup issue on Dataflow (BEAM-7406).
>>>>
>>>> On Tue, May 28, 2019 at 10:22 AM Chamikara Jayalath <
>>>> chamik...@google.com> wrote:
>>>>
>>>>> I would also like to get https://github.com/apache/beam/pull/8661 in
>>>>> to 2.13.0 that fixes https://issues.apache.org/jira/browse/BEAM-6380.
>>>>> It's not a new issue but has affected a number of users.
>>>>>
>>>>> - Cham
>>>>>
>>>>> On Tue, May 28, 2019 at 9:31 AM Valentyn Tymofieiev <
>>>>> valen...@google.com> wrote:
>>>>>
>>>>>> Thanks, Juta Staes, for reporting this issue.
>>>>>>
>>>>>> On Tue, May 28, 2019, 9:19 AM Valentyn Tymofieiev <
>>>>>> valen...@google.com> wrote:
>>>>>>
>>>>>>> -1.
>>>>>>> I would like us to fix
>>>>>>> https://issues.apache.org/jira/browse/BEAM-7439 for 2.13.0. It is a
>>>>>>> regression that happened in 2.12.0, but was not caught by existing 
>>>>>>> tests.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Valentyn
>>>>>>>
>>>>>>> On Wed, May 22, 2019, 4:30 PM Ankur Goenka 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi everyone,
>>>>>>>>
>>>>>>>> Please review and vote on the release candidate #1 for the version
>>>>>>>> 2.13.0, as follows:
>>>>>>>>
>>>>>>>> [ ] +1, Approve the release
>>>>>>>> [ ] -1, Do not approve the release (please provide specific
>>>>>>>> comments)
>>>>>>>>
>>>>>>>> The complete staging area is available for your review, which
>>>>>>>> includes:
>>>>>>>> * JIRA release notes [1],
>>>>>>>> * the official Apache source release to be deployed to
>>>>>>>> dist.apache.org [2], which is signed with the key with fingerprint
>>>>>>>> 6356C1A9F089B0FA3DE8753688934A6699985948 [3],
>>>>>>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>>>>>>> * source code tag "v2.13.0-RC1" [5],
>>>>>>>> * website pull request listing the release [6] and publishing the
>>>>>>>> API reference manual [7].
>>>>>>>> * Python artifacts are deployed along with the source release to
>>>>>>>> the dist.apache.org [2].
>>>>>>>> * Validation sheet with a tab for 2.13.0 release to help with
>>>>>>>> validation [8].
>>>>>>>>
>>>>>>>> The vote will be open for at least 72 hours. It is adopted by
>>>>>>>> majority approval, with at least 3 PMC affirmative votes.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Ankur
>>>>>>>>
>>>>>>>> [1]
>>>>>>>> https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527=12345166
>>>>>>>> [2] https://dist.apache.org/repos/dist/dev/beam/2.13.0/
>>>>>>>> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>>>>>>>> [4]
>>>>>>>> https://repository.apache.org/content/repositories/orgapachebeam-1069/
>>>>>>>> [5] https://github.com/apache/beam/tree/v2.13.0-RC1
>>>>>>>> [6] https://github.com/apache/beam/pull/8645
>>>>>>>> [7] https://github.com/apache/beam-site/pull/589
>>>>>>>> [8]
>>>>>>>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1031196952
>>>>>>>>
>>>>>>>


Re: [VOTE] Release 2.13.0, release candidate #1

2019-05-28 Thread Ankur Goenka
Sure, I will cherry pick those PRs.

On Tue, May 28, 2019 at 2:19 PM Kyle Weaver  wrote:

> Hi Ankur,
>
> It's not a blocker, but I'd like to see
> https://github.com/apache/beam/pull/8558 and
> https://github.com/apache/beam/pull/8569 be included so TFX examples can
> be run without errors on the 2.13.0 Spark runner (
> https://github.com/tensorflow/tfx/pull/84).
>
> Kyle Weaver | Software Engineer | github.com/ibzib | kcwea...@google.com
> | +1650203
>
>
> On Tue, May 28, 2019 at 11:53 AM Ankur Goenka  wrote:
>
>> Thanks for the validation.
>>
>> I have marked fixed version of
>> https://issues.apache.org/jira/browse/BEAM-7406
>> https://issues.apache.org/jira/browse/BEAM-6380 to be 2.13.0 and will
>> cherry pick the associated commits to the jira.
>>
>>
>> On Tue, May 28, 2019 at 11:19 AM Lukasz Cwik  wrote:
>>
>>> I would also suggest to get https://github.com/apache/beam/pull/8668 in
>>> to 2.13.0 since it fixes a logging setup issue on Dataflow (BEAM-7406).
>>>
>>> On Tue, May 28, 2019 at 10:22 AM Chamikara Jayalath <
>>> chamik...@google.com> wrote:
>>>
>>>> I would also like to get https://github.com/apache/beam/pull/8661 in
>>>> to 2.13.0 that fixes https://issues.apache.org/jira/browse/BEAM-6380.
>>>> It's not a new issue but has affected a number of users.
>>>>
>>>> - Cham
>>>>
>>>> On Tue, May 28, 2019 at 9:31 AM Valentyn Tymofieiev <
>>>> valen...@google.com> wrote:
>>>>
>>>>> Thanks, Juta Staes, for reporting this issue.
>>>>>
>>>>> On Tue, May 28, 2019, 9:19 AM Valentyn Tymofieiev 
>>>>> wrote:
>>>>>
>>>>>> -1.
>>>>>> I would like us to fix
>>>>>> https://issues.apache.org/jira/browse/BEAM-7439 for 2.13.0. It is a
>>>>>> regression that happened in 2.12.0, but was not caught by existing tests.
>>>>>>
>>>>>> Thanks,
>>>>>> Valentyn
>>>>>>
>>>>>> On Wed, May 22, 2019, 4:30 PM Ankur Goenka  wrote:
>>>>>>
>>>>>>> Hi everyone,
>>>>>>>
>>>>>>> Please review and vote on the release candidate #1 for the version
>>>>>>> 2.13.0, as follows:
>>>>>>>
>>>>>>> [ ] +1, Approve the release
>>>>>>> [ ] -1, Do not approve the release (please provide specific comments)
>>>>>>>
>>>>>>> The complete staging area is available for your review, which
>>>>>>> includes:
>>>>>>> * JIRA release notes [1],
>>>>>>> * the official Apache source release to be deployed to
>>>>>>> dist.apache.org [2], which is signed with the key with fingerprint
>>>>>>> 6356C1A9F089B0FA3DE8753688934A6699985948 [3],
>>>>>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>>>>>> * source code tag "v2.13.0-RC1" [5],
>>>>>>> * website pull request listing the release [6] and publishing the
>>>>>>> API reference manual [7].
>>>>>>> * Python artifacts are deployed along with the source release to the
>>>>>>> dist.apache.org [2].
>>>>>>> * Validation sheet with a tab for 2.13.0 release to help with
>>>>>>> validation [8].
>>>>>>>
>>>>>>> The vote will be open for at least 72 hours. It is adopted by
>>>>>>> majority approval, with at least 3 PMC affirmative votes.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Ankur
>>>>>>>
>>>>>>> [1]
>>>>>>> https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527=12345166
>>>>>>> [2] https://dist.apache.org/repos/dist/dev/beam/2.13.0/
>>>>>>> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>>>>>>> [4]
>>>>>>> https://repository.apache.org/content/repositories/orgapachebeam-1069/
>>>>>>> [5] https://github.com/apache/beam/tree/v2.13.0-RC1
>>>>>>> [6] https://github.com/apache/beam/pull/8645
>>>>>>> [7] https://github.com/apache/beam-site/pull/589
>>>>>>> [8]
>>>>>>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1031196952
>>>>>>>
>>>>>>


Re: [VOTE] Release 2.13.0, release candidate #1

2019-05-28 Thread Ankur Goenka
Thanks for the validation.

I have set the fix version of
https://issues.apache.org/jira/browse/BEAM-7406 and
https://issues.apache.org/jira/browse/BEAM-6380 to 2.13.0 and will
cherry-pick the commits associated with those JIRAs.


On Tue, May 28, 2019 at 11:19 AM Lukasz Cwik  wrote:

> I would also suggest to get https://github.com/apache/beam/pull/8668 in
> to 2.13.0 since it fixes a logging setup issue on Dataflow (BEAM-7406).
>
> On Tue, May 28, 2019 at 10:22 AM Chamikara Jayalath 
> wrote:
>
>> I would also like to get https://github.com/apache/beam/pull/8661 in to
>> 2.13.0 that fixes https://issues.apache.org/jira/browse/BEAM-6380. It's
>> not a new issue but has affected a number of users.
>>
>> - Cham
>>
>> On Tue, May 28, 2019 at 9:31 AM Valentyn Tymofieiev 
>> wrote:
>>
>>> Thanks, Juta Staes, for reporting this issue.
>>>
>>> On Tue, May 28, 2019, 9:19 AM Valentyn Tymofieiev 
>>> wrote:
>>>
>>>> -1.
>>>> I would like us to fix
>>>> https://issues.apache.org/jira/browse/BEAM-7439 for 2.13.0. It is a
>>>> regression that happened in 2.12.0, but was not caught by existing tests.
>>>>
>>>> Thanks,
>>>> Valentyn
>>>>
>>>> On Wed, May 22, 2019, 4:30 PM Ankur Goenka  wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> Please review and vote on the release candidate #1 for the version
>>>>> 2.13.0, as follows:
>>>>>
>>>>> [ ] +1, Approve the release
>>>>> [ ] -1, Do not approve the release (please provide specific comments)
>>>>>
>>>>> The complete staging area is available for your review, which includes:
>>>>> * JIRA release notes [1],
>>>>> * the official Apache source release to be deployed to dist.apache.org
>>>>> [2], which is signed with the key with fingerprint
>>>>> 6356C1A9F089B0FA3DE8753688934A6699985948 [3],
>>>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>>>> * source code tag "v2.13.0-RC1" [5],
>>>>> * website pull request listing the release [6] and publishing the API
>>>>> reference manual [7].
>>>>> * Python artifacts are deployed along with the source release to the
>>>>> dist.apache.org [2].
>>>>> * Validation sheet with a tab for 2.13.0 release to help with
>>>>> validation [8].
>>>>>
>>>>> The vote will be open for at least 72 hours. It is adopted by majority
>>>>> approval, with at least 3 PMC affirmative votes.
>>>>>
>>>>> Thanks,
>>>>> Ankur
>>>>>
>>>>> [1]
>>>>> https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527=12345166
>>>>> [2] https://dist.apache.org/repos/dist/dev/beam/2.13.0/
>>>>> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>>>>> [4]
>>>>> https://repository.apache.org/content/repositories/orgapachebeam-1069/
>>>>> [5] https://github.com/apache/beam/tree/v2.13.0-RC1
>>>>> [6] https://github.com/apache/beam/pull/8645
>>>>> [7] https://github.com/apache/beam-site/pull/589
>>>>> [8]
>>>>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1031196952
>>>>>
>>>>


Re: Proposal: Portability SDKHarness Docker Image Release with Beam Version Release.

2019-05-27 Thread Ankur Goenka
+1
We can release the images with 2.13, but we should not block the 2.13
release for this.

On Mon, May 27, 2019, 8:39 AM Thomas Weise  wrote:

> +1
>
>
> On Mon, May 27, 2019 at 6:56 AM Ismaël Mejía  wrote:
>
>> +1
>>
>> On Mon, May 27, 2019 at 3:35 PM Maximilian Michels 
>> wrote:
>> >
>> > +1
>> >
>> > On 27.05.19 14:04, Robert Bradshaw wrote:
>> > > Sounds like everyone's onboard with the plan. Any chance we could
>> > > publish these for the upcoming 2.13 release?
>> > >
>> > > On Wed, Feb 6, 2019 at 6:29 PM Łukasz Gajowy 
>> wrote:
>> > >>
>> > >> +1 to have a registry for images accessible to anyone. For snapshot
>> images, I agree that gcr + apache-beam-testing project seems a good and
>> easy way to start with.
>> > >>
>> > >> Łukasz
>> > >>
>> > >> wt., 22 sty 2019 o 19:43 Mark Liu  napisał(a):
>> > >>>
>> > >>> +1 to have an official Beam released container image.
>> > >>>
>> > >>> Also I would propose to add a verification step to (or after) the
>> release process to do smoke check. Python have ValidatesContainer test that
>> runs basic pipeline using newly built container for verification. Other sdk
>> languages can do similar thing or add a common framework.
>> > >>>
>> > >>> Mark
>> > >>>
>> > >>> On Thu, Jan 17, 2019 at 5:56 AM Alan Myrvold 
>> wrote:
>> > >>>>
>> > >>>> +1 This would be great. gcr.io seems like a good option for
>> snapshots due to the permissions from jenkins to upload and ability to keep
>> snapshots around.
>> > >>>>
>> > >>>> On Wed, Jan 16, 2019 at 6:51 PM Ruoyun Huang 
>> wrote:
>> > >>>>>
>> > >>>>> +1 This would be a great thing to have.
>> > >>>>>
>> > >>>>> On Wed, Jan 16, 2019 at 6:11 PM Ankur Goenka 
>> wrote:
>> > >>>>>>
>> > >>>>>> gcr.io seems to be a good option. Not needing the hosting server
>> name in the image name makes it easy to change later.
>> > >>>>>>
>> > >>>>>> The Docker container for Apache Flink is named "flink" and they have
>> different tags for different releases and configurations:
>> https://hub.docker.com/_/flink . We can follow a similar model and
>> name the image "beam" (beam doesn't seem to be taken on Docker Hub) and
>> use tags to distinguish Java/Python/Go, versions, etc.
>> > >>>>>>
>> > >>>>>> Tags will look like:
>> > >>>>>> java-SNAPSHOT
>> > >>>>>> java-2.10.1
>> > >>>>>> python2-SNAPSHOT
>> > >>>>>> python2-2.10.1
>> > >>>>>> go-SNAPSHOT
>> > >>>>>> go-2.10.1
>> > >>>>>>
>> > >>>>>>
>> > >>>>>> On Wed, Jan 16, 2019 at 5:56 PM Ahmet Altay 
>> wrote:
>> > >>>>>>>
>> > >>>>>>> For snapshots, we could use gcr.io. Permission would not be a
>> problem since Jenkins is already correctly setup. The cost will be covered
>> under apache-beam-testing project. And since this is only for snapshots, it
>> will be only for temporary artifacts not for release artifacts.
>> > >>>>>>>
>> > >>>>>>> On Wed, Jan 16, 2019 at 5:50 PM Valentyn Tymofieiev <
>> valen...@google.com> wrote:
>> > >>>>>>>>
>> > >>>>>>>> +1, releasing containers is a useful process that we need to
>> build in Beam and it is required for FnApi users. Among other reasons,
>> having officially-released Beam SDK harness container images will make it
>> easier for users to do simple customizations to  container images, as they
>> will be able to use container image released by Beam as a base image.
>> > >>>>>>>>
>> > >>>>>>>> Good point about potential storage limitations on Bintray.
>> With Beam Release cadence we may quickly exceed the 10 GB quota. It may
>> also affect our decisions as to which images we want to release, for
>> example: do we want to only release one container image with Pyth
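
Note on the tag scheme quoted above: the "<sdk>-<version>" tags listed in
the Jan 16 message can be generated mechanically. The following is only a
minimal sketch of that naming idea, not anything from the Beam build. The
function names and the registry path are hypothetical; the image name
"beam" and the example tags come from the quoted message.

```python
from typing import Iterable, List


def image_tags(sdks: Iterable[str], versions: Iterable[str]) -> List[str]:
    """Build "<sdk>-<version>" tags, e.g. "java-SNAPSHOT" (hypothetical helper)."""
    versions = list(versions)  # allow the versions to be iterated once per SDK
    return [f"{sdk}-{version}" for sdk in sdks for version in versions]


def image_reference(tag: str, name: str = "beam", registry: str = "") -> str:
    """Join an optional registry, the image name and a tag.

    The registry is kept out of the image name itself, so it can be
    swapped later without renaming the image.
    """
    prefix = f"{registry}/" if registry else ""
    return f"{prefix}{name}:{tag}"


if __name__ == "__main__":
    for tag in image_tags(["java", "python2", "go"], ["SNAPSHOT", "2.10.1"]):
        # "gcr.io/example-project" is a made-up registry path for illustration.
        print(image_reference(tag, registry="gcr.io/example-project"))
```

Keeping the registry as a separate argument mirrors the point made in the
thread that the hosting server need not be part of the image name.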

[VOTE] Release 2.13.0, release candidate #1

2019-05-22 Thread Ankur Goenka
Hi everyone,

Please review and vote on the release candidate #1 for the version 2.13.0,
as follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org [2],
which is signed with the key with fingerprint
6356C1A9F089B0FA3DE8753688934A6699985948 [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "v2.13.0-RC1" [5],
* website pull request listing the release [6] and publishing the API
reference manual [7].
* Python artifacts are deployed along with the source release to the
dist.apache.org [2].
* Validation sheet with a tab for 2.13.0 release to help with validation
[8].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Ankur

[1]
https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527=12345166
[2] https://dist.apache.org/repos/dist/dev/beam/2.13.0/
[3] https://dist.apache.org/repos/dist/release/beam/KEYS
[4] https://repository.apache.org/content/repositories/orgapachebeam-1069/
[5] https://github.com/apache/beam/tree/v2.13.0-RC1
[6] https://github.com/apache/beam/pull/8645
[7] https://github.com/apache/beam-site/pull/589
[8]
https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1031196952
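
Note on the adoption rule above ("majority approval, with at least 3 PMC
affirmative votes"): a rough sketch of how a tally could be checked. This
is an illustration only. It assumes that only PMC (binding) votes count
toward the majority, which the message does not spell out, and the Vote
type, the function name and the voter names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Vote:
    voter: str     # placeholder name, not a real voter
    value: int     # +1, 0 or -1
    binding: bool  # True when cast by a PMC member


def release_vote_passes(votes: Iterable[Vote], min_binding_plus: int = 3) -> bool:
    """Assumed rule: at least 3 binding +1 votes, and more binding +1 than -1."""
    binding = [v for v in votes if v.binding]
    plus = sum(1 for v in binding if v.value > 0)
    minus = sum(1 for v in binding if v.value < 0)
    return plus >= min_binding_plus and plus > minus


if __name__ == "__main__":
    tally = [
        Vote("pmc-1", +1, True),
        Vote("pmc-2", +1, True),
        Vote("pmc-3", +1, True),
        Vote("contributor-1", -1, False),  # non-binding, ignored by this sketch
    ]
    print(release_vote_passes(tally))
```

In practice a -1 with a concrete blocker, like the one raised later in
this thread, usually leads to a new release candidate regardless of the
arithmetic.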


Re: Enable security for data channels in portability

2019-05-16 Thread Ankur Goenka
Hi Hai,

Thanks for the PR.
Added a couple of comments. Will take a detailed look later.

Thanks,
Ankur

*From: *Hai Lu 
*Date: *Thu, May 16, 2019 at 8:02 PM
*To: * , 
*Cc: * , , 

Hi Lukasz and Ankur,
>
> Here is the PR that implements the idea:
> https://github.com/apache/beam/pull/8597
>
> Would appreciate it if you could take a look.
>
> Thanks,
> Hai
>
> On Tue, Apr 30, 2019 at 9:13 AM Hai Lu  wrote:
>
>> One thing to clarify is that we do not use docker. I don't have too much
>> experience with docker; I assume docker itself already has network
>> isolation, and that's why it was never necessary to enable security in
>> portable runner before?
>>
>> For us because we simply use processes, we need this extra secret
>> (through file system) for authentication.
>>
>> Let me create a ticket and send a PR, which should explain my intention
>> better.
>>
>> Thanks,
>> Hai
>>
>> On Mon, Apr 29, 2019 at 1:03 PM Lukasz Cwik  wrote:
>>
>>> Changing the address to be loopback based upon how the environment is
>>> started (docker container/process/external/...) makes sense.
>>>
>>> How would the SDK and runner support storing/sharing this secret? (For
>>> example, in the docker container, how would the secret get there?)
>>>
>>> On Mon, Apr 29, 2019 at 9:23 AM Hai Lu  wrote:
>>>
>>>> Hi Lukasz and Ankur,
>>>>
>>>> Thank you so much for your response! This is what we're
>>>> doing/implementing in our internal fork right now:
>>>>
>>>>1. We assume that the Java process and Python process *are always
>>>>colocated in the same host*, so first of all we use "loopback"
>>>>address instead of "any address" that's currently being used on the java
>>>>side. That way, the traffic between sdk worker and runner is limited to 
>>>> the
>>>>host but not exposed to network.
>>>>2. Because of the multi-tenant nature of our environment, we still
>>>>want to have authentication even for local host, so that data ports are 
>>>> not
>>>>connected by random processes. Because different jobs have their own 
>>>> user
>>>>name, it's sufficient to *use file system to store an ad-hoc secret*,
>>>>which can be shared by both Python sdk and java runner. The the runner 
>>>> uses
>>>>this secret to authenticate the worker (by using gRPC's interceptor for
>>>>this customized auth)
>>>>3. By having the 2 steps above, we *no longer need transport layer
>>>>security *(SSL/TLS). So we abandon our initial plan to enable
>>>>SSL/TLS.
>>>>
>>>> Above is the high level plan that I'm implementing. I would like to
>>>> have a similar solution in the open source to be merged with our internal
>>>> fork. Let me know what you think. If this sounds OK I will create a ticket
>>>> for myself and will first send out a short write-up in google doc to
>>>> collect comments soon.
>>>>
>>>> Thanks,
>>>> Hai
>>>>
>>>> On Fri, Apr 26, 2019 at 5:24 PM Ankur Goenka  wrote:
>>>>
>>>>> In an offline chat with Hai, it seems useful for users to be able to
>>>>> provide custom authentication, like a secret which can be distributed
>>>>> out of band by the infrastructure and provided via the file system, an
>>>>> RPC to another service, etc.
>>>>> gRPC already has some mechanisms for standard and custom
>>>>> authentication [1].
>>>>> Instrumenting the gRPC channel using a command line option or
>>>>> environment variable on the worker machines can be useful.
>>>>>
>>>>> [1] https://grpc.io/docs/guides/auth/
>>>>>
>>>>> On Fri, Apr 26, 2019 at 4:33 PM Lukasz Cwik  wrote:
>>>>>
>>>>>> The link to the ApiServiceDescriptor is
>>>>>> https://github.com/apache/beam/blob/476e17ed6badd4d5c06c4caf8a824805f40a8e7a/model/pipeline/src/main/proto/endpoints.proto#L31
>>>>>>
>>>>>> On Fri, Apr 26, 2019 at 4:32 PM Lukasz Cwik  wrote:
>>>>>>
>>>>>>> I had originally taken a look at this a while ago but not much has
>>>>>>> progressed since then. The original idea was that the 
>>>>>>> ApiServiceDescriptor
>>>
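
Note on the scheme Hai describes in the quoted Apr 29 message (loopback
address plus an ad-hoc secret shared through the file system): the
following is a minimal sketch of just the secret handling, under stated
assumptions. It is not the code from the PR. The gRPC interceptor wiring
is left out entirely, and the function names and the LOOPBACK_ADDRESS
constant are hypothetical.

```python
import hmac
import os
import secrets
import tempfile

# Bind data ports here rather than on all interfaces (assumed IPv4 loopback).
LOOPBACK_ADDRESS = "127.0.0.1"


def write_secret(path: str) -> str:
    """Create a fresh random secret readable only by the current user."""
    token = secrets.token_hex(32)
    # O_EXCL refuses to reuse an existing file; 0o600 limits access to the owner.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(token)
    return token


def read_secret(path: str) -> str:
    """Load the secret on the other side (for example, the SDK worker)."""
    with open(path) as handle:
        return handle.read().strip()


def is_authorized(presented: str, expected: str) -> bool:
    """Compare in constant time so timing does not leak how much matched."""
    return hmac.compare_digest(presented.encode(), expected.encode())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as job_dir:
        secret_path = os.path.join(job_dir, "data-channel-secret")
        expected = write_secret(secret_path)      # runner side
        presented = read_secret(secret_path)      # worker side
        print(is_authorized(presented, expected))
```

The per-job user name mentioned in the thread is what makes the 0o600
file mode meaningful: another tenant's processes run as a different user
and cannot read the file.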

Re: [ANNOUNCE] New PMC Member: Pablo Estrada

2019-05-15 Thread Ankur Goenka
Congratulations Pablo!

On Wed, May 15, 2019, 12:21 PM Ruoyun Huang  wrote:

> Congratulations, Pablo!
>
> *From: *Charles Chen 
> *Date: *Wed, May 15, 2019 at 11:04 AM
> *To: *dev
>
> Congrats Pablo and thank you for your contributions!
>>
>> On Wed, May 15, 2019, 10:53 AM Valentyn Tymofieiev 
>> wrote:
>>
>>> Congrats, Pablo!
>>>
>>> On Wed, May 15, 2019 at 10:41 AM Yifan Zou  wrote:
>>>
 Congratulations, Pablo!

 *From: *Maximilian Michels 
 *Date: *Wed, May 15, 2019 at 2:06 AM
 *To: * 

 Congrats Pablo! Thank you for your help to grow the Beam community!
>
> On 15.05.19 10:33, Tim Robertson wrote:
> > Congratulations Pablo
> >
> > On Wed, May 15, 2019 at 10:22 AM Ismaël Mejía  > > wrote:
> >
> > Congrats Pablo, well deserved, nice to see your work recognized!
> >
> > On Wed, May 15, 2019 at 9:59 AM Pei HE  > > wrote:
> >  >
> >  > Congrats, Pablo!
> >  >
> >  > On Tue, May 14, 2019 at 11:41 PM Tanay Tummalapalli
> >  > mailto:ttanay.apa...@gmail.com>>
> wrote:
> >  > >
> >  > > Congratulations Pablo!
> >  > >
> >  > > On Wed, May 15, 2019, 12:08 Michael Luckey <
> adude3...@gmail.com
> > > wrote:
> >  > >>
> >  > >> Congrats, Pablo!
> >  > >>
> >  > >> On Wed, May 15, 2019 at 8:21 AM Connell O'Callaghan
> > mailto:conne...@google.com>> wrote:
> >  > >>>
> >  > >>> Awesome well done Pablo!!!
> >  > >>>
> >  > >>> Kenn thank you for sharing this great news with us!!!
> >  > >>>
> >  > >>> On Tue, May 14, 2019 at 11:01 PM Ahmet Altay
> > mailto:al...@google.com>> wrote:
> >  > 
> >  >  Congratulations!
> >  > 
> >  >  On Tue, May 14, 2019 at 9:11 PM Robert Burke
> > mailto:rob...@frantil.com>> wrote:
> >  > >
> >  > > Woohoo! Well deserved.
> >  > >
> >  > > On Tue, May 14, 2019, 8:34 PM Reuven Lax <
> re...@google.com
> > > wrote:
> >  > >>
> >  > >> Congratulations!
> >  > >>
> >  > >> From: Mikhail Gryzykhin  > >
> >  > >> Date: Tue, May 14, 2019 at 8:32 PM
> >  > >> To: mailto:dev@beam.apache.org>>
> >  > >>
> >  > >>> Congratulations Pablo!
> >  > >>>
> >  > >>> On Tue, May 14, 2019, 20:25 Kenneth Knowles
> > mailto:k...@apache.org>> wrote:
> >  > 
> >  >  Hi all,
> >  > 
> >  >  Please join me and the rest of the Beam PMC in
> welcoming
> > Pablo Estrada to join the PMC.
> >  > 
> >  >  Pablo first picked up BEAM-722 in October of 2016 and
> > has been a steady part of the Beam community since then. In
> addition
> > to technical work on Beam Python & Java & runners, I would
> highlight
> > how Pablo grows Beam's community by helping users, working on
> GSoC,
> > giving talks at Beam Summits and other OSS conferences including
> > Flink Forward, and holding training workshops. I cannot do
> justice
> > to Pablo's contributions in a single paragraph.
> >  > 
> >  >  Thanks Pablo, for being a part of Beam.
> >  > 
> >  >  Kenn
> >
>

>
> --
> 
> Ruoyun  Huang
>
>


Re: All validates runner tests seems to be broken.

2019-05-14 Thread Ankur Goenka
It seems to be related.
I will try to rerun the seed job.

*From: *Lukasz Cwik 
*Date: *Tue, May 14, 2019 at 1:52 PM
*To: *dev

It's likely related to the rename done in
> https://github.com/apache/beam/commit/f198de033824949eb66ea533ae8a40b8dd8cd7fe#diff-2bb618406f7ee4470a48343283f368a2
> Kenn is tracking a different issue related to publishing being broken
> in BEAM-7302 which he has a fix for in
> https://github.com/apache/beam/pull/8577
>
> I suspect there will be a few of these gotchas.
>
>
> *From: *Ankur Goenka 
> *Date: *Tue, May 14, 2019 at 1:43 PM
> *To: *dev
>
> Hi,
>>
>> The following tests seem to be broken because of "Project 'unners' not found
>> in root project 'beam'."
>> The command getting executed on Jenkins is
>> gradlew --continue --max-workers=12 -Dorg.gradle.jvmargs=-Xms2g
>> -Dorg.gradle.jvmargs=-Xmx4g :unners:samza:validatesRunner
>> causing the failure.
>> The same is happening on the master
>> https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_PR/19/console
>>
>> Are we making any changes to Jenkins parsing scripts?
>>
>> Details
>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_PR/20/>
>> [image: @asfgit] <https://github.com/asfgit>
>> Apache Flink Runner ValidatesRunner Tests — FAILURE
>> Details
>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_PR/63/>
>> [image: @asfgit] <https://github.com/asfgit>
>> Apache Gearpump Runner ValidatesRunner Tests — FAILURE
>> Details
>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_PR/16/>
>> [image: @asfgit] <https://github.com/asfgit>
>> Apache Samza Runner ValidatesRunner Tests — FAILURE
>> Details
>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_PR/17/>
>> [image: @asfgit] <https://github.com/asfgit>
>> Apache Spark Runner ValidatesRunner Tests — FAILURE
>> Details
>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_PR/73/>
>> [image: @asfgit] <https://github.com/asfgit>
>> Google Cloud Dataflow Runner PortabilityApi ValidatesRunner Tests —
>> FAILURE
>> Details
>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_PortabilityApi_Dataflow_PR/39/>
>> [image: @asfgit] <https://github.com/asfgit>
>> Google Cloud Dataflow Runner Python ValidatesContainer Tests — FAILURE
>> Details <https://builds.apache.org/job/beam_PostCommit_Py_ValCont_PR/54/>
>> [image: @asfgit] <https://github.com/asfgit>
>> Google Cloud Dataflow Runner Python ValidatesRunner Tests — FAILURE
>> Details
>> <https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_PR/79/>
>> [image: @asfgit] <https://github.com/asfgit>
>> Google Cloud Dataflow Runner ValidatesRunner Tests — FAILURE
>> Details
>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_PR/59/>
>> [image: @asfgit] <https://github.com/asfgit>
>> Java Flink PortableValidatesRunner Batch Tests — FAILURE
>> Details
>> <https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch_PR/50/>
>> [image: @asfgit] <https://github.com/asfgit>
>> Java Flink PortableValidatesRunner Streaming Tests — FAILURE
>> Details
>> <https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming_PR/77/>
>>
>> Thanks,
>> Ankur
>>
>


Re: All validates runner tests seems to be broken.

2019-05-14 Thread Ankur Goenka
yup, running the seed job.

@Alan Myrvold The other tests, which do not have the typo, are also
failing.

*From: *Andrew Pilloud 
*Date: *Tue, May 14, 2019 at 2:02 PM
*To: *dev

So it sounds like a number of the failures are related to a single jenkins
> config for all branches. This means you can't test the release branch if
> the targets change after it is cut. One possibility: do "Run Seed Job" on
> the release branch and kick off all the tests right after that finishes.
> Then do "Run Seed Job" after they all start on head again to restore the
> config.
>
> Andrew
>
> *From: *Alan Myrvold 
> *Date: *Tue, May 14, 2019 at 1:59 PM
> *To: * 
>
> That is a typo added in https://github.com/apache/beam/pull/8194
>>
>> https://github.com/apache/beam/commit/1e7ea0da5073566c3fa26dbc1105105fbe6043ae#diff-9591f0d06e82e711681fd77ed287578b
>>
>> *From: *Ankur Goenka 
>> *Date: *Tue, May 14, 2019 at 1:43 PM
>> *To: *dev
>>
>> Hi,
>>>
>>> The following tests seem to be broken because of "Project 'unners' not
>>> found in root project 'beam'."
>>> The command getting executed on Jenkins is
>>> gradlew --continue --max-workers=12 -Dorg.gradle.jvmargs=-Xms2g
>>> -Dorg.gradle.jvmargs=-Xmx4g :unners:samza:validatesRunner
>>> causing the failure.
>>> The same is happening on the master
>>> https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_PR/19/console
>>>
>>> Are we making any changes to Jenkins parsing scripts?
>>>
>>> Details
>>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_PR/20/>
>>> [image: @asfgit] <https://github.com/asfgit>
>>> Apache Flink Runner ValidatesRunner Tests — FAILURE
>>> Details
>>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_PR/63/>
>>> [image: @asfgit] <https://github.com/asfgit>
>>> Apache Gearpump Runner ValidatesRunner Tests — FAILURE
>>> Details
>>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_PR/16/>
>>> [image: @asfgit] <https://github.com/asfgit>
>>> Apache Samza Runner ValidatesRunner Tests — FAILURE
>>> Details
>>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_PR/17/>
>>> [image: @asfgit] <https://github.com/asfgit>
>>> Apache Spark Runner ValidatesRunner Tests — FAILURE
>>> Details
>>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_PR/73/>
>>> [image: @asfgit] <https://github.com/asfgit>
>>> Google Cloud Dataflow Runner PortabilityApi ValidatesRunner Tests —
>>> FAILURE
>>> Details
>>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_PortabilityApi_Dataflow_PR/39/>
>>> [image: @asfgit] <https://github.com/asfgit>
>>> Google Cloud Dataflow Runner Python ValidatesContainer Tests — FAILURE
>>> Details
>>> <https://builds.apache.org/job/beam_PostCommit_Py_ValCont_PR/54/>
>>> [image: @asfgit] <https://github.com/asfgit>
>>> Google Cloud Dataflow Runner Python ValidatesRunner Tests — FAILURE
>>> Details
>>> <https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_PR/79/>
>>> [image: @asfgit] <https://github.com/asfgit>
>>> Google Cloud Dataflow Runner ValidatesRunner Tests — FAILURE
>>> Details
>>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_PR/59/>
>>> [image: @asfgit] <https://github.com/asfgit>
>>> Java Flink PortableValidatesRunner Batch Tests — FAILURE
>>> Details
>>> <https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch_PR/50/>
>>> [image: @asfgit] <https://github.com/asfgit>
>>> Java Flink PortableValidatesRunner Streaming Tests — FAILURE
>>> Details
>>> <https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming_PR/77/>
>>>
>>> Thanks,
>>> Ankur
>>>
>>


Re: All validates runner tests seems to be broken.

2019-05-14 Thread Ankur Goenka
Ahh, I see. Good point.
So shall we revert test-infra?

*From: *Alan Myrvold 
*Date: *Tue, May 14, 2019 at 2:16 PM
*To: * 

Other ones are failing on the branch due to the 2.13.0 branch not having
> https://github.com/apache/beam/pull/8194, and the seed job running from
> master.
>
> *From: *Michael Luckey 
> *Date: *Tue, May 14, 2019 at 2:13 PM
> *To: * 
>
> Unfortunately, I missed the fact that seed job triggers automatically.
>>
>> Yes, you need to run the seed job on your branch to replace the old
>> commands.
>>
>> We might consider resetting to legacy commands, i.e. revert ./test-infra
>> folder. What do you think?
>>
>> On Tue, May 14, 2019 at 11:02 PM Andrew Pilloud 
>> wrote:
>>
>>> So it sounds like a number of the failures are related to a single
>>> jenkins config for all branches. This means you can't test the release
>>> branch if the targets change after it is cut. One possibility: do "Run Seed
>>> Job" on the release branch and kick off all the tests right after that
>>> finishes. Then do "Run Seed Job" after they all start on head again to
>>> restore the config.
>>>
>>> Andrew
>>>
>>> *From: *Alan Myrvold 
>>> *Date: *Tue, May 14, 2019 at 1:59 PM
>>> *To: * 
>>>
>>> That is a typo added in https://github.com/apache/beam/pull/8194
>>>>
>>>> https://github.com/apache/beam/commit/1e7ea0da5073566c3fa26dbc1105105fbe6043ae#diff-9591f0d06e82e711681fd77ed287578b
>>>>
>>>> *From: *Ankur Goenka 
>>>> *Date: *Tue, May 14, 2019 at 1:43 PM
>>>> *To: *dev
>>>>
>>>> Hi,
>>>>>
>>>>> Following tests seems to be broken because of "Project 'unners' not
>>>>> found in root project 'beam'."
>>>>> The command getting executed on Jenkins is
>>>>> gradlew --continue --max-workers=12 -Dorg.gradle.jvmargs=-Xms2g
>>>>> -Dorg.gradle.jvmargs=-Xmx4g :unners:samza:validatesRunner
>>>>> causing the failure.
>>>>> The same is happening on the master
>>>>> https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_PR/19/console
>>>>>
>>>>> Are we making any changes to jenins parsing scripts?
>>>>>
>>>>> Details
>>>>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_PR/20/>
>>>>> [image: @asfgit] <https://github.com/asfgit>
>>>>> Apache Flink Runner ValidatesRunner Tests — FAILURE
>>>>> Details
>>>>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_PR/63/>
>>>>> [image: @asfgit] <https://github.com/asfgit>
>>>>> Apache Gearpump Runner ValidatesRunner Tests — FAILURE
>>>>> Details
>>>>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_PR/16/>
>>>>> [image: @asfgit] <https://github.com/asfgit>
>>>>> Apache Samza Runner ValidatesRunner Tests — FAILURE
>>>>> Details
>>>>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_PR/17/>
>>>>> [image: @asfgit] <https://github.com/asfgit>
>>>>> Apache Spark Runner ValidatesRunner Tests — FAILURE
>>>>> Details
>>>>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_PR/73/>
>>>>> [image: @asfgit] <https://github.com/asfgit>
>>>>> Google Cloud Dataflow Runner PortabilityApi ValidatesRunner Tests —
>>>>> FAILURE
>>>>> Details
>>>>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_PortabilityApi_Dataflow_PR/39/>
>>>>> [image: @asfgit] <https://github.com/asfgit>
>>>>> Google Cloud Dataflow Runner Python ValidatesContainer Tests — FAILURE
>>>>> Details
>>>>> <https://builds.apache.org/job/beam_PostCommit_Py_ValCont_PR/54/>
>>>>> [image: @asfgit] <https://github.com/asfgit>
>>>>> Google Cloud Dataflow Runner Python ValidatesRunner Tests — FAILURE
>>>>> Details
>>>>> <https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_PR/79/>
>>>>> [image: @asfgit] <https://github.com/asfgit>
>>>>> Google Cloud Dataflow Runner ValidatesRunner Tests — FAILURE
>>>>> Details
>>>>> <https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_PR/59/>
>>>>> [image: @asfgit] <https://github.com/asfgit>
>>>>> Java Flink PortableValidatesRunner Batch Tests — FAILURE
>>>>> Details
>>>>> <https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch_PR/50/>
>>>>> [image: @asfgit] <https://github.com/asfgit>
>>>>> Java Flink PortableValidatesRunner Streaming Tests — FAILURE
>>>>> Details
>>>>> <https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming_PR/77/>
>>>>>
>>>>> Thanks,
>>>>> Ankur
>>>>>
>>>>


All validates runner tests seems to be broken.

2019-05-14 Thread Ankur Goenka
Hi,

The following tests seem to be broken because of "Project 'unners' not found
in root project 'beam'."
The command getting executed on Jenkins is
gradlew --continue --max-workers=12 -Dorg.gradle.jvmargs=-Xms2g
-Dorg.gradle.jvmargs=-Xmx4g :unners:samza:validatesRunner
causing the failure.
The same is happening on the master
https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_PR/19/console
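
For illustration only, here is a minimal Python sketch of the kind of check
that would flag such a malformed task path before it reaches Gradle. The
function name and the set of known top-level projects are assumptions made
for this example and are not part of the Beam or Jenkins tooling; the message
text simply mirrors the Gradle error quoted above.

```python
def check_gradle_task_path(task_path, known_projects):
    """Return an error message if the first project segment is unknown, else None.

    `known_projects` is a hypothetical set of top-level project names.
    """
    # A Gradle task path looks like ":project:subproject:task".
    segments = [s for s in task_path.split(":") if s]
    if len(segments) < 2:
        raise ValueError("expected at least a project and a task: %r" % task_path)
    root = segments[0]
    if root not in known_projects:
        return "Project '%s' not found in root project 'beam'." % root
    return None


if __name__ == "__main__":
    known = {"runners", "sdks"}  # assumed subset, for the example only
    print(check_gradle_task_path(":unners:samza:validatesRunner", known))
    print(check_gradle_task_path(":runners:samza:validatesRunner", known))
```

With the path from the failing job, the first segment is "unners", so the
check reports it; with the leading "r" restored, the check passes.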

Are we making any changes to the Jenkins parsing scripts?

Apache Flink Runner ValidatesRunner Tests — FAILURE
Apache Gearpump Runner ValidatesRunner Tests — FAILURE
Apache Samza Runner ValidatesRunner Tests — FAILURE
Apache Spark Runner ValidatesRunner Tests — FAILURE
Google Cloud Dataflow Runner PortabilityApi ValidatesRunner Tests — FAILURE
Google Cloud Dataflow Runner Python ValidatesContainer Tests — FAILURE
Google Cloud Dataflow Runner Python ValidatesRunner Tests — FAILURE
Google Cloud Dataflow Runner ValidatesRunner Tests — FAILURE
Java Flink PortableValidatesRunner Batch Tests — FAILURE
Java Flink PortableValidatesRunner Streaming Tests — FAILURE

Thanks,
Ankur


Re: Do we maintain offline artifact version in javadocs sdks/java/javadoc/build.gradle

2019-05-13 Thread Ankur Goenka
Given that this simplifies the release process and keeps the javadocs up to
date, IMO this looks like a good tradeoff.

*From: *Lukasz Cwik 
*Date: *Mon, May 13, 2019 at 5:09 PM
*To: *dev

While I was looking for the latest versions of docs, I found
> http://javadoc.io. It fetches the javadoc from Maven central and unpacks
> that jar displaying its contents to users. This means that we could make
> all our non Apache Beam javadoc links goto javadoc.io instead of trying
> to find the official project website that maintains them (which sometimes
> there isn't one or they only have the javadoc for the latest).
>
> Has anyone had experience using javadoc.io in the past?
> Would there by any concerns about swapping to use javadoc.io instead of
> the official versions hosted on project pages?
>
> I have an example commit here:
> https://github.com/lukecwik/incubator-beam/commit/94a97fbc83883496feae071cc44689f5fb2f5743
> You can generate the aggregate javadoc via "./gradlew -p sdks/java/javadoc
> aggregateJavadoc" which builds
> "./sdks/java/javadoc/build/docs/javadoc/index.html"
>
> If people are happy with javadoc.io, should we migrate from using
> offlinelinks to links so we don't have to maintain the package lists in
> https://github.com/apache/beam/blob/master/sdks/java/javadoc/?
> This would mean that we would be able to just enumerate all the
> dependencies we have in Apache Beam and generate all the javadoc without
> maintaining a list of packages or dependencies. It would mean that you
> would need to have an internet connection to build the aggregated javadoc
> because the javadoc tool would need to fetch the package-list files from
> javadoc.io. The delta for that change is
> https://github.com/lukecwik/incubator-beam/commit/8cc7c53139d0eecad0ec994b9a313cf31645
>
> From a Javadoc correctness and maintenance point of view, this seems much
> simpler overall to me.
>
>
> *From: *Lukasz Cwik 
> *Date: *Mon, May 13, 2019 at 1:39 PM
> *To: *dev
>
> I see. We should be able to fix that to do what we do when we embed the
>> versions of dependencies in our Maven archetypes like so[1]:
>> dependencies.create(project.library.java.google_api_client).getVersion()
>>
>> I'll send out a PR updating the javadoc pulling to be based off the
>> version and open up a PR.
>>
>> 1:
>> https://github.com/apache/beam/blob/abece47cc1c1c88a519e54e67a2d358b439cf69c/sdks/java/maven-archetypes/examples/build.gradle#L29
>>
>> *From: *Kenneth Knowles 
>> *Date: *Mon, May 13, 2019 at 11:57 AM
>> *To: *dev
>>
>> I expect Ankur is referring to the hardcoded linkOffline bits here:
>>> https://github.com/apache/beam/blob/master/sdks/java/javadoc/build.gradle#L78
>>>  since
>>> the versions are in the URLs, and also the downloaded files used are from
>>> those versions. This helps with flakiness, since otherwise it has to
>>> download stuff to figure out which identifiers are linkable.
>>>
>>> Kenn
>>>
>>> *From: *Lukasz Cwik 
>>> *Date: *Mon, May 13, 2019 at 9:04 AM
>>> *To: *dev
>>>
>>> What is the difference between the two files you are referring to?
>>>>
>>>> Note that sdks/java/javadoc/build.gradle is meant to produce one giant
>>>> javadoc across many modules that users would be interested in
>>>> (core/extensions/io/...) meant to be published on the website.
>>>>
>>>> *From: *Ankur Goenka 
>>>> *Date: *Fri, May 10, 2019 at 5:21 PM
>>>> *To: *dev
>>>>
>>>> Hi,
>>>>>
>>>>> I see that the sdks/java/javadoc/build.gradle is not in sync with
>>>>> org/apache/beam/gradle/BeamModulePlugin.groovy .
>>>>> I wanted to check if we are maintaining or not based on that we can
>>>>> either remove or update sdks/java/javadoc/build.gradle.
>>>>>
>>>>> Thanks,
>>>>> Ankur
>>>>>
>>>>


Re: Cassandra and hadoop test broken on master and in previous releases

2019-05-11 Thread Ankur Goenka
Hi,

I validated the tests on AdoptOpenJDK and HotSpot where the tests succeed.
It seems that my earlier JDK had some issue.

openjdk version "1.8.0_212"
OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_212-b03)
OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.212-b03, mixed mode)

java version "1.8.0_202"
Java(TM) SE Runtime Environment (build 1.8.0_202-b08)
Java HotSpot(TM) 64-Bit Server VM (build 25.202-b08, mixed mode)

Thanks for validating the tests.

Thanks,
Ankur


*From: *Ankur Goenka 
*Date: *Sat, May 11, 2019 at 7:21 PM
*To: *dev

Thanks for the quick validation. I will try to run it on the Oracle JDK.
>
> *From: *Jean-Baptiste Onofré 
> *Date: *Sat, May 11, 2019 at 4:37 AM
> *To: * 
>
> Hi,
>>
>> It works fine on my machine. I'm using this JDK:
>>
>> java version "1.8.0_172"
>> Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
>> Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)
>>
>> Regarding your Gradle scan, the JVM is crashing.
>>
>> Do you have a chance to try another JDK ?
>>
>> I will try with OpenJDK 1.8.0_181.
>>
>> Regards
>> JB
>>
>> On 11/05/2019 01:34, Ankur Goenka wrote:
>> > Hi,
>> >
>> > Cassandra and Hadoop tests for targets :beam-sdks-java-io-cassandra:test
>> > :beam-sdks-java-io-hadoop-format:test are failing at master and in
>> > 2.12.0 release with jvm crash.
>> >
>> > Gradle Scan: https://gradle.com/s/rhseoqeouup6e
>> >
>> > Any help on the debugging failure will be useful.
>> >
>> > Thanks,
>> > Ankur
>>
>> --
>> Jean-Baptiste Onofré
>> jbono...@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>


Re: Cassandra and hadoop test broken on master and in previous releases

2019-05-11 Thread Ankur Goenka
Thanks for the quick validation. I will try to run it on the Oracle JDK.

*From: *Jean-Baptiste Onofré 
*Date: *Sat, May 11, 2019 at 4:37 AM
*To: * 

Hi,
>
> It works fine on my machine. I'm using this JDK:
>
> java version "1.8.0_172"
> Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
> Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)
>
> Regarding your Gradle scan, the JVM is crashing.
>
> Do you have a chance to try another JDK ?
>
> I will try with OpenJDK 1.8.0_181.
>
> Regards
> JB
>
> On 11/05/2019 01:34, Ankur Goenka wrote:
> > Hi,
> >
> > Cassandra and Hadoop tests for targets :beam-sdks-java-io-cassandra:test
> > :beam-sdks-java-io-hadoop-format:test are failing at master and in
> > 2.12.0 release with jvm crash.
> >
> > Gradle Scan: https://gradle.com/s/rhseoqeouup6e
> >
> > Any help on the debugging failure will be useful.
> >
> > Thanks,
> > Ankur
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>


Do we maintain offline artifact version in javadocs sdks/java/javadoc/build.gradle

2019-05-10 Thread Ankur Goenka
Hi,

I see that sdks/java/javadoc/build.gradle is not in sync with
org/apache/beam/gradle/BeamModulePlugin.groovy.
I wanted to check whether we are still maintaining it. Based on that, we can
either remove or update sdks/java/javadoc/build.gradle.

Thanks,
Ankur


Cassandra and hadoop test broken on master and in previous releases

2019-05-10 Thread Ankur Goenka
Hi,

Cassandra and Hadoop tests for targets :beam-sdks-java-io-cassandra:test
:beam-sdks-java-io-hadoop-format:test are failing on master and in the 2.12.0
release with a JVM crash.

Gradle Scan: https://gradle.com/s/rhseoqeouup6e

Any help debugging the failure would be useful.

Thanks,
Ankur


Re: [PROPOSAL] Preparing for Beam 2.13.0 release

2019-05-09 Thread Ankur Goenka
Branch for 2.13 is cut at
revision number a0953215e08d756eb64c23c1916d483ba6e73945
Branch location: https://github.com/apache/beam/tree/release-2.13.0

Thanks,
Ankur

*From: *Ankur Goenka 
*Date: *Mon, May 6, 2019 at 11:27 AM
*To: *dev

Gentle reminder.
>
> We will be cutting the 2.13.0 branch day after tomorrow ( May 8th ).
> Please mark the release blockers as fixed and Fix Version as 2.13.0.
>
> Thanks,
> Ankur
>
> On Fri, Apr 26, 2019 at 2:28 PM Ankur Goenka  wrote:
>
>> Link in the download link page will be useful.
>>
>> Additionally, to notify user about the next release, shall we add the
>> expected date of the next cut/release to sdk binary so that it's printed on
>> console once every day past the cut date?
>>
>> Something like,
>> Print "Please check new version of Beam" after June 19th in Beam 2.13.0
>> Print "Please check new version of Beam" after July 31st in Beam 2.14.0
>>
>> On Fri, Apr 26, 2019 at 1:13 PM Ismaël Mejía  wrote:
>>
>>> Ah that works, thanks Anton, quite hard to see. Thanks!
>>>
>>> Kenneth maybe for awareness a link in the downloads page will be more
>>> 'visible'.
>>>
>>> On Fri, Apr 26, 2019 at 9:32 PM Kenneth Knowles  wrote:
>>>
>>>> By the way, that link is referenced by
>>>> https://beam.apache.org/community/policies/
>>>>
>>>> Is there a better way to surface the calendar?
>>>>
>>>> Kenn
>>>>
>>>> On Fri, Apr 26, 2019 at 12:23 PM Anton Kedin  wrote:
>>>>
>>>>> Following Ankur's link I see a "[+]GoogleCalendar" button in the
>>>>> bottom right corner of the page. Clicking it opens the google calendar and
>>>>> prompts to add the Beam Calendar (at least in Chrome). Ismael, do you have
>>>>> a similar button in your case?
>>>>>
>>>>> [image: image.png]
>>>>>
>>>>> Regards,
>>>>> Anton
>>>>>
>>>>>
>>>>> On Fri, Apr 26, 2019 at 5:07 AM Ismaël Mejía 
>>>>> wrote:
>>>>>
>>>>>> Ankur, do you have the equivalent link that I can use to subscribe to
>>>>>> that calendar via google calendars?
>>>>>> The link seems to work only to see the calendar in a webpage.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> On Fri, Apr 26, 2019 at 1:42 PM Maximilian Michels 
>>>>>> wrote:
>>>>>> >
>>>>>> > Hi Ankur,
>>>>>> >
>>>>>> > Sounds good. This will ensure that we stay on track regarding the
>>>>>> > release cycle.
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Max
>>>>>> >
>>>>>> > On 26.04.19 02:59, Ankur Goenka wrote:
>>>>>> > > Correction, The planned cut date is May 8th.
>>>>>> > >
>>>>>> > > On Thu, Apr 25, 2019 at 4:24 PM Ankur Goenka <goe...@google.com> wrote:
>>>>>> > >
>>>>>> > > Hello Beam community!
>>>>>> > >
>>>>>> > > Beam 2.13 release branch cut date is April 8th according to
>>>>>> the
>>>>>> > > release calendar [1]. I would like to volunteer myself to do
>>>>>> this
>>>>>> > > release. I intend to cut the branch as planned on April 8th
>>>>>> and
>>>>>> > > cherrypick fixes if needed.
>>>>>> > >
>>>>>> > > If you have releasing blocking issues for 2.13 please mark
>>>>>> their
>>>>>> > > "Fix Version" as 2.13.0. Please use 2.14.0 release in JIRA in
>>>>>> case
>>>>>> > > you would like to move any non-blocking issues to that
>>>>>> version.
>>>>>> > >
>>>>>> > > Does this sound reasonable?
>>>>>> > >
>>>>>> > > Thanks,
>>>>>> > > Ankur
>>>>>> > >
>>>>>> > > [1]
>>>>>> > >
>>>>>> https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com=America%2FLos_Angeles
>>>>>> > >
>>>>>>
>>>>>


Re: Better naming for runner specific options

2019-05-06 Thread Ankur Goenka
Having namespaces for options makes sense.
I think a help command that prints all the options for a given runner
name would also be useful.
As for the scope of namespacing, I think that assigning a logical
namespace gives more flexibility around how and where we declare options.
It also makes future refactoring possible.
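
To make the idea concrete, here is a rough Python sketch of a logical
namespace for options, including the per-runner help listing. All names in it
(OptionRegistry, declare, resolve, help_for) are hypothetical and are not Beam
APIs; the "A#optionX" syntax follows Max's suggestion quoted below.

```python
class OptionRegistry:
    """Toy registry: option name -> set of namespaces that declare it."""

    def __init__(self):
        self._owners = {}

    def declare(self, namespace, name):
        self._owners.setdefault(name, set()).add(namespace)

    def resolve(self, key):
        """Return (namespace, name) for 'name' or 'namespace#name'."""
        if "#" in key:
            namespace, name = key.split("#", 1)
            if namespace not in self._owners.get(name, set()):
                raise KeyError("unknown option: %s" % key)
            return (namespace, name)
        owners = self._owners.get(key)
        if not owners:
            raise KeyError("unknown option: %s" % key)
        if len(owners) > 1:
            # Duplicate names must be qualified to be resolved.
            raise ValueError(
                "ambiguous option %s, declared in: %s"
                % (key, ", ".join(sorted(owners))))
        return (next(iter(owners)), key)

    def help_for(self, namespace):
        """List every option declared under one namespace (e.g. a runner)."""
        return sorted(n for n, owners in self._owners.items()
                      if namespace in owners)


if __name__ == "__main__":
    reg = OptionRegistry()
    reg.declare("A", "optionX")
    reg.declare("B", "optionX")
    reg.declare("A", "optionY")
    print(reg.resolve("A#optionX"))
    print(reg.help_for("A"))
```

Unqualified names keep working as long as they are unique, so existing
pipelines would not need the prefix unless two classes collide.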


On Mon, May 6, 2019 at 7:50 AM Maximilian Michels  wrote:

> Good points. As already mentioned there is no namespacing between the
> different pipeline option classes. In particular, there is no separate
> namespace for system and user options which is most concerning.
>
> I'm in favor of an optional namespace using the class name of the
> defining pipeline option class. That way we would at least be able to
> resolve duplicate option names. For example, if there were was "optionX"
> in class A and B, we could use "A#optionX" to refer to it from class A.
>
> -Max
>
> On 04.05.19 02:23, Reza Rokni wrote:
> > Great point Lukasz, worker machine could be relevant to multiple runners.
> >
> > Perhaps for parameters that could have multiple runner relevance, the
> > doc could be rephrased to reflect its potential multiple uses. For
> > example change the help information to start with a generic reference "
> > worker type on the runner" followed by runner specific behavior expected
> > for RunnerA, RunnerB etc...
> >
> > But I do worry that without prefix even generic options could cause
> > confusion. For example if the use of --network is substantially
> > different between runnerA vs runnerB then the user will only have this
> > information by reading the help. It will also mean that a pipeline which
> > is expected to work both on-premise on RunnerA and in the cloud on
> > RunnerB could fail because the format of the options to pass to
> > --network are different.
> >
> > Cheers
> >
> > Reza
> >
> > *From: *Kenneth Knowles <k...@apache.org>
> > *Date: *Sat, 4 May 2019 at 03:54
> > *To: *dev
> >
> > Even though they are in classes named for specific runners, they are
> > not namespaced. All PipelineOptions exist in a global namespace so
> > they need to be careful to be very precise.
> >
> > It is a good point that even though they may be multiple uses for
> > "machine type" they are probably not going to both happen at the
> > same time.
> >
> > If it becomes an issue, another thing we could do would be to add
> > namespacing support so options have less spooky action, or at least
> > have a way to resolve it when it happens on accident.
> >
> > Kenn
> >
> > On Fri, May 3, 2019 at 10:43 AM Chamikara Jayalath
> > <chamik...@google.com> wrote:
> >
> > Also, we do have runner specific options classes where truly
> > runner specific options can go.
> >
> >
> https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineOptions.java
> >
> https://github.com/apache/beam/blob/master/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkPipelineOptions.java
> >
> > On Fri, May 3, 2019 at 9:50 AM Ahmet Altay wrote:
> >
> > I agree, that is a good point.
> >
> > *From: *Lukasz Cwik <lc...@google.com>
> > *Date: *Fri, May 3, 2019 at 9:37 AM
> > *To: *dev
> >
> > The concept of a machine type isn't necessarily limited
> > to Dataflow. If it made sense for a runner, they could
> > use AWS/Azure machine types as well.
> >
> > On Fri, May 3, 2019 at 9:32 AM Ahmet Altay
> > <al...@google.com> wrote:
> >
> > This idea was discussed in a PR a few months ago,
> > and JIRA was filed as a follow up [1]. IMO, it makes
> > sense to use a namespace prefix. The primary issue
> > here is that, such a change will very likely be a
> > backward incompatible change and would be hard to do
> > before the next major version.
> >
> > [1] https://issues.apache.org/jira/browse/BEAM-6531
> >
> > *From: *Reza Rokni  > >
> > *Date: *Thu, May 2, 2019 at 8:00 PM
> > *To: *  > >
> >
> > Hi,
> >
> > Was reading this SO question:
> >
> >
> https://stackoverflow.com/questions/53833171/googlecloudoptions-doesnt-have-all-options-that-pipeline-options-has
> >
> > And noticed that in
> >
> >
> https://beam.apache.org/releases/pydoc/2.12.0/_modules/apache_beam/options/pipeline_options.html#WorkerOptions
> >
> > The option is called --worker_machine_type.
> >
> > I wonder if runner 

Re: [PROPOSAL] Preparing for Beam 2.13.0 release

2019-05-06 Thread Ankur Goenka
Gentle reminder.

We will be cutting the 2.13.0 branch the day after tomorrow (May 8th).
Please mark the release blockers as fixed and set their Fix Version to 2.13.0.

Thanks,
Ankur

On Fri, Apr 26, 2019 at 2:28 PM Ankur Goenka  wrote:

> Link in the download link page will be useful.
>
> Additionally, to notify user about the next release, shall we add the
> expected date of the next cut/release to sdk binary so that it's printed on
> console once every day past the cut date?
>
> Something like,
> Print "Please check new version of Beam" after June 19th in Beam 2.13.0
> Print "Please check new version of Beam" after July 31st in Beam 2.14.0
>
> On Fri, Apr 26, 2019 at 1:13 PM Ismaël Mejía  wrote:
>
>> Ah that works, thanks Anton, quite hard to see. Thanks!
>>
>> Kenneth maybe for awareness a link in the downloads page will be more
>> 'visible'.
>>
>> On Fri, Apr 26, 2019 at 9:32 PM Kenneth Knowles  wrote:
>>
>>> By the way, that link is referenced by
>>> https://beam.apache.org/community/policies/
>>>
>>> Is there a better way to surface the calendar?
>>>
>>> Kenn
>>>
>>> On Fri, Apr 26, 2019 at 12:23 PM Anton Kedin  wrote:
>>>
>>>> Following Ankur's link I see a "[+]GoogleCalendar" button in the bottom
>>>> right corner of the page. Clicking it opens the google calendar and prompts
>>>> to add the Beam Calendar (at least in Chrome). Ismael, do you have a
>>>> similar button in your case?
>>>>
>>>> [image: image.png]
>>>>
>>>> Regards,
>>>> Anton
>>>>
>>>>
>>>> On Fri, Apr 26, 2019 at 5:07 AM Ismaël Mejía  wrote:
>>>>
>>>>> Ankur, do you have the equivalent link that I can use to subscribe to
>>>>> that calendar via google calendars?
>>>>> The link seems to work only to see the calendar in a webpage.
>>>>>
>>>>> Thanks.
>>>>>
>>>>> On Fri, Apr 26, 2019 at 1:42 PM Maximilian Michels 
>>>>> wrote:
>>>>> >
>>>>> > Hi Ankur,
>>>>> >
>>>>> > Sounds good. This will ensure that we stay on track regarding the
>>>>> > release cycle.
>>>>> >
>>>>> > Thanks,
>>>>> > Max
>>>>> >
>>>>> > On 26.04.19 02:59, Ankur Goenka wrote:
>>>>> > > Correction, The planned cut date is May 8th.
>>>>> > >
>>>>> > > On Thu, Apr 25, 2019 at 4:24 PM Ankur Goenka <goe...@google.com> wrote:
>>>>> > >
>>>>> > > Hello Beam community!
>>>>> > >
>>>>> > > Beam 2.13 release branch cut date is April 8th according to the
>>>>> > > release calendar [1]. I would like to volunteer myself to do
>>>>> this
>>>>> > > release. I intend to cut the branch as planned on April 8th and
>>>>> > > cherrypick fixes if needed.
>>>>> > >
>>>>> > > If you have releasing blocking issues for 2.13 please mark
>>>>> their
>>>>> > > "Fix Version" as 2.13.0. Please use 2.14.0 release in JIRA in
>>>>> case
>>>>> > > you would like to move any non-blocking issues to that version.
>>>>> > >
>>>>> > > Does this sound reasonable?
>>>>> > >
>>>>> > > Thanks,
>>>>> > > Ankur
>>>>> > >
>>>>> > > [1]
>>>>> > >
>>>>> https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com=America%2FLos_Angeles
>>>>> > >
>>>>>
>>>>


Kotlin iterator error

2019-05-03 Thread Ankur Goenka
Hi,

A Beam user on Stack Overflow has posted an issue encountered while using Kotlin.
https://stackoverflow.com/questions/55908999/kotlin-iterable-not-supported-in-apache-beam/55911859#55911859
I am not very familiar with Kotlin, so could someone please take a look?

Thanks,
Ankur


Re: [ANNOUNCE] New committer announcement: Udi Meiri

2019-05-03 Thread Ankur Goenka
Congratulations Udi!

On Fri, May 3, 2019 at 3:00 PM Connell O'Callaghan 
wrote:

> Well done Udi!!! Congratulations and thank you for your contributions!!!
>
> Kenn thank you for sharing!!!
>
> On Fri, May 3, 2019 at 2:49 PM Yifan Zou  wrote:
>
>> Thanks Udi and congratulations!
>>
>> On Fri, May 3, 2019 at 2:47 PM Robin Qiu  wrote:
>>
>>> Congratulations Udi!!!
>>>
>>> *From: *Ruoyun Huang 
>>> *Date: *Fri, May 3, 2019 at 2:39 PM
>>> *To: * 
>>>
>>> Congratulations Udi!

 On Fri, May 3, 2019 at 2:30 PM Ahmet Altay  wrote:

> Congratulations, Udi!
>
> *From: *Kyle Weaver 
> *Date: *Fri, May 3, 2019 at 2:11 PM
> *To: * 
>
> Congratulations Udi! I look forward to sending you all my reviews for
>> the next month (just kidding :)
>>
>> Kyle Weaver | Software Engineer | github.com/ibzib |
>> kcwea...@google.com | +1650203
>>
>> On Fri, May 3, 2019 at 1:52 PM Charles Chen  wrote:
>> >
>> > Thank you Udi!
>> >
>> > On Fri, May 3, 2019, 1:51 PM Aizhamal Nurmamat kyzy <
>> aizha...@google.com> wrote:
>> >>
>> >> Congratulations, Udi! Thank you for all your contributions!!!
>> >>
>> >> From: Pablo Estrada 
>> >> Date: Fri, May 3, 2019 at 1:45 PM
>> >> To: dev
>> >>
>> >>> Thanks Udi and congrats!
>> >>>
>> >>> On Fri, May 3, 2019 at 1:44 PM Kenneth Knowles 
>> wrote:
>> 
>>  Hi all,
>> 
>>  Please join me and the rest of the Beam PMC in welcoming a new
>> committer: Udi Meiri.
>> 
>>  Udi has been contributing to Beam since late 2017, starting with
>> HDFS support in the Python SDK and continuing with a ton of Python work. I
>> also will highlight his work on community-building infrastructure,
>> including documentation, experiments with ways to find reviewers for pull
>> requests, gradle build work, analyzing and reducing build times.
>> 
>>  In consideration of Udi's contributions, the Beam PMC trusts Udi
>> with the responsibilities of a Beam committer [1].
>> 
>>  Thank you, Udi, for your contributions.
>> 
>>  Kenn
>> 
>>  [1]
>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>
>

 --
 
 Ruoyun  Huang




Re: Congrats to Beam's first 6 Google Open Source Peer Bonus recipients!

2019-05-02 Thread Ankur Goenka
Congratulations and thank you for making Beam awesome!

*From: *Chamikara Jayalath 
*Date: *Thu, May 2, 2019, 4:03 PM
*To: *dev

Congratulations!
>
> On Thu, May 2, 2019 at 10:28 AM Udi Meiri  wrote:
>
>> Congrats everyone!
>>
>> On Thu, May 2, 2019 at 9:55 AM Ahmet Altay  wrote:
>>
>>> Congratulations!
>>>
>>> On Thu, May 2, 2019 at 9:54 AM Yifan Zou  wrote:
>>>
 Congratulations! Well deserved!

 On Thu, May 2, 2019 at 9:37 AM Rui Wang  wrote:

> Congratulations!
>
>
> -Rui
>
> On Thu, May 2, 2019 at 8:23 AM Michael Luckey 
> wrote:
>
>> Congrats! Well deserved!
>>
>> On Thu, May 2, 2019 at 3:29 PM Alexey Romanenko <
>> aromanenko@gmail.com> wrote:
>>
>>> Congrats!
>>>
>>> On 2 May 2019, at 10:06, Gleb Kanterov  wrote:
>>>
>>> Congratulations! Well deserved!
>>>
>>> On Thu, May 2, 2019 at 10:00 AM Ismaël Mejía 
>>> wrote:
>>>
 Congrats everyone !

 On Thu, May 2, 2019 at 9:14 AM Robert Bradshaw 
 wrote:

> Congratulation, and thanks for all the great contributions each
> one of you has made to Beam!
>
> On Thu, May 2, 2019 at 5:51 AM Ruoyun Huang 
> wrote:
>
>> Congratulations everyone!  Well deserved!
>>
>> On Wed, May 1, 2019 at 8:38 PM Kenneth Knowles 
>> wrote:
>>
>>> Congrats! All well deserved!
>>>
>>> Kenn
>>>
>>> On Wed, May 1, 2019 at 8:09 PM Reza Rokni 
>>> wrote:
>>>
 Congratulations!

 On Thu, 2 May 2019 at 10:53, Connell O'Callaghan <
 conne...@google.com> wrote:

> Well done - congratulations to you all!!! Rose thank you for
> sharing this news!!!
>
> On Wed, May 1, 2019 at 19:45 Rose Nguyen 
> wrote:
>
>> Matthias Baetens, Lukazs Gajowy, Suneel Marthi, Maximilian
>> Michels, Alex Van Boxel, and Thomas Weise:
>>
>> Thank you for your exceptional contributions to Apache
>> Beam. I'm looking forward to seeing this project grow and for more folks
>> to contribute and be recognized! Everyone can read more about this award
>> on the Google Open Source blog:
>> https://opensource.googleblog.com/2019/04/google-open-source-peer-bonus-winners.html
>>
>> Cheers,
>> --
>> Rose Thị Nguyễn
>>
>


>>>
>>
>> --
>> 
>> Ruoyun  Huang
>>
>>
>>>
>>> --
>>> Cheers,
>>> Gleb
>>>
>>>
>>>


Re: Structured streaming based spark runner.

2019-04-30 Thread Ankur Goenka
Exciting! Thanks Etienne for sharing the design and progress.

On Tue, Apr 30, 2019 at 10:11 AM Etienne Chauchot 
wrote:

> Hi guys,
> As part of the ongoing work on spark runner POC based on structured
> streaming framework, I sketched up a design doc (1) to share context and
> design principles.
> Feel free to comment.
>
> [1]  
> https://s.apache.org/spark-structured-streaming-runner
>
> Etienne
>


Re: Enable security for data channels in portability

2019-04-26 Thread Ankur Goenka
In an offline chat with Hai, it seemed useful for users to be able to provide
custom authentication, such as a secret that is distributed out of band by
the infrastructure and made available via the file system, an RPC to another
service, etc.
gRPC already has some mechanisms for standard and custom authentication [1].
Instrumenting the gRPC channel using a command-line option or an environment
variable on the worker machines could be useful.

[1] https://grpc.io/docs/guides/auth/
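
As a rough sketch of the worker-side half of this, the snippet below picks up
a secret from an environment variable or from a file named by an environment
variable, and turns it into request metadata. The variable names
BEAM_WORKER_SECRET and BEAM_WORKER_SECRET_FILE, the function names, and the
bearer-token header are all assumptions made for this example; nothing like
this exists in Beam today, and how the metadata would be attached to the gRPC
channel is deliberately left out.

```python
import os


def load_worker_secret(environ=None,
                       secret_env="BEAM_WORKER_SECRET",
                       file_env="BEAM_WORKER_SECRET_FILE"):
    """Return the secret from the environment or a file, or None if unset.

    Both environment variable names are hypothetical.
    """
    if environ is None:
        environ = os.environ
    secret = environ.get(secret_env)
    if secret:
        return secret.strip()
    path = environ.get(file_env)
    if path:
        # The infrastructure is assumed to have placed the file out of band.
        with open(path, encoding="utf-8") as f:
            return f.read().strip()
    return None


def auth_metadata(secret):
    """Build (key, value) metadata pairs carrying the secret."""
    return (("authorization", "Bearer " + secret),)


if __name__ == "__main__":
    secret = load_worker_secret({"BEAM_WORKER_SECRET": "example"})
    print(auth_metadata(secret))
```

Reading from a file as a fallback keeps the secret out of the process
environment when the infrastructure prefers mounting it.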

On Fri, Apr 26, 2019 at 4:33 PM Lukasz Cwik  wrote:

> The link to the ApiServiceDescriptor is
> https://github.com/apache/beam/blob/476e17ed6badd4d5c06c4caf8a824805f40a8e7a/model/pipeline/src/main/proto/endpoints.proto#L31
>
> On Fri, Apr 26, 2019 at 4:32 PM Lukasz Cwik  wrote:
>
>> I had originally taken a look at this a while ago but not much has
>> progressed since then. The original idea was that the ApiServiceDescriptor
>> would be extended to support secure ways of authentication/communication. I
>> was prototyping with an OAuth2 client credentials grant at the time but
>> dropped it as other things were more important. The only currently
>> supported mode across all SDKs is an implicit authenticated/secure mode
>> where all communication is assumed to already be encrypted/private (e.g.
>> over VPN that is managed externally with trusted services) and hence the
>> gRPC channel itself is insecure and there is no authentication being
>> performed.
>>
>> Even though sdk_worker.py seems like it supports credentials, no one
>> invokes the constructor with credentials enabled as can be seen by this
>> comment by Robert[1].
>>
>> For SSL/TLS support it seems like we need some way to configure a runner
>> to be told to use SSL/TLS (potentially with a custom private key and trust
>> chain). Do you have some suggestions on how we add support for passing
>> around channel/call[2] credentials?
>>
>> 1:
>> https://github.com/apache/beam/blob/476e17ed6badd4d5c06c4caf8a824805f40a8e7a/sdks/python/apache_beam/runners/worker/sdk_worker_main.py#L139
>> 2: https://grpc.io/docs/guides/auth/
>>
>> On Tue, Apr 23, 2019 at 5:06 PM Hai Lu  wrote:
>>
>>> Hi,
>>>
>>> This is Hai from LinkedIn. Daniel and I have been working on
>>> productionizing Samza portable runner. BTW, Daniel didn't mention in his
>>> previous email that he has enabled and validated Python 3 for Samza runner
>>> and it worked smoothly. Kudos to the team!
>>>
>>> Here I have a few security related questions about portability. At
>>> LinkedIn, we enable SSL/TLS and ACLs for Kafka data and any data exchange.
>>> In the case of portable runner, we're required to secure the data channels
>>> between Java and Python processes as well because our Samza jobs are
>>> running in a multi-tenant environment. While I'm currently working on this
>>> on our internal branch, I do want to keep it clean and consistent with the
>>> master branch.
>>>
>>> My questions are: were there any plans/thoughts around security for
>>> portability? I see that sdk_worker.py does have some code to create
>>> secured gRPC channels; is anyone actually leveraging that code? I don't
>>> see any work done on the Java side, though.
>>>
>>> Thanks,
>>> Hai Lu
>>>
>>


Re: [PROPOSAL] Preparing for Beam 2.13.0 release

2019-04-26 Thread Ankur Goenka
A link on the downloads page would be useful.

Additionally, to notify users about the next release, shall we add the
expected date of the next cut/release to the SDK binary so that it's printed on
the console once every day past the cut date?

Something like,
Print "Please check new version of Beam" after June 19th in Beam 2.13.0
Print "Please check new version of Beam" after July 31st in Beam 2.14.0
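
A minimal sketch of that reminder, assuming the next cut date is baked into
each release and the SDK remembers the last day it printed the message. The
function name, the NEXT_CUT table and the year 2019 are assumptions made for
this example; nothing like this exists in Beam today.

```python
from datetime import date
from typing import Optional


def upgrade_notice(today: date, next_cut: date,
                   last_shown: Optional[date]) -> Optional[str]:
    """Return the reminder text at most once per day after the next cut date."""
    if today <= next_cut:
        return None  # the next release has not been cut yet
    if last_shown == today:
        return None  # already printed today
    return "Please check new version of Beam"


# Cut dates taken from the example above; the year is assumed to be 2019.
NEXT_CUT = {"2.13.0": date(2019, 6, 19), "2.14.0": date(2019, 7, 31)}
```

The caller would store the date of the last printed notice (for instance in a
small file in the user's home directory) and pass it back in as last_shown.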

On Fri, Apr 26, 2019 at 1:13 PM Ismaël Mejía  wrote:

> Ah that works, thanks Anton, quite hard to see. Thanks!
>
> Kenneth maybe for awareness a link in the downloads page will be more
> 'visible'.
>
> On Fri, Apr 26, 2019 at 9:32 PM Kenneth Knowles  wrote:
>
>> By the way, that link is referenced by
>> https://beam.apache.org/community/policies/
>>
>> Is there a better way to surface the calendar?
>>
>> Kenn
>>
>> On Fri, Apr 26, 2019 at 12:23 PM Anton Kedin  wrote:
>>
>>> Following Ankur's link I see a "[+]GoogleCalendar" button in the bottom
>>> right corner of the page. Clicking it opens the google calendar and prompts
>>> to add the Beam Calendar (at least in Chrome). Ismael, do you have a
>>> similar button in your case?
>>>
>>> [image: image.png]
>>>
>>> Regards,
>>> Anton
>>>
>>>
>>> On Fri, Apr 26, 2019 at 5:07 AM Ismaël Mejía  wrote:
>>>
>>>> Ankur, do you have the equivalent link that I can use to subscribe to
>>>> that calendar via google calendars?
>>>> The link seems to work only to see the calendar in a webpage.
>>>>
>>>> Thanks.
>>>>
>>>> On Fri, Apr 26, 2019 at 1:42 PM Maximilian Michels 
>>>> wrote:
>>>> >
>>>> > Hi Ankur,
>>>> >
>>>> > Sounds good. This will ensure that we stay on track regarding the
>>>> > release cycle.
>>>> >
>>>> > Thanks,
>>>> > Max
>>>> >
>>>> > On 26.04.19 02:59, Ankur Goenka wrote:
>>>> > > Correction, The planned cut date is May 8th.
>>>> > >
>>>> > > On Thu, Apr 25, 2019 at 4:24 PM Ankur Goenka wrote:
>>>> > >
>>>> > > Hello Beam community!
>>>> > >
>>>> > > Beam 2.13 release branch cut date is April 8th according to the
>>>> > > release calendar [1]. I would like to volunteer myself to do
>>>> this
>>>> > > release. I intend to cut the branch as planned on April 8th and
>>>> > > cherrypick fixes if needed.
>>>> > >
>>>> > > If you have releasing blocking issues for 2.13 please mark their
>>>> > > "Fix Version" as 2.13.0. Please use 2.14.0 release in JIRA in
>>>> case
>>>> > > you would like to move any non-blocking issues to that version.
>>>> > >
>>>> > > Does this sound reasonable?
>>>> > >
>>>> > > Thanks,
>>>> > > Ankur
>>>> > >
>>>> > > [1]
>>>> > >
>>>> https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com=America%2FLos_Angeles
>>>> > >
>>>>
>>>


Re: [PROPOSAL] Preparing for Beam 2.13.0 release

2019-04-25 Thread Ankur Goenka
Correction, The planned cut date is May 8th.

On Thu, Apr 25, 2019 at 4:24 PM Ankur Goenka  wrote:

> Hello Beam community!
>
> Beam 2.13 release branch cut date is April 8th according to the release
> calendar [1]. I would like to volunteer myself to do this release. I intend
> to cut the branch as planned on April 8th and cherrypick fixes if needed.
>
> If you have releasing blocking issues for 2.13 please mark their "Fix
> Version" as 2.13.0. Please use 2.14.0 release in JIRA in case you would
> like to move any non-blocking issues to that version.
>
> Does this sound reasonable?
>
> Thanks,
> Ankur
>
> [1]
> https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com=America%2FLos_Angeles
>


[PROPOSAL] Preparing for Beam 2.13.0 release

2019-04-25 Thread Ankur Goenka
Hello Beam community!

Beam 2.13 release branch cut date is April 8th according to the release
calendar [1]. I would like to volunteer myself to do this release. I intend
to cut the branch as planned on April 8th and cherrypick fixes if needed.

If you have releasing blocking issues for 2.13 please mark their "Fix
Version" as 2.13.0. Please use 2.14.0 release in JIRA in case you would
like to move any non-blocking issues to that version.

Does this sound reasonable?

Thanks,
Ankur

[1]
https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com=America%2FLos_Angeles


Re: Integration of python/portable runner tests for Samza runner

2019-04-22 Thread Ankur Goenka
Hi Daniel,

We use flinkCompatibilityMatrix [1] to check Flink compatibility with
Python. This is the Python equivalent of the validatesRunner tests in Java for
portable runners.
I think we can reuse it for the Samza portable runner with minor refactoring.

[1]
https://github.com/apache/beam/blob/bdb1a713a120a887e71e85c77879dc4446a58541/sdks/python/build.gradle#L305

On Mon, Apr 22, 2019 at 3:21 PM Daniel Chen  wrote:

> Hi everyone,
>
> I'm working on improving the validation of the Python portable Samza
> runner. For java, we have the gradle task ( :validatesRunner) that runs the
> runner validation tests.
> I am looking for pointers on how to similarly integrate/enable the
> portability and Python tests for the Samza runner.
>
> Any help will be greatly appreciated.
>
> Thanks,
> Daniel
>


Re: [ANNOUNCE] New committer announcement: Yifan Zou

2019-04-22 Thread Ankur Goenka
Congratulations Yifan!

On Mon, Apr 22, 2019 at 11:42 AM Thomas Weise  wrote:

> Congrats Yifan!
>
>
> On Mon, Apr 22, 2019 at 10:02 AM Maximilian Michels 
> wrote:
>
>> Congrats! Great work.
>>
>> -Max
>>
>> On 22.04.19 19:00, Rui Wang wrote:
>> > Congratulations! Thanks for your contribution!!
>> >
>> > -Rui
>> >
>> > On Mon, Apr 22, 2019 at 9:57 AM Ruoyun Huang wrote:
>> >
>> > Congratulations, Yifan!
>> >
>> > On Mon, Apr 22, 2019 at 9:48 AM Boyuan Zhang wrote:
>> >
>> > Congratulations, Yifan~
>> >
>> > On Mon, Apr 22, 2019 at 9:29 AM Connell O'Callaghan
>> > <conne...@google.com> wrote:
>> >
>> > Well done Yifan!!!
>> >
>> > Thank you for sharing Kenn!!!
>> >
>> > On Mon, Apr 22, 2019 at 9:00 AM Ahmet Altay
>> > <al...@google.com> wrote:
>> >
>> > Congratulations, Yifan!
>> >
>> > On Mon, Apr 22, 2019 at 8:46 AM Tim Robertson wrote:
>> >
>> > Congratulations Yifan!
>> >
>> > On Mon, Apr 22, 2019 at 5:39 PM Cyrus Maden
>> > <cma...@google.com> wrote:
>> >
>> > Congratulations Yifan!!
>> >
>> > On Mon, Apr 22, 2019 at 11:26 AM Kenneth Knowles
>> > <k...@apache.org> wrote:
>> >
>> > Hi all,
>> >
>> > Please join me and the rest of the Beam PMC
>> > in welcoming a new committer: Yifan Zou.
>> >
>> > Yifan has been contributing to Beam since
>> > early 2018. He has proposed 70+ pull
>> > requests, adding dependency checking and
>> > improving test infrastructure. But something
>> > the numbers cannot show adequately is the
>> > huge effort Yifan has put into working with
>> > infra and keeping our Jenkins executors healthy.
>> >
>> > In consideration of Yifan's contributions,
>> > the Beam PMC trusts Yifan with the
>> > responsibilities of a Beam committer [1].
>> >
>> > Thank you, Yifan, for your contributions.
>> >
>> > Kenn
>> >
>> > [1]
>> > https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>> >
>> >
>> >
>> > --
>> > 
>> > Ruoyun  Huang
>> >
>>
>


Re: Artifact staging in cross-language pipelines

2019-04-22 Thread Ankur Goenka
 about credentials, the
> same
> >  > access concern will exist for several cross language
> >  > transforms (mostly IOs) since some will need access to
> >  > credentials to read/write to an external service.
> >  >
> >  > Are there any ideas on how credential propagation
> > could work
> >  > to these IOs?
> >  >
> >  >
> >  > There are some cases where existing IO transforms need
> >  > credentials to access remote resources, for example, size
> >  > estimation, validation, etc. But usually these are
> > optional (or
> >  > transform can be configured to not perform these
> functions).
> >  >
> >  >
> >  > To clarify, I'm only talking about transform expansion here.
> > Many IO
> >  > transforms need read/write access to remote services at run
> > time. So
> >  > probably we need to figure out a way to propagate these
> > credentials
> >  > anyways.
> >  >
> >  > Can we use these mechanisms for staging?
> >  >
> >  >
> >  > I think we'll have to find a way to do one of (1)
> propagate
> >  > credentials to other SDKs (2) allow users to configure SDK
> >  > containers to have necessary credentials (3) do the
> artifact
> >  > staging from the pipeline SDK environment which already
> have
> >  > credentials. I prefer (1) or (2) since this will give a
> >  > transform same feature set whether used directly (in the
> same
> >  > SDK language as the transform) or remotely but it might
> > be hard
> >  > to do this for an arbitrary service that a transform might
> >  > connect to considering the number of ways users can
> configure
> >  > credentials (after an offline discussion with Ankur).
> >  >
> >  >
> >  > On Thu, Apr 18, 2019 at 3:47 PM Ankur Goenka
> >  > mailto:goe...@google.com>
> > <mailto:goe...@google.com <mailto:goe...@google.com>>> wrote:
> >  >
> >  > I agree that the Expansion service knows about the
> >  > artifacts required for a cross language transform
> and
> >  > having a prepackage folder/Zip for transforms
> > based on
> >  > language makes sense.
> >  >
> >  > One thing to note here is that expansion service
> > might
> >  > not have the same access privilege as the pipeline
> >  > author and hence might not be able to stage
> > artifacts by
> >  > itself.
> >  > Keeping this in mind I am leaning towards making
> >  > Expansion service provide all the required
> > artifacts to
> >  > the user and let the user stage the artifacts as
> > regular
> >  > artifacts.
> >  > At this time, we only have Beam File System based
> >  > artifact staging which uses local credentials to
> > access
> >  > different file systems. Even a docker based
> expansion
> >  > service running on local machine might not have
> > the same
> >  > access privileges.
> >  >
> >  > In brief this is what I am leaning toward.
> >  > User call for pipeline submission -> Expansion
> > service
> >  > provide cross language transforms and relevant
> > artifacts
> >  > to the Sdk -> Sdk Submits the pipeline to
> > Jobserver and
> >  > Stages user and cross language artifacts to
> artifacts
> >  > staging service
> >  >
> >  >
> >  > On Thu, Apr 18, 2019 at 2:33 PM Chamikara Jayalath
> >  >  > <mailto:chamik...@google.com> <mailto:chamik...@google.com
> > <mailto:chamik...@google.com>>> wrote:
> >  >
>

Re: New IOIT Dashboards

2019-04-19 Thread Ankur Goenka
This looks great!
Which runner are we using for the pipeline?

On Fri, Apr 19, 2019 at 12:03 PM Kenneth Knowles  wrote:

> Very cool! I assume times are all in seconds?
>
> On Fri, Apr 19, 2019 at 6:26 AM Łukasz Gajowy  wrote:
>
>> Hi,
>>
>> just wanted to announce that we improved the way we collect metrics from
>> IOIT. Now we use Metrics API for this which allowed us to get more insights
>> and collect run/read/write time (and possibly other metrics in the future)
>> separately.
>>
>> The new dashboards are available here:
>> https://s.apache.org/io-test-dashboards
>> I also updated the docs in this PR:
>> https://github.com/apache/beam/pull/8356
>>
>> Thanks,
>> Łukasz
>>
>>
>>


Re: Artifact staging in cross-language pipelines

2019-04-18 Thread Ankur Goenka
I agree that the Expansion service knows about the artifacts required for a
cross language transform and having a prepackaged folder/Zip for transforms
based on language makes sense.

One thing to note here is that the expansion service might not have the same
access privileges as the pipeline author and hence might not be able to
stage artifacts by itself.
Keeping this in mind, I am leaning towards making the expansion service provide
all the required artifacts to the user and letting the user stage the artifacts
as regular artifacts.
At this time, we only have Beam File System based artifact staging, which
uses local credentials to access different file systems. Even a docker
based expansion service running on the local machine might not have the same
access privileges.

In brief, this is what I am leaning toward:
User calls for pipeline submission -> Expansion service provides cross
language transforms and relevant artifacts to the SDK -> SDK submits the
pipeline to the job server and stages user and cross language artifacts to
the artifact staging service
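
The flow above can be sketched roughly as follows. All class and function
names here (ExpansionService, ArtifactStagingService, submit) and the jar
names are stand-ins invented for this illustration, not the actual Beam
portability API; the point is only that the expansion service reports
artifacts while the SDK, running with the user's credentials, does the
staging. Submitting the pipeline to the job server is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ExpansionResult:
    transform: str
    artifacts: List[str]  # files the expanded transform needs at runtime


class ExpansionService:
    """Knows which artifacts each cross-language transform needs; stages nothing."""

    def __init__(self, catalog: Dict[str, List[str]]):
        self._catalog = catalog

    def expand(self, transform: str) -> ExpansionResult:
        return ExpansionResult(transform, list(self._catalog[transform]))


@dataclass
class ArtifactStagingService:
    staged: List[str] = field(default_factory=list)

    def stage(self, path: str) -> None:
        if path not in self.staged:  # skip artifacts that are already staged
            self.staged.append(path)


def submit(user_artifacts, external_transforms, expansion, staging):
    """SDK side: expand, then stage everything with the user's own credentials."""
    to_stage = list(user_artifacts)
    for name in external_transforms:
        to_stage.extend(expansion.expand(name).artifacts)
    for path in to_stage:
        staging.stage(path)
    return staging.staged
```

With this split, the expansion service never needs write access to the
staging location, which is the privilege concern raised above.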


On Thu, Apr 18, 2019 at 2:33 PM Chamikara Jayalath 
wrote:

>
>
> On Thu, Apr 18, 2019 at 2:12 PM Lukasz Cwik  wrote:
>
>> Note that Max did ask whether making the expansion service do the staging
>> made sense, and my first line was agreeing with that direction and
>> expanding on how it could be done (so this is really Max's idea or from
>> whomever he got the idea from).
>>
>
> +1 to what Max said then :)
>
>
>>
>> I believe a lot of the value of the expansion service is not having users
>> need to be aware of all the SDK specific dependencies when they are trying
>> to create a pipeline, only the "user" who is launching the expansion
>> service may need to. And in that case we can have a prepackaged expansion
>> service application that does what most users would want (e.g. expansion
>> service as a docker container, a single bundled jar, ...). We (the Apache
>> Beam community) could choose to host a default implementation of the
>> expansion service as well.
>>
>
> I'm not against this. But I think this is a secondary, more advanced
> use-case. For a Beam user that needs to use a Java transform that they
> already have in a Python pipeline, we should provide a way to allow
> starting up an expansion service (with dependencies needed for that) and
> running a pipeline that uses this external Java transform (with
> dependencies that are needed at runtime). Probably, it'll be enough to
> allow providing all dependencies when starting up the expansion service and
> allow the expansion service to do the staging of jars as well. I don't see a
> need to include the list of jars in the ExpansionResponse sent to the
> Python SDK.
>
>
>>
>> On Thu, Apr 18, 2019 at 2:02 PM Chamikara Jayalath 
>> wrote:
>>
>>> I think there are two kinds of dependencies we have to consider.
>>>
>>> (1) Dependencies that are needed to expand the transform.
>>>
>>> These have to be provided when we start the expansion service so that
>>> available external transforms are correctly registered with the expansion
>>> service.
>>>
>>> (2) Dependencies that are not needed at expansion but may be needed at
>>> runtime.
>>>
>>> I think in both cases, users have to provide these dependencies either
>>> when expansion service is started or when a pipeline is being executed.
>>>
>>> Max, I'm not sure why expansion service will need to provide
>>> dependencies to the user since user will already be aware of these. Are you
>>> talking about a expansion service that is readily available that will be
>>> used by many Beam users ? I think such a (possibly long running) service
>>> will have to maintain a repository of transforms and should have mechanism
>>> for registering new transforms and discovering already registered
>>> transforms etc. I think there's more design work needed to make transform
>>> expansion service support such use-cases. Currently, I think allowing
>>> pipeline author to provide the jars when starting the expansion service and
>>> when executing the pipeline will be adequate.
>>>
>>> Regarding the entity that will perform the staging, I like Luke's idea
>>> of allowing expansion service to do the staging (of jars provided by the
>>> user). Notion of artifacts and how they are extracted/represented is SDK
>>> dependent. So if the pipeline SDK tries to do this we have to add n x (n
>>> -1) configurations (for n SDKs).
>>>
>>> - Cham
>>>
>>> On Thu, Apr 18, 2019 at 11:45 AM Lukasz Cwik  wrote:
>>>
 We can expose the artifact staging endpoint and artifact token to allow
 the expansion service to upload any resources its environment may need. For
 example, the expansion service for the Beam Java SDK would be able to
 upload jars.

 In the "docker" environment, the Apache Beam Java SDK harness container
 would fetch the relevant artifacts for itself and be able to execute the
 pipeline. (Note that a docker environment could skip all this artifact
 staging if the docker 

Re: [EXT] Re: [DOC] Portable Spark Runner

2019-04-15 Thread Ankur Goenka
Thanks for sharing.
This looks great!

On Mon, Apr 15, 2019 at 2:54 PM Kenneth Knowles  wrote:

> Great. Thanks for sharing!
>
> On Mon, Apr 15, 2019 at 2:38 PM Lei Xu  wrote:
>
>> This is super nice! Really look forward to use this.
>>
>> On Mon, Apr 15, 2019 at 2:34 PM Thomas Weise  wrote:
>>
>>> Great to see the portable Spark runner taking shape. Thanks for the
>>> update!
>>>
>>>
>>> On Mon, Apr 15, 2019 at 10:53 AM Pablo Estrada 
>>> wrote:
>>>
 This is very cool Kyle. Thanks for moving it forward!
 Best
 -P.

 On Fri, Apr 12, 2019 at 1:21 PM Lukasz Cwik  wrote:

> Thanks for the doc.
>
> On Fri, Apr 12, 2019 at 11:34 AM Kyle Weaver 
> wrote:
>
>> Hi everyone,
>>
>> As some of you know, I've been piggybacking on the existing Spark and
>> Flink runners to create a portable version of the Spark runner. I wrote up
>> a summary of the work I've done so far and what remains to be done. I'll
>> keep updating this going forward to provide a reasonably up-to-date
>> description of the state of the project. Please comment on the doc if you
>> have any thoughts.
>>
>> Link:
>>
>> https://docs.google.com/document/d/1j8GERTiHUuc6CzzCXZHc38rBn41uWfATBh2-5JN8hro/edit?usp=sharing
>>
>> Thanks,
>> Kyle
>>
>> Kyle Weaver |  Software Engineer | github.com/ibzib |
>> kcwea...@google.com |  +1650203
>>
>
>>
>
>


Re: [review?] WordCount in Kotlin

2019-04-12 Thread Ankur Goenka
Absolutely :)
I took this opportunity for a general reminder.

Thanks again for taking this Kotlin example to completion.

On Fri, Apr 12, 2019 at 1:24 PM Pablo Estrada  wrote:

> I've merged via a squashed commit that references Jira and the PR. That
> should be reasonable?
> Best
> -P.
>
> On Fri, Apr 12, 2019, 12:22 PM Ankur Goenka  wrote:
>
>> Thanks Pablo and Harshit.
>>
>> Just a quick reminder, please squash the "fixup" sort of commits in the
>> PR based on the prior discussion on the mailing list
>> https://lists.apache.org/thread.html/6d922820d6fc352479f88e5c8737f2c8893ddb706a1e578b50d28948@%3Cdev.beam.apache.org%3E
>>
>> On Fri, Apr 12, 2019 at 11:58 AM Pablo Estrada 
>> wrote:
>>
>>> I've merged this here: https://github.com/apache/beam/pull/8291
>>>
>>> Thanks for all who took a look, and to Harshit for the contribution. : )
>>>
>>> On Thu, Apr 4, 2019 at 10:30 PM Jean-Baptiste Onofré 
>>> wrote:
>>>
>>>> Thanks for the update Pablo.
>>>>
>>>> I will try to take a look during the week end.
>>>>
>>>> Regards
>>>> JB
>>>>
>>>> On 04/04/2019 23:16, Pablo Estrada wrote:
>>>> > Hello all,
>>>> > a community member has been very kind to contribute a Kotlin
>>>> > translation of the WordCount pipeline[1]. The documentation, tests,
>>>> > and gradle structure for it are very good, so I am happy to merge,
>>>> > but since this code will become our first Kotlin
>>>> > "documentation"/entrypoint, I wanted to be cautious.
>>>> > So if anyone wants to take a look to review the change, please do. I
>>>> > will merge this in a couple days.
>>>> > Thanks!
>>>> > -P.
>>>> >
>>>> > [1] https://github.com/apache/beam/pull/8034
>>>>
>>>> --
>>>> Jean-Baptiste Onofré
>>>> jbono...@apache.org
>>>> http://blog.nanthrax.net
>>>> Talend - http://www.talend.com
>>>>
>>>


Re: [review?] WordCount in Kotlin

2019-04-12 Thread Ankur Goenka
Thanks Pablo and Harshit.

Just a quick reminder, please squash the "fixup" sort of commits in the PR
based on the prior discussion on the mailing list
https://lists.apache.org/thread.html/6d922820d6fc352479f88e5c8737f2c8893ddb706a1e578b50d28948@%3Cdev.beam.apache.org%3E

On Fri, Apr 12, 2019 at 11:58 AM Pablo Estrada  wrote:

> I've merged this here: https://github.com/apache/beam/pull/8291
>
> Thanks for all who took a look, and to Harshit for the contribution. : )
>
> On Thu, Apr 4, 2019 at 10:30 PM Jean-Baptiste Onofré 
> wrote:
>
>> Thanks for the update Pablo.
>>
>> I will try to take a look during the week end.
>>
>> Regards
>> JB
>>
>> On 04/04/2019 23:16, Pablo Estrada wrote:
>> > Hello all,
>> > a community member has been very kind to contribute a Kotlin
>> > translation of the WordCount pipeline[1]. The documentation, tests, and
>> > gradle structure for it are very good, so I am happy to merge, but since
>> > this code will become our first Kotlin "documentation"/entrypoint, I
>> > wanted to be cautious.
>> > So if anyone wants to take a look to review the change, please do. I
>> > will merge this in a couple days.
>> > Thanks!
>> > -P.
>> >
>> > [1] https://github.com/apache/beam/pull/8034
>>
>> --
>> Jean-Baptiste Onofré
>> jbono...@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>


Re: [ANNOUNCE] New committer announcement: Boyuan Zhang

2019-04-11 Thread Ankur Goenka
Congrats Boyuan!

On Thu, Apr 11, 2019 at 4:52 PM Mark Liu  wrote:

> Congrats Boyuan!
>
> On Thu, Apr 11, 2019 at 9:53 AM Alexey Romanenko 
> wrote:
>
>> > since early 2018
>> > 100+ pull requests
>>
>> Wow, this is impressive! Great job, congrats!
>>
>> > On 11 Apr 2019, at 15:08, Maximilian Michels  wrote:
>> >
>> > Great work! Congrats.
>> >
>> > On 11.04.19 13:41, Robert Bradshaw wrote:
>> >> Congratulations!
>> >> On Thu, Apr 11, 2019 at 12:29 PM Michael Luckey 
>> wrote:
>> >>>
>> >>> Congrats and welcome, Boyuan
>> >>>
>> >>> On Thu, Apr 11, 2019 at 12:27 PM Tim Robertson <
>> timrobertson...@gmail.com> wrote:
>> 
>>  Many congratulations Boyuan!
>> 
>>  On Thu, Apr 11, 2019 at 10:50 AM Łukasz Gajowy 
>> wrote:
>> >
>> > Congrats Boyuan! :)
>> >
>> > śr., 10 kwi 2019 o 23:49 Chamikara Jayalath 
>> napisał(a):
>> >>
>> >> Congrats Boyuan!
>> >>
>> >> On Wed, Apr 10, 2019 at 11:14 AM Yifan Zou 
>> wrote:
>> >>>
>> >>> Congratulations Boyuan!
>> >>>
>> >>> On Wed, Apr 10, 2019 at 10:49 AM Daniel Oliveira <
>> danolive...@google.com> wrote:
>> 
>>  Congrats Boyuan!
>> 
>>  On Wed, Apr 10, 2019 at 10:20 AM Rui Wang 
>> wrote:
>> >
>> > So well deserved!
>> >
>> > -Rui
>> >
>> > On Wed, Apr 10, 2019 at 10:12 AM Pablo Estrada <
>> pabl...@google.com> wrote:
>> >>
>> >> Well deserved : ) congrats Boyuan!
>> >>
>> >> On Wed, Apr 10, 2019 at 10:08 AM Aizhamal Nurmamat kyzy <
>> aizha...@google.com> wrote:
>> >>>
>> >>> Congratulations Boyuan!
>> >>>
>> >>> On Wed, Apr 10, 2019 at 9:52 AM Ruoyun Huang <
>> ruo...@google.com> wrote:
>> 
>>  Thanks for your contributions and congratulations Boyuan!
>> 
>>  On Wed, Apr 10, 2019 at 9:00 AM Kenneth Knowles <
>> k...@apache.org> wrote:
>> >
>> > Hi all,
>> >
>> > Please join me and the rest of the Beam PMC in welcoming a
>> new committer: Boyuan Zhang.
>> >
>> > Boyuan has been contributing to Beam since early 2018. She
>> has proposed 100+ pull requests across a wide range of topics: bug fixes,
>> to integration tests, build improvements, metrics features, release
>> automation. Two big picture things to highlight are building/releasing Beam
>> Python wheels and managing the donation of the Beam Dataflow Java Worker,
>> including help with I.P. clearance.
>> >
>> > In consideration of Boyuan's contributions, the Beam PMC
>> trusts Boyuan with the responsibilities of a Beam committer [1].
>> >
>> > Thank you, Boyuan, for your contributions.
>> >
>> > Kenn
>> >
>> > [1]
>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>> 
>> 
>> 
>>  --
>>  
>>  Ruoyun  Huang
>> 
>>
>>


Re: [DISCUSS] Backwards compatibility of @Experimental features

2019-04-03 Thread Ankur Goenka
I think a release version with the Experimental flag makes sense.
In addition, I think many of our users start to rely on experimental
features because they are not even aware that these features are
experimental, and it's really hard to find the experimental features used
without taking a good look at the Beam code and having some knowledge about
it.

It would be good if we could have a step at pipeline submission time which
prints all the experiments used in verbose mode. This might also require
adding a meaningful group name for the experiment, for example:

@Experimental("SDF", 2.15.0)

This will of course add additional effort and require additional context
while tagging experiments.

On Wed, Apr 3, 2019 at 4:43 PM Reuven Lax  wrote:

> Our Experimental annotation has become almost useless. Many core,
> widely-used parts of the API (e.g. triggers) are still all marked as
> experimental. So many users use these features that we couldn't really
> change them (in a backwards-incompatible way) without hurting many users, so
> the fact they are marked Experimental has become a fiction.
>
> Could we add a deadline to the Experimental tag - a release version when
> it will be removed? e.g.
>
> @Experimental(2.15.0)
>
> We can have a test that ensures that the tag is removed at this version. Of
> course if we're not ready to remove experimental by that version, it's fine
> - we can always bump the tagged version. However this forces us to think
> about each one.
>
> Downside - it might add more toil to the existing release process.
>
> Reuven
>
>
> On Wed, Apr 3, 2019 at 4:00 PM Kyle Weaver  wrote:
>
>> > We might also want to get in the habit of reviewing if something should
>> no longer be experimental.
>>
>> +1
>>
>> Kyle Weaver |  Software Engineer |  kcwea...@google.com |  +1650203
>>
>>
>> On Wed, Apr 3, 2019 at 3:53 PM Kenneth Knowles  wrote:
>>
>>> I think option 2 with n=1 minor version seems OK. So users get the
>>> message for one release and it is gone the next. We should make sure the
>>> deprecation warning says "this is an experimental feature, so it will be
>>> removed after 1 minor version". And we need a process for doing it so it
>>> doesn't sit around. I think we should also leave room for using our own
>>> judgment about whether the user pain is very little and then it is not
>>> needed to have a deprecation cycle.
>>>
>>> We might also want to get in the habit of reviewing if something should
>>> no longer be experimental.
>>>
>>> Kenn
>>>
>>> On Wed, Apr 3, 2019 at 2:33 PM Ismaël Mejía  wrote:
>>>
 When we did the first stable release of Beam (2.0.0) we decided to
 annotate most of the Beam IOs as @Experimental because we were
 cautious about not getting the APIs right in the first try. This was a
 really good decision because we could do serious improvements and
 refactorings to them in the first releases without the hassle of
 keeping backwards compatibility. However after some more releases
 users started to rely on features and supported versions, so we ended
 up in a situation where we could not change them arbitrarily without
 consequences to the final users.

 So we started to deprecate some features and parts of the API without
 removing them, e.g. the introduction of HadoopFormatIO deprecated
 HadoopInputFormatIO, we deprecated methods of MongoDbIO and MqttIO to
 improve the APIs (in most cases with valid/improved replacements), and
 recently the removal of support for older versions in
 KafkaIO was discussed.

 Keeping deprecated stuff in experimental APIs does not seem to make
 sense, but it is what we have started to do to be ‘user friendly’. It
 is probably a good moment to define what the clear path should be
 for removal and breaking changes of experimental features. Some
 options:

 1. Stay as we were, do not mark things as deprecated and remove them
 at will because this is the contract of @Experimental.
 2. Deprecate stuff and remove it after n versions (where n could be 3
 releases).
 3. Deprecate stuff and remove it just after a new LTS is decided to
 ensure users who need these features may still have them for some
 time.

 I would like to know your opinions about this, or if you have other
 ideas. Notice that in discussion I refer only to @Experimental
 features.

>>>


Re: [ANNOUNCE] New committer announcement: Mark Liu

2019-03-25 Thread Ankur Goenka
Congratulations Mark!

On Mon, Mar 25, 2019 at 12:04 PM Jason Kuster 
wrote:

> Wonderful, congrats Mark!
>
> On Mon, Mar 25, 2019 at 11:30 AM Alan Myrvold  wrote:
>
>> congratulations, Mark!!!
>>
>> On Mon, Mar 25, 2019 at 10:05 AM Ruoyun Huang  wrote:
>>
>>> Congratulations Mark!
>>>
>>> On Mon, Mar 25, 2019 at 9:31 AM Udi Meiri  wrote:
>>>
 Congrats Mark!

 On Mon, Mar 25, 2019 at 9:24 AM Ahmet Altay  wrote:

> Congratulations, Mark! 
>
> On Mon, Mar 25, 2019 at 7:24 AM Tim Robertson <
> timrobertson...@gmail.com> wrote:
>
>> Congratulations Mark!
>>
>>
>> On Mon, Mar 25, 2019 at 3:18 PM Michael Luckey 
>> wrote:
>>
>>> Nice! Congratulations, Mark.
>>>
>>> On Mon, Mar 25, 2019 at 2:42 PM Katarzyna Kucharczyk <
>>> ka.kucharc...@gmail.com> wrote:
>>>
 Congratulations, Mark! 

 On Mon, Mar 25, 2019 at 11:24 AM Gleb Kanterov 
 wrote:

> Congratulations!
>
> On Mon, Mar 25, 2019 at 10:23 AM Łukasz Gajowy 
> wrote:
>
>> Congrats! :)
>>
>>
>>
>> pon., 25 mar 2019 o 08:11 Aizhamal Nurmamat kyzy <
>> aizha...@google.com> napisał(a):
>>
>>> Congratulations, Mark!
>>>
>>> On Sun, Mar 24, 2019 at 23:18 Pablo Estrada 
>>> wrote:
>>>
 Yeaah  Mark! : ) Congrats : D

 On Sun, Mar 24, 2019 at 10:32 PM Yifan Zou 
 wrote:

> Congratulations Mark!
>
> On Sun, Mar 24, 2019 at 10:25 PM Connell O'Callaghan <
> conne...@google.com> wrote:
>
>> Well done congratulations Mark!!!
>>
>> On Sun, Mar 24, 2019 at 10:17 PM Robert Burke <
>> rob...@frantil.com> wrote:
>>
>>> Congratulations Mark! 
>>>
>>> On Sun, Mar 24, 2019, 10:08 PM Valentyn Tymofieiev <
>>> valen...@google.com> wrote:
>>>
 Congratulations, Mark!

 Thanks for your contributions, in particular for your
 efforts to parallelize test execution for Python SDK and increase the
 speed of Python precommit checks.

 On Sun, Mar 24, 2019 at 9:40 PM Kenneth Knowles <
 k...@apache.org> wrote:

> Hi all,
>
> Please join me and the rest of the Beam PMC in welcoming
> a new committer: Mark Liu.
>
> Mark has been contributing to Beam since late 2016! He has
> proposed 100+ pull requests. Mark was instrumental in expanding test and
> infrastructure coverage, especially for Python. In
> consideration of Mark's contributions, the Beam PMC trusts Mark with the
> responsibilities of a Beam committer [1].
>
> Thank you, Mark, for your contributions.
>
> Kenn
>
> [1] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>
 --
>>>
>>> *Aizhamal Nurmamat kyzy*
>>>
>>> Open Source Program Manager
>>>
>>> 646-355-9740 Mobile
>>>
>>> 601 North 34th Street, Seattle, WA 98103
>>>
>>>
>>>
>
> --
> Cheers,
> Gleb
>

>>>
>>> --
>>> 
>>> Ruoyun  Huang
>>>
>>>
>
> --
> ---
> Jason Kuster
> Apache Beam / Google Cloud Dataflow
>
> See something? Say something. go/jasonkuster-feedback
> 
>

