Re: [ANNOUNCE] Beam 2.15.0 Released!

2019-08-26 Thread jincheng sun
Cheers!! Thanks for driving the release, Yifan!
Thanks a lot to everyone who helped make this release possible!

Best,
Jincheng

Thomas Weise wrote on Tue, Aug 27, 2019 at 12:54 PM:

> Yifan, thanks for managing this release. It went smoothly!
>
>
> On Fri, Aug 23, 2019 at 2:32 PM Kenneth Knowles  wrote:
>
>> Nice work!
>>
>> On Fri, Aug 23, 2019 at 11:26 AM Charles Chen  wrote:
>>
>>> Thank you Yifan!
>>>
>>> On Fri, Aug 23, 2019 at 11:12 AM Hannah Jiang 
>>> wrote:
>>>
 Thank you Yifan!

 On Fri, Aug 23, 2019 at 11:09 AM Yichi Zhang  wrote:

> Thank you Yifan!
>
> On Fri, Aug 23, 2019 at 11:06 AM Robin Qiu  wrote:
>
>> Thank you Yifan!
>>
>> On Fri, Aug 23, 2019 at 11:05 AM Rui Wang  wrote:
>>
>>> Thank you Yifan!
>>>
>>> -Rui
>>>
>>> On Fri, Aug 23, 2019 at 9:21 AM Pablo Estrada 
>>> wrote:
>>>
 Thanks Yifan!

 On Fri, Aug 23, 2019 at 8:54 AM Connell O'Callaghan <
 conne...@google.com> wrote:

>
> +1 thank you Yifan!!!
>
> On Fri, Aug 23, 2019 at 8:49 AM Ahmet Altay 
> wrote:
>
>> Thank you Yifan!
>>
>> On Fri, Aug 23, 2019 at 8:00 AM Yifan Zou 
>> wrote:
>>
>>> The Apache Beam team is pleased to announce the release of
>>> version 2.15.0.
>>>
>>> Apache Beam is an open source unified programming model to
>>> define and
>>> execute data processing pipelines, including ETL, batch and
>>> stream
>>> (continuous) processing. See https://beam.apache.org
>>>
>>> You can download the release here:
>>>
>>> https://beam.apache.org/get-started/downloads/
>>>
>>> This release includes bug fixes, features, and improvements
>>> detailed on
>>> the Beam blog:
>>> https://beam.apache.org/blog/2019/08/22/beam-2.15.0.html
>>>
>>> Thanks to everyone who contributed to this release, and we hope
>>> you enjoy
>>> using Beam 2.15.0.
>>>
>>> Yifan Zou
>>>
>>


Re: [ANNOUNCE] New committer: Valentyn Tymofieiev

2019-08-26 Thread jincheng sun
Congrats Valentyn!

Best,
Jincheng

Ankur Goenka wrote on Tue, Aug 27, 2019 at 10:37 AM:

> Congratulations Valentyn!
>
> On Mon, Aug 26, 2019, 5:02 PM Yifan Zou  wrote:
>
>> Congratulations, Valentyn! Well deserved!
>>
>> On Mon, Aug 26, 2019 at 3:31 PM Aizhamal Nurmamat kyzy <
>> aizha...@google.com> wrote:
>>
>>> Congratulations! and thank you for your contributions, Valentyn!
>>>
>>> On Mon, Aug 26, 2019 at 3:26 PM Thomas Weise  wrote:
>>>
 Congrats!


 On Mon, Aug 26, 2019 at 3:22 PM Heejong Lee  wrote:

> Congratulations! :)
>
> On Mon, Aug 26, 2019 at 2:44 PM Rui Wang  wrote:
>
>> Congratulations!
>>
>>
>> -Rui
>>
>> On Mon, Aug 26, 2019 at 2:36 PM Hannah Jiang 
>> wrote:
>>
>>> Congratulations Valentyn, well deserved!
>>>
>>> On Mon, Aug 26, 2019 at 2:34 PM Chamikara Jayalath <
>>> chamik...@google.com> wrote:
>>>
 Congrats Valentyn!

 On Mon, Aug 26, 2019 at 2:32 PM Pablo Estrada 
 wrote:

> Thanks Valentyn!
>
> On Mon, Aug 26, 2019 at 2:29 PM Robin Qiu 
> wrote:
>
>> Thank you Valentyn! Congratulations!
>>
>> On Mon, Aug 26, 2019 at 2:28 PM Robert Bradshaw <
>> rober...@google.com> wrote:
>>
>>> Hi,
>>>
>>> Please join me and the rest of the Beam PMC in welcoming a new
>>> committer: Valentyn Tymofieiev
>>>
>>> Valentyn has made numerous contributions to Beam over the last
>>> several
>>> years (including 100+ pull requests), most recently pushing
>>> through
>>> the effort to make Beam compatible with Python 3. He is also an
>>> active
>>> participant in design discussions on the list, participates in
>>> release
>>> candidate validation, and proactively helps keep our tests green.
>>>
>>> In consideration of Valentyn's contributions, the Beam PMC
>>> trusts him
>>> with the responsibilities of a Beam committer [1].
>>>
>>> Thank you, Valentyn, for your contributions and looking forward
>>> to many more!
>>>
>>> Robert, on behalf of the Apache Beam PMC
>>>
>>> [1]
>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>
>>


Re: [ANNOUNCE] Beam 2.15.0 Released!

2019-08-26 Thread Thomas Weise
Yifan, thanks for managing this release. It went smoothly!


On Fri, Aug 23, 2019 at 2:32 PM Kenneth Knowles  wrote:

> Nice work!
>
> On Fri, Aug 23, 2019 at 11:26 AM Charles Chen  wrote:
>
>> Thank you Yifan!
>>
>> On Fri, Aug 23, 2019 at 11:12 AM Hannah Jiang 
>> wrote:
>>
>>> Thank you Yifan!
>>>
>>> On Fri, Aug 23, 2019 at 11:09 AM Yichi Zhang  wrote:
>>>
 Thank you Yifan!

 On Fri, Aug 23, 2019 at 11:06 AM Robin Qiu  wrote:

> Thank you Yifan!
>
> On Fri, Aug 23, 2019 at 11:05 AM Rui Wang  wrote:
>
>> Thank you Yifan!
>>
>> -Rui
>>
>> On Fri, Aug 23, 2019 at 9:21 AM Pablo Estrada 
>> wrote:
>>
>>> Thanks Yifan!
>>>
>>> On Fri, Aug 23, 2019 at 8:54 AM Connell O'Callaghan <
>>> conne...@google.com> wrote:
>>>

 +1 thank you Yifan!!!

 On Fri, Aug 23, 2019 at 8:49 AM Ahmet Altay 
 wrote:

> Thank you Yifan!
>
> On Fri, Aug 23, 2019 at 8:00 AM Yifan Zou 
> wrote:
>
>> The Apache Beam team is pleased to announce the release of
>> version 2.15.0.
>>
>> Apache Beam is an open source unified programming model to define
>> and
>> execute data processing pipelines, including ETL, batch and stream
>> (continuous) processing. See https://beam.apache.org
>>
>> You can download the release here:
>>
>> https://beam.apache.org/get-started/downloads/
>>
>> This release includes bug fixes, features, and improvements
>> detailed on
>> the Beam blog:
>> https://beam.apache.org/blog/2019/08/22/beam-2.15.0.html
>>
>> Thanks to everyone who contributed to this release, and we hope
>> you enjoy
>> using Beam 2.15.0.
>>
>> Yifan Zou
>>
>


Re: [ANNOUNCE] New committer: Valentyn Tymofieiev

2019-08-26 Thread Ankur Goenka
Congratulations Valentyn!

On Mon, Aug 26, 2019, 5:02 PM Yifan Zou  wrote:

> Congratulations, Valentyn! Well deserved!
>
> On Mon, Aug 26, 2019 at 3:31 PM Aizhamal Nurmamat kyzy <
> aizha...@google.com> wrote:
>
>> Congratulations! and thank you for your contributions, Valentyn!
>>
>> On Mon, Aug 26, 2019 at 3:26 PM Thomas Weise  wrote:
>>
>>> Congrats!
>>>
>>>
>>> On Mon, Aug 26, 2019 at 3:22 PM Heejong Lee  wrote:
>>>
 Congratulations! :)

 On Mon, Aug 26, 2019 at 2:44 PM Rui Wang  wrote:

> Congratulations!
>
>
> -Rui
>
> On Mon, Aug 26, 2019 at 2:36 PM Hannah Jiang 
> wrote:
>
>> Congratulations Valentyn, well deserved!
>>
>> On Mon, Aug 26, 2019 at 2:34 PM Chamikara Jayalath <
>> chamik...@google.com> wrote:
>>
>>> Congrats Valentyn!
>>>
>>> On Mon, Aug 26, 2019 at 2:32 PM Pablo Estrada 
>>> wrote:
>>>
 Thanks Valentyn!

 On Mon, Aug 26, 2019 at 2:29 PM Robin Qiu 
 wrote:

> Thank you Valentyn! Congratulations!
>
> On Mon, Aug 26, 2019 at 2:28 PM Robert Bradshaw <
> rober...@google.com> wrote:
>
>> Hi,
>>
>> Please join me and the rest of the Beam PMC in welcoming a new
>> committer: Valentyn Tymofieiev
>>
>> Valentyn has made numerous contributions to Beam over the last
>> several
>> years (including 100+ pull requests), most recently pushing
>> through
>> the effort to make Beam compatible with Python 3. He is also an
>> active
>> participant in design discussions on the list, participates in
>> release
>> candidate validation, and proactively helps keep our tests green.
>>
>> In consideration of Valentyn's contributions, the Beam PMC trusts
>> him
>> with the responsibilities of a Beam committer [1].
>>
>> Thank you, Valentyn, for your contributions and looking forward
>> to many more!
>>
>> Robert, on behalf of the Apache Beam PMC
>>
>> [1]
>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>
>


Re: [ANNOUNCE] New committer: Valentyn Tymofieiev

2019-08-26 Thread Yifan Zou
Congratulations, Valentyn! Well deserved!

On Mon, Aug 26, 2019 at 3:31 PM Aizhamal Nurmamat kyzy 
wrote:

> Congratulations! and thank you for your contributions, Valentyn!
>
> On Mon, Aug 26, 2019 at 3:26 PM Thomas Weise  wrote:
>
>> Congrats!
>>
>>
>> On Mon, Aug 26, 2019 at 3:22 PM Heejong Lee  wrote:
>>
>>> Congratulations! :)
>>>
>>> On Mon, Aug 26, 2019 at 2:44 PM Rui Wang  wrote:
>>>
 Congratulations!


 -Rui

 On Mon, Aug 26, 2019 at 2:36 PM Hannah Jiang 
 wrote:

> Congratulations Valentyn, well deserved!
>
> On Mon, Aug 26, 2019 at 2:34 PM Chamikara Jayalath <
> chamik...@google.com> wrote:
>
>> Congrats Valentyn!
>>
>> On Mon, Aug 26, 2019 at 2:32 PM Pablo Estrada 
>> wrote:
>>
>>> Thanks Valentyn!
>>>
>>> On Mon, Aug 26, 2019 at 2:29 PM Robin Qiu 
>>> wrote:
>>>
 Thank you Valentyn! Congratulations!

 On Mon, Aug 26, 2019 at 2:28 PM Robert Bradshaw <
 rober...@google.com> wrote:

> Hi,
>
> Please join me and the rest of the Beam PMC in welcoming a new
> committer: Valentyn Tymofieiev
>
> Valentyn has made numerous contributions to Beam over the last
> several
> years (including 100+ pull requests), most recently pushing through
> the effort to make Beam compatible with Python 3. He is also an
> active
> participant in design discussions on the list, participates in
> release
> candidate validation, and proactively helps keep our tests green.
>
> In consideration of Valentyn's contributions, the Beam PMC trusts
> him
> with the responsibilities of a Beam committer [1].
>
> Thank you, Valentyn, for your contributions and looking forward to
> many more!
>
> Robert, on behalf of the Apache Beam PMC
>
> [1]
> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>



Re: [PROPOSAL] Storing, displaying and detecting anomalies in test results

2019-08-26 Thread Pablo Estrada
Thanks, Kamil, for bringing this up!
+Manisha Bhardwaj and +Mark Liu have
worked on internal benchmarking at Google - would you take a look, please?

On Fri, Aug 23, 2019 at 3:22 AM Kamil Wasilewski <
kamil.wasilew...@polidea.com> wrote:

> Hi all,
>
> Recently we did some research on how to better visualize the results of IO
> performance tests, Nexmark, and Load tests, and how to detect regressions
> automatically in an easy way using tools dedicated to the job.
>
> We'd like to share a proposal with you:
> 
> https://s.apache.org/test-metrics-storage
>
>
>
> Any comments are highly appreciated.
>
> Thanks,
>
> Kamil
>


Re: [ANNOUNCE] New committer: Valentyn Tymofieiev

2019-08-26 Thread Aizhamal Nurmamat kyzy
Congratulations! and thank you for your contributions, Valentyn!

On Mon, Aug 26, 2019 at 3:26 PM Thomas Weise  wrote:

> Congrats!
>
>
> On Mon, Aug 26, 2019 at 3:22 PM Heejong Lee  wrote:
>
>> Congratulations! :)
>>
>> On Mon, Aug 26, 2019 at 2:44 PM Rui Wang  wrote:
>>
>>> Congratulations!
>>>
>>>
>>> -Rui
>>>
>>> On Mon, Aug 26, 2019 at 2:36 PM Hannah Jiang 
>>> wrote:
>>>
 Congratulations Valentyn, well deserved!

 On Mon, Aug 26, 2019 at 2:34 PM Chamikara Jayalath <
 chamik...@google.com> wrote:

> Congrats Valentyn!
>
> On Mon, Aug 26, 2019 at 2:32 PM Pablo Estrada 
> wrote:
>
>> Thanks Valentyn!
>>
>> On Mon, Aug 26, 2019 at 2:29 PM Robin Qiu  wrote:
>>
>>> Thank you Valentyn! Congratulations!
>>>
>>> On Mon, Aug 26, 2019 at 2:28 PM Robert Bradshaw 
>>> wrote:
>>>
 Hi,

 Please join me and the rest of the Beam PMC in welcoming a new
 committer: Valentyn Tymofieiev

 Valentyn has made numerous contributions to Beam over the last
 several
 years (including 100+ pull requests), most recently pushing through
 the effort to make Beam compatible with Python 3. He is also an
 active
 participant in design discussions on the list, participates in
 release
 candidate validation, and proactively helps keep our tests green.

 In consideration of Valentyn's contributions, the Beam PMC trusts
 him
 with the responsibilities of a Beam committer [1].

 Thank you, Valentyn, for your contributions and looking forward to
 many more!

 Robert, on behalf of the Apache Beam PMC

 [1]
 https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer

>>>


Re: [ANNOUNCE] New committer: Valentyn Tymofieiev

2019-08-26 Thread Thomas Weise
Congrats!


On Mon, Aug 26, 2019 at 3:22 PM Heejong Lee  wrote:

> Congratulations! :)
>
> On Mon, Aug 26, 2019 at 2:44 PM Rui Wang  wrote:
>
>> Congratulations!
>>
>>
>> -Rui
>>
>> On Mon, Aug 26, 2019 at 2:36 PM Hannah Jiang 
>> wrote:
>>
>>> Congratulations Valentyn, well deserved!
>>>
>>> On Mon, Aug 26, 2019 at 2:34 PM Chamikara Jayalath 
>>> wrote:
>>>
 Congrats Valentyn!

 On Mon, Aug 26, 2019 at 2:32 PM Pablo Estrada 
 wrote:

> Thanks Valentyn!
>
> On Mon, Aug 26, 2019 at 2:29 PM Robin Qiu  wrote:
>
>> Thank you Valentyn! Congratulations!
>>
>> On Mon, Aug 26, 2019 at 2:28 PM Robert Bradshaw 
>> wrote:
>>
>>> Hi,
>>>
>>> Please join me and the rest of the Beam PMC in welcoming a new
>>> committer: Valentyn Tymofieiev
>>>
>>> Valentyn has made numerous contributions to Beam over the last
>>> several
>>> years (including 100+ pull requests), most recently pushing through
>>> the effort to make Beam compatible with Python 3. He is also an
>>> active
>>> participant in design discussions on the list, participates in
>>> release
>>> candidate validation, and proactively helps keep our tests green.
>>>
>>> In consideration of Valentyn's contributions, the Beam PMC trusts him
>>> with the responsibilities of a Beam committer [1].
>>>
>>> Thank you, Valentyn, for your contributions and looking forward to
>>> many more!
>>>
>>> Robert, on behalf of the Apache Beam PMC
>>>
>>> [1]
>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>
>>


Re: unsubscribe

2019-08-26 Thread Joana Carrasqueira
Thank you! :)

On Mon, Aug 26, 2019 at 3:07 PM Valentyn Tymofieiev 
wrote:

> Hi Joana,
>
> you should email dev-unsubscr...@beam.apache.org.
>
> On Mon, Aug 26, 2019 at 3:01 PM Joana Carrasqueira 
> wrote:
>
>> Hi all,
>>
>> I moved to a new job in Tensorflow and I would like to unsubscribe from
>> this channel. I really appreciate my time with this community!
>>
>> Thank you,
>> Joana
>>
>

-- 

*Joana Carrasqueira*

Developer Relations Program Manager - Tensorflow

+1 415-602-2507

1600 Amphitheatre Pkwy, Google Building 41, Mountain View, CA 94043


Re: [ANNOUNCE] New committer: Valentyn Tymofieiev

2019-08-26 Thread Heejong Lee
Congratulations! :)

On Mon, Aug 26, 2019 at 2:44 PM Rui Wang  wrote:

> Congratulations!
>
>
> -Rui
>
> On Mon, Aug 26, 2019 at 2:36 PM Hannah Jiang 
> wrote:
>
>> Congratulations Valentyn, well deserved!
>>
>> On Mon, Aug 26, 2019 at 2:34 PM Chamikara Jayalath 
>> wrote:
>>
>>> Congrats Valentyn!
>>>
>>> On Mon, Aug 26, 2019 at 2:32 PM Pablo Estrada 
>>> wrote:
>>>
 Thanks Valentyn!

 On Mon, Aug 26, 2019 at 2:29 PM Robin Qiu  wrote:

> Thank you Valentyn! Congratulations!
>
> On Mon, Aug 26, 2019 at 2:28 PM Robert Bradshaw 
> wrote:
>
>> Hi,
>>
>> Please join me and the rest of the Beam PMC in welcoming a new
>> committer: Valentyn Tymofieiev
>>
>> Valentyn has made numerous contributions to Beam over the last several
>> years (including 100+ pull requests), most recently pushing through
>> the effort to make Beam compatible with Python 3. He is also an active
>> participant in design discussions on the list, participates in release
>> candidate validation, and proactively helps keep our tests green.
>>
>> In consideration of Valentyn's contributions, the Beam PMC trusts him
>> with the responsibilities of a Beam committer [1].
>>
>> Thank you, Valentyn, for your contributions and looking forward to
>> many more!
>>
>> Robert, on behalf of the Apache Beam PMC
>>
>> [1]
>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>
>


unsubscribe

2019-08-26 Thread Joana Carrasqueira
Hi all,

I moved to a new job in Tensorflow and I would like to unsubscribe from
this channel. I really appreciate my time with this community!

Thank you,
Joana


Re: [ANNOUNCE] New committer: Valentyn Tymofieiev

2019-08-26 Thread Rui Wang
Congratulations!


-Rui

On Mon, Aug 26, 2019 at 2:36 PM Hannah Jiang  wrote:

> Congratulations Valentyn, well deserved!
>
> On Mon, Aug 26, 2019 at 2:34 PM Chamikara Jayalath 
> wrote:
>
>> Congrats Valentyn!
>>
>> On Mon, Aug 26, 2019 at 2:32 PM Pablo Estrada  wrote:
>>
>>> Thanks Valentyn!
>>>
>>> On Mon, Aug 26, 2019 at 2:29 PM Robin Qiu  wrote:
>>>
 Thank you Valentyn! Congratulations!

 On Mon, Aug 26, 2019 at 2:28 PM Robert Bradshaw 
 wrote:

> Hi,
>
> Please join me and the rest of the Beam PMC in welcoming a new
> committer: Valentyn Tymofieiev
>
> Valentyn has made numerous contributions to Beam over the last several
> years (including 100+ pull requests), most recently pushing through
> the effort to make Beam compatible with Python 3. He is also an active
> participant in design discussions on the list, participates in release
> candidate validation, and proactively helps keep our tests green.
>
> In consideration of Valentyn's contributions, the Beam PMC trusts him
> with the responsibilities of a Beam committer [1].
>
> Thank you, Valentyn, for your contributions and looking forward to
> many more!
>
> Robert, on behalf of the Apache Beam PMC
>
> [1]
> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>



Re: [ANNOUNCE] New committer: Valentyn Tymofieiev

2019-08-26 Thread Hannah Jiang
Congratulations Valentyn, well deserved!

On Mon, Aug 26, 2019 at 2:34 PM Chamikara Jayalath 
wrote:

> Congrats Valentyn!
>
> On Mon, Aug 26, 2019 at 2:32 PM Pablo Estrada  wrote:
>
>> Thanks Valentyn!
>>
>> On Mon, Aug 26, 2019 at 2:29 PM Robin Qiu  wrote:
>>
>>> Thank you Valentyn! Congratulations!
>>>
>>> On Mon, Aug 26, 2019 at 2:28 PM Robert Bradshaw 
>>> wrote:
>>>
 Hi,

 Please join me and the rest of the Beam PMC in welcoming a new
 committer: Valentyn Tymofieiev

 Valentyn has made numerous contributions to Beam over the last several
 years (including 100+ pull requests), most recently pushing through
 the effort to make Beam compatible with Python 3. He is also an active
 participant in design discussions on the list, participates in release
 candidate validation, and proactively helps keep our tests green.

 In consideration of Valentyn's contributions, the Beam PMC trusts him
 with the responsibilities of a Beam committer [1].

 Thank you, Valentyn, for your contributions and looking forward to many
 more!

 Robert, on behalf of the Apache Beam PMC

 [1]
 https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer

>>>


Re: [ANNOUNCE] New committer: Valentyn Tymofieiev

2019-08-26 Thread Chamikara Jayalath
Congrats Valentyn!

On Mon, Aug 26, 2019 at 2:32 PM Pablo Estrada  wrote:

> Thanks Valentyn!
>
> On Mon, Aug 26, 2019 at 2:29 PM Robin Qiu  wrote:
>
>> Thank you Valentyn! Congratulations!
>>
>> On Mon, Aug 26, 2019 at 2:28 PM Robert Bradshaw 
>> wrote:
>>
>>> Hi,
>>>
>>> Please join me and the rest of the Beam PMC in welcoming a new
>>> committer: Valentyn Tymofieiev
>>>
>>> Valentyn has made numerous contributions to Beam over the last several
>>> years (including 100+ pull requests), most recently pushing through
>>> the effort to make Beam compatible with Python 3. He is also an active
>>> participant in design discussions on the list, participates in release
>>> candidate validation, and proactively helps keep our tests green.
>>>
>>> In consideration of Valentyn's contributions, the Beam PMC trusts him
>>> with the responsibilities of a Beam committer [1].
>>>
>>> Thank you, Valentyn, for your contributions and looking forward to many
>>> more!
>>>
>>> Robert, on behalf of the Apache Beam PMC
>>>
>>> [1]
>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>
>>


Re: [ANNOUNCE] New committer: Valentyn Tymofieiev

2019-08-26 Thread Charles Chen
Thank you and congratulations Valentyn!  Much appreciated and deserved!

On Mon, Aug 26, 2019 at 2:33 PM Reza Rokni  wrote:

> Thanks Valentin!
>
> On Tue, 27 Aug 2019, 05:32 Pablo Estrada,  wrote:
>
>> Thanks Valentyn!
>>
>> On Mon, Aug 26, 2019 at 2:29 PM Robin Qiu  wrote:
>>
>>> Thank you Valentyn! Congratulations!
>>>
>>> On Mon, Aug 26, 2019 at 2:28 PM Robert Bradshaw 
>>> wrote:
>>>
 Hi,

 Please join me and the rest of the Beam PMC in welcoming a new
 committer: Valentyn Tymofieiev

 Valentyn has made numerous contributions to Beam over the last several
 years (including 100+ pull requests), most recently pushing through
 the effort to make Beam compatible with Python 3. He is also an active
 participant in design discussions on the list, participates in release
 candidate validation, and proactively helps keep our tests green.

 In consideration of Valentyn's contributions, the Beam PMC trusts him
 with the responsibilities of a Beam committer [1].

 Thank you, Valentyn, for your contributions and looking forward to many
 more!

 Robert, on behalf of the Apache Beam PMC

 [1]
 https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer

>>>


Re: [ANNOUNCE] New committer: Valentyn Tymofieiev

2019-08-26 Thread Reza Rokni
Thanks Valentin!

On Tue, 27 Aug 2019, 05:32 Pablo Estrada,  wrote:

> Thanks Valentyn!
>
> On Mon, Aug 26, 2019 at 2:29 PM Robin Qiu  wrote:
>
>> Thank you Valentyn! Congratulations!
>>
>> On Mon, Aug 26, 2019 at 2:28 PM Robert Bradshaw 
>> wrote:
>>
>>> Hi,
>>>
>>> Please join me and the rest of the Beam PMC in welcoming a new
>>> committer: Valentyn Tymofieiev
>>>
>>> Valentyn has made numerous contributions to Beam over the last several
>>> years (including 100+ pull requests), most recently pushing through
>>> the effort to make Beam compatible with Python 3. He is also an active
>>> participant in design discussions on the list, participates in release
>>> candidate validation, and proactively helps keep our tests green.
>>>
>>> In consideration of Valentyn's contributions, the Beam PMC trusts him
>>> with the responsibilities of a Beam committer [1].
>>>
>>> Thank you, Valentyn, for your contributions and looking forward to many
>>> more!
>>>
>>> Robert, on behalf of the Apache Beam PMC
>>>
>>> [1]
>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>
>>


Re: [ANNOUNCE] New committer: Valentyn Tymofieiev

2019-08-26 Thread Pablo Estrada
Thanks Valentyn!

On Mon, Aug 26, 2019 at 2:29 PM Robin Qiu  wrote:

> Thank you Valentyn! Congratulations!
>
> On Mon, Aug 26, 2019 at 2:28 PM Robert Bradshaw 
> wrote:
>
>> Hi,
>>
>> Please join me and the rest of the Beam PMC in welcoming a new
>> committer: Valentyn Tymofieiev
>>
>> Valentyn has made numerous contributions to Beam over the last several
>> years (including 100+ pull requests), most recently pushing through
>> the effort to make Beam compatible with Python 3. He is also an active
>> participant in design discussions on the list, participates in release
>> candidate validation, and proactively helps keep our tests green.
>>
>> In consideration of Valentyn's contributions, the Beam PMC trusts him
>> with the responsibilities of a Beam committer [1].
>>
>> Thank you, Valentyn, for your contributions and looking forward to many
>> more!
>>
>> Robert, on behalf of the Apache Beam PMC
>>
>> [1]
>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>
>


Re: [ANNOUNCE] New committer: Valentyn Tymofieiev

2019-08-26 Thread Robin Qiu
Thank you Valentyn! Congratulations!

On Mon, Aug 26, 2019 at 2:28 PM Robert Bradshaw  wrote:

> Hi,
>
> Please join me and the rest of the Beam PMC in welcoming a new
> committer: Valentyn Tymofieiev
>
> Valentyn has made numerous contributions to Beam over the last several
> years (including 100+ pull requests), most recently pushing through
> the effort to make Beam compatible with Python 3. He is also an active
> participant in design discussions on the list, participates in release
> candidate validation, and proactively helps keep our tests green.
>
> In consideration of Valentyn's contributions, the Beam PMC trusts him
> with the responsibilities of a Beam committer [1].
>
> Thank you, Valentyn, for your contributions and looking forward to many
> more!
>
> Robert, on behalf of the Apache Beam PMC
>
> [1]
> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>


[ANNOUNCE] New committer: Valentyn Tymofieiev

2019-08-26 Thread Robert Bradshaw
Hi,

Please join me and the rest of the Beam PMC in welcoming a new
committer: Valentyn Tymofieiev

Valentyn has made numerous contributions to Beam over the last several
years (including 100+ pull requests), most recently pushing through
the effort to make Beam compatible with Python 3. He is also an active
participant in design discussions on the list, participates in release
candidate validation, and proactively helps keep our tests green.

In consideration of Valentyn's contributions, the Beam PMC trusts him
with the responsibilities of a Beam committer [1].

Thank you, Valentyn, for your contributions and looking forward to many more!

Robert, on behalf of the Apache Beam PMC

[1] 
https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer


Re: Brief of interactive Beam

2019-08-26 Thread Robert Bradshaw
On Fri, Aug 23, 2019 at 4:25 PM Ning Kang  wrote:

> On Aug 23, 2019, at 3:09 PM, Robert Bradshaw  wrote:
>
> Cool, sounds like we're getting closer to the same page. Some more replies
> below.
>
> On Fri, Aug 23, 2019 at 1:47 PM Ning Kang  wrote:
>
>> Thanks for the feedback, Robert! I think I got your idea.
>> Let me summarize it to see if it’s correct:
>> 1. You want everything about "standard Beam concepts" to follow the
>> existing pattern: we can drop create_pipeline() and keep the
>> InteractiveRunner notion when constructing the pipeline, and I agree with
>> that. A runner can delegate to another runner, also agreed. Let’s keep it
>> that way.
>>
>
> Despite everything I've written, I'm not convinced that exposing this as a
> Runner is the most intuitive way to get interactivity either. Given that
> the "magic" of interactivity is being able to watch PCollections (for
> inspection and further construction), and if no PCollections are watched
> execution proceeds as normal, what are your thoughts about making all
> pipelines "interactive" and just doing the magic iff there are PCollections
> to watch? (The opt-in incantation here would be ibeam.watch(globals()) or
> similar.)
>
> FWIW, Flume has something similar (called marking collections as to be
> materialized). It has its pros and cons.
>
> By default __main__ is watched, similar to the watch(globals()). If no
> PCollection variable is being watched, it’s not doing any magic.
> I’m not sure about making all pipelines “interactive” such as by adding an
> “interactive=True/False” option when constructing pipeline.
>

My point was that watch(globals()) (or anything else) would be the explicit
opt-in to interactive, instead of doing interactive=True or manually
constructing an InteractiveRunner or anything else.


> Since we couldn’t decide which one is more intuitive, I would stick to the
> existing InteractiveRunner constructor that is open sourced.
> And we try to avoid changing any code outside …/runners/interactive/.
>
> Yes, we can stick with what's already there for now to avoid blocking any
implementation work.
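
To make the two styles concrete, here is a minimal Python sketch. The
InteractiveRunner constructor below mirrors what is already open sourced
under apache_beam/runners/interactive/, while the watch() opt-in is only the
proposed incantation, so it is shown commented out (that module path and
function name are assumptions, not settled API):

    import apache_beam as beam
    from apache_beam.runners.direct.direct_runner import DirectRunner
    from apache_beam.runners.interactive.interactive_runner import InteractiveRunner

    # Style A: opt in by constructing the pipeline with InteractiveRunner,
    # which delegates to an underlying runner (here the DirectRunner).
    p = beam.Pipeline(runner=InteractiveRunner(underlying_runner=DirectRunner()))
    squares = p | beam.Create([1, 2, 3]) | beam.Map(lambda x: x * x)

    # Style B (proposed): opt in by watching PCollection variables; if nothing
    # is watched, execution proceeds as a normal, non-interactive run.
    # from apache_beam.runners.interactive import interactive_beam as ibeam
    # ibeam.watch(globals())  # hypothetical ibeam.watch(globals()) incantation

    p.run()  # watched PCollections are cached so they can be inspected and reused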

> 2. watch() and visualize() can be in the independent interactive beam
>> module since they are
>>
>> concepts that are unique to being interactive
>>
>> 3. I'll add some example for the run_pipeline() in design doc. The short
>> answer is run_pipeline() != p.run(). Thanks for sharing the doc (
>> https://s.apache.org/no-beam-pipeline).
>> As described in the doc, when constructing the pipeline, we still want to
>> bundle a runner and options to the constructed pipeline even in the future.
>> So if the runner is InteractiveRunner, the interactivity instrument
>> (implicitly applied read/write cache PTransform and input/output wiring) is
>> only applied when "run_pipeline()" of the runner implementation is invoked.
>> p.run() will apply the instrument. However, this static function
>> run_pipeline() takes in a new runner and options,
>> invoking “run_pipeline()” implementation of the new runner and wouldn’t
>> have the instrument, thus no interactivity.
>> Because you cannot (don’t want to, as seen in the doc, users cannot
>> access the bundled pipeline/options in the future) change the runner easily
>> without re-executing all the notebook cells, this shorthand function allows
>> a user to run the pipeline without interactivity immediately anywhere in a
>> notebook. In the meantime, the pipeline is still bundled with the original
>> InteractiveRunner. The users can keep developing further pipelines.
>> The usage of this function is not intuitive until you put it in a
>> notebook user scenario where users develop, test in a prod-like environment,
>> and develop further. And it’s equivalent to users writing
>> "from_runner_api(to_runner_api(pipeline))” in their notebook. It’s just a
>> shorthand.
>>
>
> What you're trying to work around here is the flaw in the existing API
> that a user binds the choice of Runner before pipeline construction, rather
> than at the point of execution. I propose we look at fixing this in Beam
> itself.
>
> Then I would propose not exposing this. If late runner binding is
> supported, we wouldn’t even need this. We can write it in an example
> notebook rather than exposing it.
>

Sounds good.


> 4. And we both agree that implicit cache is palatable and should be the
>> only thing we use to support interactivity. Cache and watched pipeline
>> definition (which tells us what to cache) are the main “hidden state” I
>> meant. Because the cache mechanism is totally implicit and hidden from the
>> user. A cache is either read or written in a p.run(). If an existing cache
>> is not used in a p.run(), it expires. If the user restarts the IPython
>> kernel, all cache should expire too.
>>
>
> Depending on how we label items in the cache, they could survive kernel
> restarts as well. This relates to another useful feature in Beam where if a
> batch pipeline fails towards the end, one may want to resume/rerun it from
> there 

Re: Write-through-cache in State logic

2019-08-26 Thread Lukasz Cwik
Your summary below makes sense to me. I can see that recovery from rolling
back doesn't need to be a priority, which simplifies the solution for user
state caching down to one token.

Providing cache tokens upfront does require the Runner to know what
"version" of everything it may supply to the SDK upfront (instead of on
request), which would mean that the Runner may need to have a mapping from
cache token to internal version identifier for things like side inputs,
which are typically broadcast. The Runner would also need to poll in the
background to see if the side input has changed, so as not to block
processing bundles with "stale" side input data.

Ping me once you have the Runner PR updated and I'll take a look again.

On Mon, Aug 26, 2019 at 12:20 PM Maximilian Michels  wrote:

> Thank you for the summary Luke. I really appreciate the effort you put
> into this!
>
> > Based upon your discussion you seem to want option #1
>
> I'm actually for option #2. The option to cache/invalidate side inputs
> is important, and we should incorporate this in the design. That's why
> option #1 is not flexible enough. However, a first implementation could
> defer caching of side inputs.
>
> Option #3 was my initial thinking and the first version of the PR, but I
> think we agreed that there wouldn't be much gain from keeping a cache
> token per state id.
>
> Option #4 is what is specifically documented in the reference doc and
> already part of the Proto, where valid tokens are provided for each new
> bundle and also as part of the response of a get/put/clear. We mentioned
> that the reply does not have to be waited on synchronously (I mentioned
> it even), but it complicates the implementation. The idea Thomas and I
> expressed was that a response is not even necessary if we assume
> validity of the upfront provided cache tokens for the lifetime of a
> bundle and that cache tokens will be invalidated as soon as the Runner
> fails in any way. This is naturally the case for Flink because it will
> simply "forget" its current cache tokens.
>
> I currently envision the following schema:
>
> Runner
> ==
>
> - Runner generates a globally unique cache token, one for user state and
> one for each side input
> - The token is supplied to the SDK Harness for each bundle request
> - For the lifetime of a Runner<=>SDK Harness connection this cache token
> will not change
> - Runner will generate a new token if the connection/key space changes
> between Runner and SDK Harness
>
> SDK
> ===
>
> - For each bundle the SDK worker stores the list of valid cache tokens
> - The SDK Harness keep a global cache across all its (local) workers
> which is a LRU cache: state_key => (cache_token, value)
> - get: Lookup cache using the valid cache token for the state. If no
> match, then fetch from Runner and use the already available token for
> caching
> - put: Put value in cache with a valid cache token, put value to pending
> writes which will be flushed out latest when the bundle ends
> - clear: same as put but clear cache
>
> It does look like this is not too far off from what you were describing.
> The main difference is that we just work with a single cache token. In
> my opinion we do not need the second cache token for writes, as long as
> we ensure that we generate a new cache token if the bundle/checkpoint
> fails.
>
> I have a draft PR
>   for the Runner: https://github.com/apache/beam/pull/9374
>   for the SDK: https://github.com/apache/beam/pull/9418
>
> Note that the Runner PR needs to be updated to fully reflected the above
> scheme. The SDK implementation is WIP. I want to make sure that we
> clarify the design before this gets finalized.
>
> Thanks again for all your comments. Much appreciated!
>
> Cheers,
> Max
>
> On 26.08.19 19:58, Lukasz Cwik wrote:
> > There were originally a couple of ideas around how caching could work:
> > 1) One cache token for the entire bundle that is supplied up front. The
> > SDK caches everything using the given token. All reads/clear/append for
> > all types of state happen under this token. Anytime a side input
> > changes, key processing partition range changes or a bundle fails to
> > process, the runner chooses a new cache token effectively invalidating
> > everything in the past.
> > 2) One cache token per type of state that is supplied up front.
> > The SDK caches all requests for a given type using the given cache
> > token. The runner can selectively choose which type to keep and which to
> > invalidate. Bundle failure and key processing partition changes
> > invalidate all user state, side input change invalidates all side inputs.
> >
> > 3) One cache token per state id that is supplied up front.
> > The SDK caches all requests for the given state id using the given cache
> > token. The runner can selectively choose which to invalidate and which
> > to keep. Bundle failure and key processing partition changes invalidate
> > all user state, side input changes only invalidate the side input that

Re: Write-through-cache in State logic

2019-08-26 Thread Maximilian Michels
Thank you for the summary Luke. I really appreciate the effort you put
into this!

> Based upon your discussion you seem to want option #1

I'm actually for option #2. The option to cache/invalidate side inputs
is important, and we should incorporate this in the design. That's why
option #1 is not flexible enough. However, a first implementation could
defer caching of side inputs.

Option #3 was my initial thinking and the first version of the PR, but I
think we agreed that there wouldn't be much gain from keeping a cache
token per state id.

Option #4 is what is specifically documented in the reference doc and
already part of the Proto, where valid tokens are provided for each new
bundle and also as part of the response of a get/put/clear. We mentioned
that the reply does not have to be waited on synchronously (I mentioned
it even), but it complicates the implementation. The idea Thomas and I
expressed was that a response is not even necessary if we assume
validity of the upfront provided cache tokens for the lifetime of a
bundle and that cache tokens will be invalidated as soon as the Runner
fails in any way. This is naturally the case for Flink because it will
simply "forget" its current cache tokens.

I currently envision the following schema:

Runner
==

- Runner generates a globally unique cache token, one for user state and
one for each side input
- The token is supplied to the SDK Harness for each bundle request
- For the lifetime of a Runner<=>SDK Harness connection this cache token
will not change
- Runner will generate a new token if the connection/key space changes
between Runner and SDK Harness

SDK
===

- For each bundle the SDK worker stores the list of valid cache tokens
- The SDK Harness keep a global cache across all its (local) workers
which is a LRU cache: state_key => (cache_token, value)
- get: Lookup cache using the valid cache token for the state. If no
match, then fetch from Runner and use the already available token for
caching
- put: Put value in cache with a valid cache token, put value to pending
writes which will be flushed out latest when the bundle ends
- clear: same as put but clear cache

It does look like this is not too far off from what you were describing.
The main difference is that we just work with a single cache token. In
my opinion we do not need the second cache token for writes, as long as
we ensure that we generate a new cache token if the bundle/checkpoint fails.
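
As a rough sketch of the SDK Harness side described above (class and method
names here are illustrative assumptions, not actual Beam API), the shared LRU
cache keyed by state_key with a (cache_token, value) entry could look like:

    from collections import OrderedDict

    class StateCache:
        """LRU cache shared by the local workers: state_key -> (cache_token, value)."""

        def __init__(self, max_entries=1000):
            self._cache = OrderedDict()
            self._max_entries = max_entries

        def get(self, state_key, cache_token, fetch_from_runner):
            entry = self._cache.get(state_key)
            if entry is not None and entry[0] == cache_token:
                self._cache.move_to_end(state_key)  # refresh LRU position
                return entry[1]
            # Cache miss or stale token: fetch from the Runner and cache the
            # value under the token that was supplied upfront for this bundle.
            value = fetch_from_runner(state_key)
            self._store(state_key, cache_token, value)
            return value

        def put(self, state_key, cache_token, value, pending_writes):
            # Write-through: update the cache immediately and defer the state
            # write; pending writes are flushed at the latest when the bundle ends.
            self._store(state_key, cache_token, value)
            pending_writes.append((state_key, value))

        def clear(self, state_key, cache_token, pending_writes):
            # Same as put, but the cached value becomes empty.
            self.put(state_key, cache_token, None, pending_writes)

        def _store(self, state_key, cache_token, value):
            self._cache[state_key] = (cache_token, value)
            self._cache.move_to_end(state_key)
            if len(self._cache) > self._max_entries:
                self._cache.popitem(last=False)  # evict least recently used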

I have a draft PR
  for the Runner: https://github.com/apache/beam/pull/9374
  for the SDK: https://github.com/apache/beam/pull/9418

Note that the Runner PR needs to be updated to fully reflected the above
scheme. The SDK implementation is WIP. I want to make sure that we
clarify the design before this gets finalized.

Thanks again for all your comments. Much appreciated!

Cheers,
Max

On 26.08.19 19:58, Lukasz Cwik wrote:
> There were originally a couple of ideas around how caching could work:
> 1) One cache token for the entire bundle that is supplied up front. The
> SDK caches everything using the given token. All reads/clear/append for
> all types of state happen under this token. Anytime a side input
> changes, key processing partition range changes or a bundle fails to
> process, the runner chooses a new cache token effectively invalidating
> everything in the past.
> 2) One cache token per type of state that is supplied up front.
> The SDK caches all requests for a given type using the given cache
> token. The runner can selectively choose which type to keep and which to
> invalidate. Bundle failure and key processing partition changes
> invalidate all user state, side input change invalidates all side inputs.
> 
> 3) One cache token per state id that is supplied up front.
> The SDK caches all requests for the given state id using the given cache
> token. The runner can selectively choose which to invalidate and which
> to keep. Bundle failure and key processing partition changes invalidate
> all user state, side input changes only invalidate the side input that
> changed.
> 
> 4) A cache token on each read/clear/append that is supplied on the
> response of the call with an initial valid set that is supplied at
> start. The runner can selectively choose which to keep on start. Bundle
> failure allows runners to "roll back" to a known good state by selecting
> the previous valid cache token as part of the initial set. Key
> processing partition changes allow runners to keep cached state that
> hasn't changed since it can be tied to a version number of the state
> itself as part of the initial set. Side input changes only invalidate
> the side input that changed.
> 
> Based upon your discussion you seem to want option #1 which doesn't work
> well with side inputs clearing cached state. If we want to have user
> state survive a changing side input, we would want one of the other
> options. I do agree that supplying the cache token upfront is
> significantly simpler. Currently the protos are 

Re: Write-through-cache in State logic

2019-08-26 Thread Lukasz Cwik
There were originally a couple of ideas around how caching could work:
1) One cache token for the entire bundle that is supplied up front. The SDK
caches everything using the given token. All reads/clear/append for all
types of state happen under this token. Anytime a side input changes, key
processing partition range changes or a bundle fails to process, the runner
chooses a new cache token effectively invalidating everything in the past.

2) One cache token per type of state that is supplied up front.
The SDK caches all requests for a given type using the given cache token.
The runner can selectively choose which type to keep and which to
invalidate. Bundle failure and key processing partition changes invalidate
all user state, side input change invalidates all side inputs.

3) One cache token per state id that is supplied up front.
The SDK caches all requests for the given state id using the given cache
token. The runner can selectively choose which to invalidate and which to
keep. Bundle failure and key processing partition changes invalidate all
user state, side input changes only invalidate the side input that changed.

4) A cache token on each read/clear/append that is supplied on the response
of the call with an initial valid set that is supplied at start. The runner
can selectively choose which to keep on start. Bundle failure allows
runners to "roll back" to a known good state by selecting the previous
valid cache token as part of the initial set. Key processing partition
changes allow runners to keep cached state that hasn't changed since it can
be tied to a version number of the state itself as part of the initial set.
Side input changes only invalidate the side input that changed.

Based upon your discussion you seem to want option #1 which doesn't work
well with side inputs clearing cached state. If we want to have user state
survive a changing side input, we would want one of the other options. I do
agree that supplying the cache token upfront is significantly simpler.
Currently the protos are set up for #4 since it was the most flexible and at
the time the pros outweighed the cons.

I don't understand why you think you need to wait for a response for the
append/clear to get its cache token since the only reason you need the
cache token is that you want to use that cached data when processing a
different bundle. I was thinking that the flow on the SDK side would be
something like (assuming there is a global cache of cache token -> (map of
state key -> data))
1) Create a local cache of (map of state key -> data) using the initial set
of valid cache tokens
2) Make all mutations in place on local cache without waiting for response.
3) When the response comes back, update the global cache with new cache token ->
(map of state key -> data) (this is when the data becomes visible to other
bundles that start processing)
4) Before the bundle finishes processing, wait for all outstanding state
calls to finish.
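
In rough Python form (BundleProcessor and issue_state_request are made-up
names for illustration, and the response is assumed to carry the new cache
token), that flow would look like:

    class BundleProcessor:
        def __init__(self, global_cache, initial_cache_tokens):
            # 1) Seed a bundle-local cache (state key -> data) from the global
            #    cache entries that the initial valid cache tokens point at.
            self._local_cache = {}
            for token in initial_cache_tokens:
                for key, data in global_cache.get(token, {}).items():
                    self._local_cache[key] = dict(data)
            self._global_cache = global_cache
            self._outstanding = []  # futures for in-flight state requests

        def mutate(self, state_key, new_data, issue_state_request):
            # 2) Mutate the local cache in place without waiting for the response.
            self._local_cache[state_key] = new_data
            # issue_state_request is assumed to return a concurrent.futures.Future
            # whose result is the new cache token for this write.
            future = issue_state_request(state_key, new_data)
            # 3) When the response comes back, publish the data to the global
            #    cache so other bundles that start processing can see it.
            future.add_done_callback(
                lambda f: self._global_cache.setdefault(
                    f.result(), {}).update({state_key: new_data}))
            self._outstanding.append(future)

        def finish_bundle(self):
            # 4) Before finishing the bundle, wait for all outstanding state calls.
            for future in self._outstanding:
                future.result()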

To implement caching on the runner side, you would keep track of at most 2
cache tokens per state key, one cache token represents the initial value
when the bundle started while the second represents the modified state. If
the bundle succeeds the runner passes in the set of tokens which represent
the new state, if the bundle fails you process using the original ones.

After thinking through the implementation again, we could supply two cache
tokens for each state id, the first being the set of initial tokens if no
writes happen while the second represents the token to use if the SDK
changes the state. This gives us the simplification where we don't need to
wait for the response before we update the global cache, making a typical
blocking cache much easier to do. We also get the benefit that runners can
supply either the same cache token for a state id or different ones. If the
runner supplies the same one then its telling the SDK to make modifications
in place without any rollback (which is good on memory since we are
reducing copies of stuff) or if the runner supplies two different ones then
its telling the SDK to keep the old data around. If we went through with
this new option the SDK side logic would be (assuming there is a global
cache of cache token -> (map of state key -> data)):

1) Create an empty local set of state ids that are dirty when starting a
new bundle (dirty set)

For reads/gets:
2A) If the request is a read (get), use the dirty set to choose which cache
token to look up and use in the global cache. If the global cache is missing
the data, issue the appropriate request and provide the result.

For writes/appends/clear:
2B) If the cache tokens are different for the state id, add the state id to
the dirty set if it isn't there and perform the appropriate modification to
convert the old cached state data to the new state data
3B) Modify the global cache's data
4B) issue the request to the runner
5B*) add this request to the set of requests to block on before completing
the bundle.

(* Note, there 

Beam Dependency Check Report (2019-08-26)

2019-08-26 Thread Apache Jenkins Server

High Priority Dependency Updates Of Beam Python SDK:

  Dependency Name     | Current Version | Latest Version | Release Date (Current) | Release Date (Latest) | JIRA Issue
  google-cloud-pubsub | 0.39.1          | 0.45.0         | 2019-01-21             | 2019-08-05            | BEAM-5539
  mock                | 2.0.0           | 3.0.5          | 2019-05-20             | 2019-05-20            | BEAM-7369
  oauth2client        | 3.0.0           | 4.1.3          | 2018-12-10             | 2018-12-10            | BEAM-6089
  Sphinx              | 1.8.5           | 2.2.0          | 2019-05-20             | 2019-08-19            | BEAM-7370

High Priority Dependency Updates Of Beam Java SDK:

  Dependency Name                          | Current Version  | Latest Version | Release Date (Current) | Release Date (Latest) | JIRA Issue
  com.github.spotbugs:spotbugs             | 3.1.12           | 4.0.0-beta3    | 2019-03-01             | 2019-06-23            | BEAM-7792
  com.github.spotbugs:spotbugs-annotations | 3.1.12           | 4.0.0-beta3    | 2019-03-01             | 2019-06-23            | BEAM-6951
  javax.servlet:javax.servlet-api          | 3.1.0            | 4.0.1          | 2013-04-25             | 2018-04-20            | BEAM-5750
  junit:junit                              | 4.13-beta-1      | 4.13-beta-3    | 2018-11-25             | 2019-05-05            | BEAM-6127
  org.conscrypt:conscrypt-openjdk          | 1.1.3            | 2.2.1          | 2018-06-04             | 2019-08-08            | BEAM-5748
  org.eclipse.jetty:jetty-server           | 9.2.10.v20150310 | 10.0.0-alpha0  | 2015-03-10             | 2019-07-11            | BEAM-5752
  org.eclipse.jetty:jetty-servlet          | 9.2.10.v20150310 | 10.0.0-alpha0  | 2015-03-10             | 2019-07-11            | BEAM-5753
  Gradle                                   | 5.2.1            | 5.6            | 2019-08-19             | 2019-08-19            | BEAM-8002

A dependency update is high priority if it satisfies one of the following criteria:

  - It has a major version update available, e.g. org.assertj:assertj-core 2.5.0 -> 3.10.0;
  - It is over 3 minor versions behind the latest version, e.g. org.tukaani:xz 1.5 -> 1.8;
  - The current version is behind the latest version for over 180 days, e.g. com.google.auto.service:auto-service 2014-10-24 -> 2017-12-11.

In Beam, we make a best-effort attempt at keeping all dependencies up-to-date.
In the future, issues will be filed and tracked for these automatically,
but in the meantime you can search for existing issues or open a new one.

For more information: Beam Dependency Guide