Re: Apache Beam Newsletter - August 2018

2018-08-22 Thread Etienne Chauchot
Hi Rose, 
I know the newsletter has already been sent, but may I add some of my ongoing
subjects?

What's been done:
- CI improvement: for each new commit on master, the Nexmark suite is run in
  both batch and streaming mode on Spark, Flink and Dataflow (thanks to
  Andrew), and dashboard graphs are produced to track functional and
  performance regressions.

For talks, I guess only talks that have already taken place are included, not
the ones scheduled for ApacheCon in September, right?
Etienne




On Friday, 10 August 2018 at 12:37 -0700, Rose Nguyen wrote:
> August 2018 | Newsletter
>
> What’s been done
>
> Apache Beam 2.6.0 Release
> The Apache Beam team is pleased to announce the release of version 2.6.0!
> This is the second release under the new build system, and the process has
> kept improving. You can download the release here and read the release notes
> for more details.
> 
> Beam Summit London 2018 (by: Matthias Baetens, Gris Cuevas, Viktor Kotai)
> Approval from the Apache Software Foundation is underway. We are currently
> finding a venue and sponsors. We’ll send the call for participation soon to
> curate the agenda. If you’re interested in participating in the organization
> of the event, reach out to the organizers. Dates TBD, but we are considering
> the first or last days of October.
>
> Support for Bounded SDF in all runners (by: Eugene Kirpichov)
> Beam recently introduced a new type of DoFn called SplittableDoFn (SDF) to
> enable richer modularity in its IO connectors. Support for SDF in bounded
> (batch) connectors was added for all runners.
>
> Apache Kudu IO (by: Tim Robertson)
> A new IO connector for the Apache Kudu data store was added recently. See
> BEAM-2661 for more details on it.
>
> IO improvements (by: Ismaël Mejía)
> HBaseIO added a new transform based on SDF called readAll. See BEAM-4020 for
> more details on it.
> 
> 
> What we’re working on...
>
> Interactive Runner for Beam (by: Harsh Vardhan, Sindy Li, Chamikara Jayalath,
> Anand Iyer, Robert Bradshaw)
> Notebook-based interactive processing of Beam pipelines. This is now ready to
> try out in Jupyter Notebook for Beam Python pipelines over DirectRunner! See
> the design doc for more details and watch a demo here. Thoughts, comments and
> discussions welcome :)
>
> Python 3 Support (by, in alphabetical order: Ahmet Altay, Robert Bradshaw,
> Charles Chen, Matthias Feys, Holden Karau, Sergei Lebedev, Robbe Sneyders,
> Valentyn Tymofieiev)
> Major progress has been made on making the Beam Python codebase
> Python 3-compatible through futurization. Read the proposal for more details.
> 
> New IO connectors (by: John Rudolf Lewis, Jacob Marble)
> Amazon Simple Queue Service (SQS) is in review. Amazon Redshift is in
> progress.
>
> Portable Runners (by: Ankur Goenka, Eugene Kirpichov, Ben Sidhom, Axel
> Magnuson, Thomas Weise, Ryan Williams, Robert Bradshaw, Daniel Oliveira,
> Holden Karau)
> Good progress on the Portable Flink Runner; many of the ValidatesRunner tests
> are passing now. The Portable Flink Runner can now execute batch WordCount in
> Java, Python and Go. Many enhancements and bug fixes in the Portable
> Reference Runner. See Jira https://issues.apache.org/jira/browse/BEAM-2889
> for more details on progress.
>
> Dependencies (by: Yifan Zou, Chamikara Jayalath)
> We added a dependencies guide for Beam and tooling to automatically create
> JIRAs for significantly outdated dependencies. We are working on upgrading
> existing dependencies. See the Beam dependencies guide for more details.
> 
> 
> 
> New Members
>
> New Contributors
> Rose Nguyen, Seattle, WA, USA
>   - Beam docs contributor
>   - Working to improve docs usability
> Connell O'Callaghan, Seattle, WA, USA
>   - Interested in growing the community
>   - Helping with community triages and managing issues
> 
> 
> 
> Talks & Meetups
>
> Stream Processing Meetup@LinkedIn 7/19/18
> Xinyu Liu gave a talk on building a Samza Runner for Beam: “Beam meet up,
> Samza!”. See it here.
>
> Large Scale Landuse Classification of Satellite Images, Berlin
> Buzzwords@Berlin 6/11/18
> Suneel Marthi and Jose Luis Contreras gave a talk on using streaming
> pipelines built on Apache Flink for model training and inference. They
> leveraged Convolutional Neural Networks (CNNs) built with Apache MXNet to
> train deep learning models for land use classification. Read about it and
> watch it here.
>
> Big Data in Production Meetup@Cambridge, MA 6/28/18
> Robert Bradshaw and Eila Arich-Landkof gave a talk about Apache Beam and
> machine learning. See the event details here and watch their talks here.
> 
> Resources
>
> Awesome Beam (by: Pablo Estrada)
> Inspired by efforts in Awesome Flink and Awesome Hadoop, I’ve created the
> Awesome Beam repo to aggregate interesting Beam things.
> 
> 
> Until Next Time!
> This edition was curated by our community of contributors, committers and 
> PMCs. It contains work done in June and July
> of 2018 and ongoing efforts. We hope to provide visibility to what's going on 
> in the community, 

Re: Apache Beam Newsletter - August 2018

2018-08-12 Thread Griselda Cuevas
Thanks for compiling and sending, Rose!


On Fri, 10 Aug 2018 at 13:09, Pablo Estrada  wrote:

> Thank you for compiling this Rose! Interesting stories and news.
> -P.

Re: Apache Beam Newsletter - August 2018

2018-08-10 Thread Pablo Estrada
Thank you for compiling this, Rose! Interesting stories and news.
-P.


Apache Beam Newsletter - August 2018

2018-08-10 Thread Rose Nguyen
August 2018 | Newsletter

What’s been done

Apache Beam 2.6.0 Release

   - The Apache Beam team is pleased to announce the release of version
     2.6.0! This is the second release under the new build system, and the
     process has kept improving.
   - You can download the release here and read the release notes for more
     details.


Beam Summit London 2018 (by: Matthias Baetens, Gris Cuevas, Viktor Kotai)

   - Approval from the Apache Software Foundation is underway. We are
     currently finding a venue and sponsors. We’ll send the call for
     participation soon to curate the agenda.
   - If you’re interested in participating in the organization of the event,
     reach out to the organizers.
   - Dates TBD, but we are considering the first or last days of October.


Support for Bounded SDF in all runners (by: Eugene Kirpichov)

   - Beam recently introduced a new type of DoFn called SplittableDoFn (SDF)
     to enable richer modularity in its IO connectors.
   - Support for SDF in bounded (batch) connectors was added for all runners.
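
As a rough illustration of the idea behind SDF (this is not Beam's actual
API): a splittable DoFn processes an element together with a restriction that
describes a portion of the work, and that restriction can be split so the
pieces are processed independently. The sketch below is plain Python, and
OffsetRange, read_records and read_all are hypothetical names chosen for this
example only.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class OffsetRange:
    """Hypothetical restriction: the half-open range [start, stop) of offsets."""
    start: int
    stop: int

    def split(self, desired_size: int) -> List["OffsetRange"]:
        """Cut the range into consecutive pieces of at most desired_size offsets."""
        if desired_size <= 0:
            raise ValueError("desired_size must be positive")
        return [
            OffsetRange(lo, min(lo + desired_size, self.stop))
            for lo in range(self.start, self.stop, desired_size)
        ]


def read_records(source: List[str], restriction: OffsetRange) -> Iterator[str]:
    """Process one (element, restriction) pair: emit only records in the range."""
    for offset in range(restriction.start, restriction.stop):
        yield source[offset]


def read_all(source: List[str], bundle_size: int) -> List[str]:
    """Split the full range, read each piece on its own, and concatenate."""
    full = OffsetRange(0, len(source))
    out: List[str] = []
    for piece in full.split(bundle_size):
        # Assumption: a real runner could hand each piece to a different worker.
        out.extend(read_records(source, piece))
    return out
```

Because each piece only depends on the element and its own range, a bounded
(batch) source written this way can be divided up without the connector
knowing how many workers there are.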

Apache Kudu IO (by: Tim Robertson)

   - A new IO connector for the Apache Kudu data store was added recently.
   - See BEAM-2661 for more details on it.


IO improvements (by: Ismaël Mejía)

   - HBaseIO added a new transform based on SDF called readAll.
   - See BEAM-4020 for more details on it.



What we’re working on...

Interactive Runner for Beam (by: Harsh Vardhan, Sindy Li, Chamikara
Jayalath, Anand Iyer, Robert Bradshaw)

   - Notebook-based interactive processing of Beam pipelines.
   - This is now ready to try out in Jupyter Notebook for Beam Python
     pipelines over DirectRunner!
   - See the design doc for more details and watch a demo here.
   - Thoughts, comments and discussions welcome :)


Python 3 Support (by, in alphabetical order: Ahmet Altay, Robert Bradshaw,
Charles Chen, Matthias Feys, Holden Karau, Sergei Lebedev, Robbe Sneyders,
Valentyn Tymofieiev)

   - Major progress has been made on making the Beam Python codebase
     Python 3-compatible through futurization.
   - Read the proposal for more details.


New IO connectors (by: John Rudolf Lewis, Jacob Marble)

   - Amazon Simple Queue Service (SQS) is in review.
   - Amazon Redshift is in progress.

Portable Runners (by: Ankur Goenka, Eugene Kirpichov, Ben Sidhom, Axel
Magnuson, Thomas Weise, Ryan Williams, Robert Bradshaw, Daniel Oliveira,
Holden Karau)

   - Good progress on the Portable Flink Runner; many of the ValidatesRunner
     tests are passing now.
   - The Portable Flink Runner can now execute batch WordCount in Java,
     Python and Go.
   - Many enhancements and bug fixes in the Portable Reference Runner.
   - See Jira https://issues.apache.org/jira/browse/BEAM-2889 for more
     details on progress.

Dependencies (by: Yifan Zou, Chamikara Jayalath)

   - We added a dependencies guide for Beam and tooling to automatically
     create JIRAs for significantly outdated dependencies. We are working on
     upgrading existing dependencies.
   - See the Beam dependencies guide for more details.




New Members

New Contributors

   - Rose Nguyen, Seattle, WA, USA
      - Beam docs contributor
      - Working to improve docs usability
   - Connell O'Callaghan, Seattle, WA, USA
      - Interested in growing the community
      - Helping with community triages and managing issues




Talks & Meetups

Stream Processing Meetup@LinkedIn 7/19/18

   - Xinyu Liu gave a talk on building a Samza Runner for Beam: “Beam meet
     up, Samza!”.
   - See it here.


Large Scale Landuse Classification of Satellite Images, Berlin
Buzzwords@Berlin 6/11/18

   - Suneel Marthi and Jose Luis Contreras gave a talk on using streaming
     pipelines built on Apache Flink for model training and inference. They
     leveraged Convolutional Neural Networks (CNNs) built with Apache MXNet
     to train deep learning models for land use classification.
   - Read about it and watch it here.


Big Data in Production Meetup@Cambridge, MA 6/28/18

   - Robert Bradshaw and Eila Arich-Landkof gave a talk about Apache Beam and
     machine learning.
   - Event details here and