Hi,

I just wanted to add some points on subjects I took part in. On the Elasticsearch IO side, in addition to ES 6 support, there was also:
- the addition of an exponential retry backoff on writes, for all ES versions
- metadata support for all ES versions

Best,
Etienne

On Monday, September 10, 2018 at 19:56 -0700, Rose Nguyen wrote:
> September 2018 | Newsletter
>
> What’s been done
>
> CI improvement (by: Etienne Chauchot)
> For each new commit on master, the Nexmark suite is run in both batch and streaming mode on Spark, Flink, and Cloud Dataflow (thanks to Andrew), and dashboard graphs are produced to track functional and performance regressions.
>
> Elasticsearch IO Supports Version 6 (by: Dat Tran)
> Elasticsearch IO now supports version 6.x in addition to versions 2.x and 5.x. See the merged PR for more details.
>
> KuduIO Added (by: Tim Robertson)
> Apache Beam master now has KuduIO, which will be released with Beam 2.7.0. See BEAM-2661 for more details.
>
> What we’re working on...
>
> Flink Portable Runner (by: Ankur Goenka, Maximilian Michels, Thomas Weise, Ryan Williams)
> - Support for streaming side inputs merged
> - Portable Compatibility Matrix tests pass in streaming mode
> - Many more ValidatesRunner tests pass (ValidatesRunner is a comprehensive suite for Beam test pipelines)
> - Python Pipelines can be tested without bringing up a JobServer first (it is started in a container)
> - Experimental support for executing the SDK harnesses in a process instead of a Docker container
> - Bug fixes to Beam discovered while working on portability
>
> State and Timer Support in Python SDK (by: Charles Chen, Robert Bradshaw)
> This change adds the reference DirectRunner implementation of the Python User State and Timers API. With this change, a user can execute DoFns with state and timers on the DirectRunner. See the design doc and PR for more details.
>
> New IO - HadoopOutputFormatIO (by: Alexey Romanenko)
> Adding support for MapReduce OutputFormat. See BEAM-5310 for more details.
>
> High-level Java 8 DSL (by: David Moravek, Vaclav Plajt, Marek Simunek)
> Adding a high-level Java 8 DSL based on the Euphoria API project. See BEAM-3900 for more details.
>
> Performance improvements for HDFS file writing operations (by: Tim Robertson)
> - Autocreate directories when doing an HDFS rename
> - See PR for more details
>
> Recognition of non-code contributions (by: Gris Cuevas)
> - Got consensus about recognizing non-code contributions
> - See discussion for more details
> - Planned launch date: Beam Summit London (October 2nd)
>
> Weekly Community Updates (by: Gris Cuevas)
> Some of the project’s subcomponents run weekly updates on the mailing list; we’ll be consolidating best practices to share a weekly community update with all project-related must-knows in a nutshell.
>
> What’s planned
>
> Beam Cookbook (by: Austin Bennett, David Cavazos, Gris Cuevas, Andrea Foegler, Rose Nguyen, Connell O'Callaghan, and you!)
> - We are creating a cookbook for common data science tasks in Beam and have started brainstorming
> - We want to have a hackathon after the London Summit to generate content from the community
> - There will be a session at the summit to gather more ideas and input. Watch the dev and users mailing lists for a call for contributions soon!
>
> Beam 2.7.0 release (by: Charles Chen)
>
> Beam Mascot (by: Gris Cuevas & Community!)
> - We got approval to launch a contest to create a new Apache Beam mascot
> - See discussion for more details; if you’re interested in driving this, reach out in the thread!
> - Planned launch date: Last week of September
>
> New Members
>
> New Contributors
> - Đạt Trần, Ho Chi Minh City, Vietnam
>   - See BEAM-5107 for more details on “Support ES-6.x for ElasticsearchIO”
> - Ravi Pathak, Copenhagen, Denmark
>   - Using Beam for indexing open data on species at GBIF.org
>   - Improving robustness of SolrIO
>
> New Committers
> - Tim Robertson, Copenhagen, Denmark
>
> Events, Talks & Meetups
>
> [Coming Up] Beam Summit @ London, England
> Organized by: Matthias Baetens, Victor Kotai, Alex Van Boxel & Gris Cuevas
> The Beam Summit London 2018 will take place on October 1 and 2 in London.
> If you’re interested in speaking, reach out to gris@apache.org. More info can be found in the blog post, and you can get your tickets on Eventbrite.
>
> [Coming Up] ApacheCon @ Montréal, Canada
> Will take place Sep 24-27.
> - Etienne Chauchot will give a talk on Universal Metrics with Beam
> - Alexey Romanenko and Ismaël Mejía will give a talk on Building portable and evolvable data-intensive applications with Apache
> - Ismaël Mejía and Eugene Kirpichov will give a talk on Robust, performant and modular APIs for data ingestion with Apache Beam
> - Gris Cuevas will host a Birds of a Feather session on 9/26: Design Thinking to manage online communities in Open Source Projects… It’ll be a Beam get-together, we’ll have food & swag, join us!
>
> [Coming Up] DataEngConf @ Barcelona, Spain
> Will take place Sep 25-26.
> - Maximilian Michels will give an introduction to Beam and its portability features.
>
> [Occurred] OSCON @ Portland, OR, USA (by: Holden Karau & Gris Cuevas)
> Holden Karau gave a talk on TFT/TFMA + Beam on Flink (and other related adventures).
> Watch the video here and see the slides here.
> Gris Cuevas gave a talk about active inclusion in Open Source, slides here.
>
> [Occurred] Open Challenge @ Guadalajara, Mexico (by: OSoM, IBM & Google)
> Arianne Navarro, Hector Paredes, Pablo Estrada & Gris Cuevas hosted a Hackathon for Apache Beam and BlueXolo; results include 3 PRs for Beam and 8 Software Engineers introduced to Apache Beam.
>
> [Occurred] Open Source Summit @ Vancouver, Canada
> - Gris Cuevas gave a talk on active diversification in Open Source, slides here
> - Ismael Mejia gave a talk on Apache Beam, see details here
>
> [Occurred] Flink Forward @ Berlin, Germany
> - Robert Bradshaw and Maximilian Michels gave a talk on Universal Machine Learning with Apache Beam, schedule, slides
> - Aljoscha Krettek and Thomas Weise gave a talk on Python Streaming Pipelines with Beam on Flink, schedule, slides
>
> Resources
>
> Setting up a Java Development Env Beam on GCP (by: Jacob Ferriero)
> This post will help you get a development environment up and running to start developing Java Dataflow jobs. By the end you’ll be able to run Apache Beam locally in debug mode, execute code in a REPL to speed up your development cycles, and submit your job to Google Cloud Dataflow. Medium Post.
>
> Coding Apache Beam in your Web Browser (by: Daniel De Leo)
> But what happens when you’re on the go on a computer which doesn’t support your IDE of choice, or you’re using someone else’s computer and need to develop Apache Beam pipelines? Google has you covered! Google’s Cloud Shell comes with a built-in Code Editor for developing/modifying code (it’s based on Eclipse’s Orion). It’s not as full-featured as an IDE but it does beat using Vim or Emacs to edit code! Medium Post.
>
> Building a real time quant trading engine on Dataflow and Beam (by: Lei He)
> In this post, we are going to build a data pipeline that analyzes real time stock tick data streamed from gCloud Pub/Sub, runs them through a pair correlation trading algorithm, and outputs trading signals onto Pub/Sub for execution. Medium Post.
>
> Apache Beam: Reading from S3 and writing to BigQuery (by: Asa Harland)
> In this article we look at how we can use Apache Beam to extract data from AWS S3 (or Google Cloud Storage), run some aggregations over the data and store the result in BigQuery. Medium Post.
>
> Apache Beam Events & Meetups
> Join our Slack channel!
>
> Until Next Time!
>
> This edition was curated by our community of contributors, committers and PMCs. It contains work done in August 2018 and ongoing efforts. We hope to provide visibility to what's going on in the community, so if you have questions, feel free to ask in this thread.
> --
> Rose Thị Nguyễn