Excellent, guys. Welcome to Beam!

I am looking for ways to integrate Beam with the standard notebook tools
(Zeppelin / Jupyter [IPython]), so I am really happy to see the Python SDK
arriving in Beam. Awesome.

Ismaël Mejía

On Fri, Jun 3, 2016 at 7:17 PM, Amit Sela <[email protected]> wrote:

> Welcome Python people ;)
>
> I know a few people who've been waiting for this one!
>
> On Fri, Jun 3, 2016, 19:53 Davor Bonaci <[email protected]> wrote:
>
> > Welcome Python SDK, as well as Silviu, Charles, Ahmet and Chamikara!
> >
> > On Fri, Jun 3, 2016 at 7:07 AM, Jean-Baptiste Onofré <[email protected]>
> > wrote:
> >
> > > Absolutely ;)
> > >
> > >
> > > On 06/03/2016 03:51 PM, James Malone wrote:
> > >
> > >> Hey Silviu!
> > >>
> > >> I think JB is proposing we create a python directory in the sdks directory
> > >> in the root repository (and modify the configuration files accordingly):
> > >>
> > >>     https://github.com/apache/incubator-beam/tree/master/sdks
> > >>
> > >> This Beam document here titled "Apache Beam (Incubating): Repository
> > >> Structure" details the proposed repository structure and may be useful:
> > >>
> > >>     https://drive.google.com/a/google.com/folderview?id=0B-IhJZh9Ab52OFBVZHpsNjc4eXc
> > >>
> > >> Best,
> > >>
> > >> James
> > >>
> > >>
> > >>
> > >> On Fri, Jun 3, 2016 at 6:34 AM, Silviu Calinoiu
> > >> <[email protected]>
> > >> wrote:
> > >>
> > >>> Hi JB,
> > >>> Thanks for the welcome! I come from the Python land so I am not quite
> > >>> familiar with Maven. What do you mean by a Maven module? You mean an
> > >>> artifact so you can install things? In Python, people are used to
> > >>> packages downloaded from PyPI (pypi.python.org -- which is sort of Maven
> > >>> for Python). Whatever is the standard way of doing things in Apache we'll
> > >>> do it. Just asking for clarifications.
> > >>>
> > >>> By the way this discussion is very useful since we will have to iron out
> > >>> several details like this.
> > >>> Thanks,
> > >>> Silviu
> > >>>
> > >>> On Fri, Jun 3, 2016 at 6:19 AM, Jean-Baptiste Onofré <[email protected]>
> > >>> wrote:
> > >>>
> > >>>> Hi Silviu,
> > >>>>
> > >>>> Thanks for the detailed update and great work!
> > >>>>
> > >>>> I would advise creating a:
> > >>>>
> > >>>> sdks/python
> > >>>>
> > >>>> Maven module to store the Python SDK.
> > >>>>
> > >>>> WDYT ?
> > >>>>
> > >>>> By the way, welcome aboard, and great to have you all in the team!
> > >>>>
> > >>>> Regards
> > >>>> JB
> > >>>>
> > >>>> On 06/03/2016 03:13 PM, Silviu Calinoiu wrote:
> > >>>>
> > >>>>> Hi all,
> > >>>>>
> > >>>>> My name is Silviu Calinoiu and I am a member of the Cloud Dataflow team
> > >>>>> working on the Python SDK. As the original Beam proposal
> > >>>>> (https://wiki.apache.org/incubator/BeamProposal) mentioned, we have been
> > >>>>> planning to merge the Python SDK into Beam. The Python SDK is in an early
> > >>>>> stage of development (alpha milestone) and so this is a good time to move
> > >>>>> the code without causing too much disruption to our customers.
> > >>>>> Additionally, this enables the Beam community to contribute as soon as
> > >>>>> possible.
> > >>>>>
> > >>>>> The current state of the SDK is as follows:
> > >>>>>
> > >>>>>   - Open-sourced at
> > >>>>>     https://github.com/GoogleCloudPlatform/DataflowPythonSDK/
> > >>>>>   - Model: All main concepts are present.
> > >>>>>   - I/O: SDK supports text (Google Cloud Storage) and BigQuery
> > >>>>>     connectors and has a framework for adding additional sources and
> > >>>>>     sinks.
> > >>>>>   - Runners: SDK has two pipeline runners: direct runner (in process,
> > >>>>>     local execution) and Cloud Dataflow runner for batch pipelines
> > >>>>>     (submit job to Google Dataflow service). The current direct runner
> > >>>>>     is bounded only (batch execution) but there is work in progress to
> > >>>>>     support unbounded (as in Java).
> > >>>>>   - Testing: The code base has unit test coverage for all the modules
> > >>>>>     and several integration and end to end tests (similar in coverage
> > >>>>>     to the Java SDK). Streaming is not well tested end to end yet since
> > >>>>>     Cloud Dataflow focused first on batch.
> > >>>>>   - Docs: We have matching Python documentation for the features
> > >>>>>     currently supported by Cloud Dataflow. The docs are on
> > >>>>>     cloud.google.com (access only by whitelist due to the alpha stage
> > >>>>>     of the project). Devin is working on the transition of all docs to
> > >>>>>     Apache.
> > >>>>>
> > >>>>>
> > >>>>> In the next days/weeks we would like to prepare and start migrating the
> > >>>>> code and you should start seeing some pull requests. We also hope that
> > >>>>> the Beam community will shape the SDK going forward. In particular, all
> > >>>>> the model improvements implemented for Java (Runner API, etc.) will have
> > >>>>> equivalents in Python once they stabilize. If you have any advice before
> > >>>>> we start the journey please let us know.
> > >>>>>
> > >>>>> The team that will join the Beam effort consists of me (Silviu
> > >>>>> Calinoiu), Charles Chen, Ahmet Altay, Chamikara Jayalath, and last but
> > >>>>> not least Robert Bradshaw (who is already an Apache Beam committer).
> > >>>>>
> > >>>>> So let us know what you think!
> > >>>>>
> > >>>>> Best regards,
> > >>>>>
> > >>>>> Silviu
> > >>>>>
> > >>>>>
> > >>>>> --
> > >>>> Jean-Baptiste Onofré
> > >>>> [email protected]
> > >>>> http://blog.nanthrax.net
> > >>>> Talend - http://www.talend.com
> > >>>>
> > >>>>
> > >>>
> > >>
> > > --
> > > Jean-Baptiste Onofré
> > > [email protected]
> > > http://blog.nanthrax.net
> > > Talend - http://www.talend.com
> > >
> >
>
