Awesome job, Silviu! Really excited to have the Python SDK join us in Beam.

I'll take care of merging the pull request. Let's start with a feature
branch, as per previous conversations on the dev@ list.

On Tue, Jun 14, 2016 at 12:22 PM, Silviu Calinoiu <
[email protected]> wrote:

> Thanks, everybody, for the welcome and the feedback. The initial code move
> was proposed as pull request #461 [1].
>
> Looking forward to working with everybody in the Beam community and
> especially any Pythonistas out there.
>
> Thanks,
> Silviu
>
> [1] https://github.com/apache/incubator-beam/pull/461
>
> On Sat, Jun 4, 2016 at 12:35 AM, Ismaël Mejía <[email protected]> wrote:
>
> > Excellent guys, Welcome to Beam !
> >
> > I am looking for ways to integrate Beam with the standard notebook tools
> > (Zeppelin / Jupyter [IPython]), so I am really happy to see the Python SDK
> > arriving in Beam. Awesome.
> >
> > Ismaël Mejía
> >
> > On Fri, Jun 3, 2016 at 7:17 PM, Amit Sela <[email protected]> wrote:
> >
> > > Welcome Python people ;)
> > >
> > > I know a few people who've been waiting for this one!
> > >
> > > On Fri, Jun 3, 2016, 19:53 Davor Bonaci <[email protected]> wrote:
> > >
> > > > Welcome Python SDK, as well as Silviu, Charles, Ahmet and Chamikara!
> > > >
> > > > On Fri, Jun 3, 2016 at 7:07 AM, Jean-Baptiste Onofré <[email protected]>
> > > > wrote:
> > > >
> > > > > Absolutely ;)
> > > > >
> > > > >
> > > > > On 06/03/2016 03:51 PM, James Malone wrote:
> > > > >
> > > > >> Hey Silviu!
> > > > >>
> > > > > >> I think JB is proposing we create a python directory in the sdks
> > > > > >> directory in the root repository (and modify the configuration files
> > > > > >> accordingly):
> > > > >>
> > > > >>     https://github.com/apache/incubator-beam/tree/master/sdks
> > > > >>
> > > > > >> This Beam document here titled "Apache Beam (Incubating): Repository
> > > > > >> Structure" details the proposed repository structure and may be
> > > > > >> useful:
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > >
> > >
> >
> https://drive.google.com/a/google.com/folderview?id=0B-IhJZh9Ab52OFBVZHpsNjc4eXc
> > > > >>
> > > > >> Best,
> > > > >>
> > > > >> James
> > > > >>
> > > > >>
> > > > >>
> > > > >> On Fri, Jun 3, 2016 at 6:34 AM, Silviu Calinoiu
> > > > >> <[email protected]>
> > > > >> wrote:
> > > > >>
> > > > > >>> Hi JB,
> > > > > >>> Thanks for the welcome! I come from the Python land so I am not quite
> > > > > >>> familiar with Maven. What do you mean by a Maven module? You mean an
> > > > > >>> artifact so you can install things? In Python, people are used to
> > > > > >>> packages downloaded from PyPI (pypi.python.org -- which is sort of
> > > > > >>> Maven for Python). Whatever is the standard way of doing things in
> > > > > >>> Apache we'll do it. Just asking for clarifications.
> > > > > >>>
> > > > > >>> By the way this discussion is very useful since we will have to iron
> > > > > >>> out several details like this.
> > > > > >>> Thanks,
> > > > > >>> Silviu
> > > > >>>
> > > > > >>> On Fri, Jun 3, 2016 at 6:19 AM, Jean-Baptiste Onofré <[email protected]>
> > > > > >>> wrote:
> > > > >>>
> > > > >>> Hi Silviu,
> > > > >>>>
> > > > > >>>> thanks for the detailed update and great work!
> > > > > >>>>
> > > > > >>>> I would advise creating a:
> > > > >>>>
> > > > >>>> sdks/python
> > > > >>>>
> > > > >>>> Maven module to store the Python SDK.
> > > > >>>>
> > > > >>>> WDYT ?
> > > > >>>>
> > > > > >>>> By the way, welcome aboard, and great to have you all on the team!
> > > > >>>>
> > > > >>>> Regards
> > > > >>>> JB
> > > > >>>>
> > > > >>>> On 06/03/2016 03:13 PM, Silviu Calinoiu wrote:
> > > > >>>>
> > > > > >>>>> Hi all,
> > > > > >>>>>
> > > > > >>>>> My name is Silviu Calinoiu and I am a member of the Cloud Dataflow
> > > > > >>>>> team working on the Python SDK. As the original Beam proposal
> > > > > >>>>> (https://wiki.apache.org/incubator/BeamProposal) mentioned, we have
> > > > > >>>>> been planning to merge the Python SDK into Beam. The Python SDK is in
> > > > > >>>>> an early stage of development (alpha milestone) and so this is a good
> > > > > >>>>> time to move the code without causing too much disruption to our
> > > > > >>>>> customers. Additionally, this enables the Beam community to
> > > > > >>>>> contribute as soon as possible.
> > > > >>>>>
> > > > >>>>> The current state of the SDK is as follows:
> > > > >>>>>
> > > > >>>>>      -
> > > > >>>>>
> > > > >>>>>      Open-sourced at
> > > > >>>>> https://github.com/GoogleCloudPlatform/DataflowPythonSDK/
> > > > >>>>>
> > > > >>>>>
> > > > >>>>>      -
> > > > >>>>>
> > > > >>>>>      Model: All main concepts are present.
> > > > >>>>>      -
> > > > >>>>>
> > > > >>>>>      I/O: SDK supports text (Google Cloud Storage) and BigQuery
> > > > >>>>>
> > > > >>>> connectors
> > > > >>>
> > > > >>>>      and has a framework for adding additional sources and
> sinks.
> > > > >>>>>      -
> > > > >>>>>
> > > > >>>>>      Runners: SDK has two pipeline runners: direct runner (in
> > > > process,
> > > > >>>>> local
> > > > >>>>>      execution) and Cloud Dataflow runner for batch pipelines
> > > (submit
> > > > >>>>> job
> > > > >>>>> to
> > > > >>>>>      Google Dataflow service). The current direct runner is
> > bounded
> > > > >>>>> only
> > > > >>>>> (batch
> > > > >>>>>      execution) but there is work in progress to support
> > unbounded
> > > > (as
> > > > >>>>> in
> > > > >>>>> Java).
> > > > >>>>>      -
> > > > >>>>>
> > > > >>>>>      Testing: The code base has unit test coverage for all the
> > > > modules
> > > > >>>>>
> > > > >>>> and
> > > > >>>
> > > > >>>>      several integration and end to end tests (similar in
> coverage
> > > to
> > > > >>>>> the
> > > > >>>>> Java
> > > > >>>>>      SDK). Streaming is not well tested end to end yet since
> > Cloud
> > > > >>>>>
> > > > >>>> Dataflow
> > > > >>>
> > > > >>>>      focused first on batch.
> > > > >>>>>      -
> > > > >>>>>
> > > > >>>>>      Docs: We have matching Python documentation for the
> features
> > > > >>>>>
> > > > >>>> currently
> > > > >>>
> > > > >>>>      supported by Cloud Dataflow. The docs are on
> > cloud.google.com
> > > > >>>>>
> > > > >>>> (access
> > > > >>>
> > > > >>>>      only by whitelist due to the alpha stage of the project).
> > Devin
> > > > is
> > > > >>>>> working
> > > > >>>>>      on the transition of all docs to Apache.
> > > > >>>>>
> > > > >>>>>
> > > > > >>>>> In the next days/weeks we would like to prepare and start migrating
> > > > > >>>>> the code and you should start seeing some pull requests. We also hope
> > > > > >>>>> that the Beam community will shape the SDK going forward. In
> > > > > >>>>> particular, all the model improvements implemented for Java (Runner
> > > > > >>>>> API, etc.) will have equivalents in Python once they stabilize. If
> > > > > >>>>> you have any advice before we start the journey please let us know.
> > > > >>>>>
> > > > > >>>>> The team that will join the Beam effort consists of me (Silviu
> > > > > >>>>> Calinoiu), Charles Chen, Ahmet Altay, Chamikara Jayalath, and last
> > > > > >>>>> but not least Robert Bradshaw (who is already an Apache Beam
> > > > > >>>>> committer).
> > > > >>>>>
> > > > >>>>> So let us know what you think!
> > > > >>>>>
> > > > >>>>> Best regards,
> > > > >>>>>
> > > > >>>>> Silviu
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> --
> > > > >>>> Jean-Baptiste Onofré
> > > > >>>> [email protected]
> > > > >>>> http://blog.nanthrax.net
> > > > >>>> Talend - http://www.talend.com
> > > > >>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > > > --
> > > > > Jean-Baptiste Onofré
> > > > > [email protected]
> > > > > http://blog.nanthrax.net
> > > > > Talend - http://www.talend.com
> > > > >
> > > >
> > >
> >
>
