Welcome Python people ;)

I know a few people who've been waiting for this one!

On Fri, Jun 3, 2016, 19:53 Davor Bonaci <[email protected]> wrote:

> Welcome Python SDK, as well as Silviu, Charles, Ahmet and Chamikara!
>
> On Fri, Jun 3, 2016 at 7:07 AM, Jean-Baptiste Onofré <[email protected]>
> wrote:
>
> > Absolutely ;)
> >
> >
> > On 06/03/2016 03:51 PM, James Malone wrote:
> >
> >> Hey Silviu!
> >>
> >> I think JB is proposing we create a python directory in the sdks directory
> >> in the root repository (and modify the configuration files accordingly):
> >>
> >>     https://github.com/apache/incubator-beam/tree/master/sdks
> >>
> >> This Beam document here titled "Apache Beam (Incubating): Repository
> >> Structure" details the proposed repository structure and may be useful:
> >>
> >> https://drive.google.com/a/google.com/folderview?id=0B-IhJZh9Ab52OFBVZHpsNjc4eXc
> >>
> >> Best,
> >>
> >> James
> >>
> >>
> >>
> >> On Fri, Jun 3, 2016 at 6:34 AM, Silviu Calinoiu
> >> <[email protected]>
> >> wrote:
> >>
> >>> Hi JB,
> >>> Thanks for the welcome! I come from Python land, so I am not quite
> >>> familiar with Maven. What do you mean by a Maven module? Do you mean an
> >>> artifact so you can install things? In Python, people are used to packages
> >>> downloaded from PyPI (pypi.python.org -- which is sort of Maven for
> >>> Python). Whatever the standard way of doing things in Apache is, we'll do
> >>> it. Just asking for clarification.
> >>>
> >>> By the way this discussion is very useful since we will have to iron out
> >>> several details like this.
> >>> Thanks,
> >>> Silviu
> >>>
> >>> On Fri, Jun 3, 2016 at 6:19 AM, Jean-Baptiste Onofré <[email protected]>
> >>> wrote:
> >>>
> >>>> Hi Silviu,
> >>>>
> >>>> thanks for the detailed update and great work!
> >>>>
> >>>> I would advise creating a:
> >>>>
> >>>> sdks/python
> >>>>
> >>>> Maven module to store the Python SDK.
> >>>>
> >>>> WDYT ?
> >>>>
> >>>> By the way, welcome aboard, and great to have all of you on the team!
> >>>>
> >>>> Regards
> >>>> JB
> >>>>
> >>>> On 06/03/2016 03:13 PM, Silviu Calinoiu wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> My name is Silviu Calinoiu and I am a member of the Cloud Dataflow team
> >>>>> working on the Python SDK. As the original Beam proposal
> >>>>> (https://wiki.apache.org/incubator/BeamProposal) mentioned, we have been
> >>>>> planning to merge the Python SDK into Beam. The Python SDK is in an early
> >>>>> stage of development (alpha milestone) and so this is a good time to move
> >>>>> the code without causing too much disruption to our customers.
> >>>>> Additionally, this enables the Beam community to contribute as soon as
> >>>>> possible.
> >>>>>
> >>>>> The current state of the SDK is as follows:
> >>>>>
> >>>>>      - Open-sourced at
> >>>>>        https://github.com/GoogleCloudPlatform/DataflowPythonSDK/
> >>>>>      - Model: All main concepts are present.
> >>>>>      - I/O: SDK supports text (Google Cloud Storage) and BigQuery
> >>>>>        connectors and has a framework for adding additional sources
> >>>>>        and sinks.
> >>>>>      - Runners: SDK has two pipeline runners: direct runner (in process,
> >>>>>        local execution) and Cloud Dataflow runner for batch pipelines
> >>>>>        (submit job to Google Dataflow service). The current direct runner
> >>>>>        is bounded only (batch execution) but there is work in progress to
> >>>>>        support unbounded (as in Java).
> >>>>>      - Testing: The code base has unit test coverage for all the modules
> >>>>>        and several integration and end to end tests (similar in coverage
> >>>>>        to the Java SDK). Streaming is not well tested end to end yet since
> >>>>>        Cloud Dataflow focused first on batch.
> >>>>>      - Docs: We have matching Python documentation for the features
> >>>>>        currently supported by Cloud Dataflow. The docs are on
> >>>>>        cloud.google.com (access only by whitelist due to the alpha stage
> >>>>>        of the project). Devin is working on the transition of all docs to
> >>>>>        Apache.
> >>>>>
> >>>>>
> >>>>> In the next days/weeks we would like to prepare and start migrating the
> >>>>> code and you should start seeing some pull requests. We also hope that the
> >>>>> Beam community will shape the SDK going forward. In particular, all the
> >>>>> model improvements implemented for Java (Runner API, etc.) will have
> >>>>> equivalents in Python once they stabilize. If you have any advice before we
> >>>>> start the journey please let us know.
> >>>>>
> >>>>> The team that will join the Beam effort consists of me (Silviu Calinoiu),
> >>>>> Charles Chen, Ahmet Altay, Chamikara Jayalath, and last but not least
> >>>>> Robert Bradshaw (who is already an Apache Beam committer).
> >>>>>
> >>>>> So let us know what you think!
> >>>>>
> >>>>> Best regards,
> >>>>>
> >>>>> Silviu
> >>>>>
> >>>>>
> >>>>> --
> >>>> Jean-Baptiste Onofré
> >>>> [email protected]
> >>>> http://blog.nanthrax.net
> >>>> Talend - http://www.talend.com
> >>>>
> >>>>
> >>>
> >>
> > --
> > Jean-Baptiste Onofré
> > [email protected]
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
>
