Thanks, everybody, for the welcome and the feedback. The initial code move was proposed as pull request #461 [1].
Looking forward to working with everybody in the Beam community, and especially any Pythonistas out there.

Thanks,
Silviu

[1] https://github.com/apache/incubator-beam/pull/461

On Sat, Jun 4, 2016 at 12:35 AM, Ismaël Mejía <[email protected]> wrote:

> Excellent guys, Welcome to Beam !
>
> I am looking for ways to integrate Beam with the standard notebook tools
> (Zeppelin / Jupyter [ipython]), so I am really happy to see the python SDK
> arriving to Beam, Awesome.
>
> Ismaël Mejía
>
> On Fri, Jun 3, 2016 at 7:17 PM, Amit Sela <[email protected]> wrote:
>
> > Welcome Python people ;)
> >
> > I know a few people who've been waiting for this one!
> >
> > On Fri, Jun 3, 2016, 19:53 Davor Bonaci <[email protected]> wrote:
> >
> > > Welcome Python SDK, as well as Silviu, Charles, Ahmet and Chamikara!
> > >
> > > On Fri, Jun 3, 2016 at 7:07 AM, Jean-Baptiste Onofré <[email protected]> wrote:
> > >
> > > > Absolutely ;)
> > > >
> > > > On 06/03/2016 03:51 PM, James Malone wrote:
> > > >
> > > >> Hey Silviu!
> > > >>
> > > >> I think JB is proposing we create a python directory in the sdks directory
> > > >> in the root repository (and modify the configuration files accordingly):
> > > >>
> > > >> https://github.com/apache/incubator-beam/tree/master/sdks
> > > >>
> > > >> This Beam document here titled "Apache Beam (Incubating): Repository
> > > >> Structure" details the proposed repository structure and may be useful:
> > > >>
> > > >> https://drive.google.com/a/google.com/folderview?id=0B-IhJZh9Ab52OFBVZHpsNjc4eXc
> > > >>
> > > >> Best,
> > > >>
> > > >> James
> > > >>
> > > >> On Fri, Jun 3, 2016 at 6:34 AM, Silviu Calinoiu <[email protected]> wrote:
> > > >>
> > > >>> Hi JB,
> > > >>> Thanks for the welcome! I come from the Python land so I am not quite
> > > >>> familiar with Maven. What do you mean by a Maven module?
> > > >>> You mean an artifact so you can install things? In Python, people are used to
> > > >>> packages downloaded from PyPI (pypi.python.org -- which is sort of Maven for
> > > >>> Python). Whatever is the standard way of doing things in Apache we'll do
> > > >>> it. Just asking for clarifications.
> > > >>>
> > > >>> By the way this discussion is very useful since we will have to iron out
> > > >>> several details like this.
> > > >>> Thanks,
> > > >>> Silviu
> > > >>>
> > > >>> On Fri, Jun 3, 2016 at 6:19 AM, Jean-Baptiste Onofré <[email protected]> wrote:
> > > >>>
> > > >>>> Hi Silviu,
> > > >>>>
> > > >>>> thanks for detailed update and great work !
> > > >>>>
> > > >>>> I would advise to create a:
> > > >>>>
> > > >>>> sdks/python
> > > >>>>
> > > >>>> Maven module to store the Python SDK.
> > > >>>>
> > > >>>> WDYT ?
> > > >>>>
> > > >>>> By the way, welcome aboard and great to have you all guys in the team !
> > > >>>>
> > > >>>> Regards
> > > >>>> JB
> > > >>>>
> > > >>>> On 06/03/2016 03:13 PM, Silviu Calinoiu wrote:
> > > >>>>
> > > >>>>> Hi all,
> > > >>>>>
> > > >>>>> My name is Silviu Calinoiu and I am a member of the Cloud Dataflow team
> > > >>>>> working on the Python SDK. As the original Beam proposal
> > > >>>>> (https://wiki.apache.org/incubator/BeamProposal) mentioned, we have been
> > > >>>>> planning to merge the Python SDK into Beam. The Python SDK is in an early
> > > >>>>> stage of development (alpha milestone) and so this is a good time to move
> > > >>>>> the code without causing too much disruption to our customers.
> > > >>>>> Additionally, this enables the Beam community to contribute as soon as
> > > >>>>> possible.
> > > >>>>>
> > > >>>>> The current state of the SDK is as follows:
> > > >>>>>
> > > >>>>> - Open-sourced at
> > > >>>>>   https://github.com/GoogleCloudPlatform/DataflowPythonSDK/
> > > >>>>> - Model: All main concepts are present.
> > > >>>>> - I/O: SDK supports text (Google Cloud Storage) and BigQuery connectors
> > > >>>>>   and has a framework for adding additional sources and sinks.
> > > >>>>> - Runners: SDK has two pipeline runners: direct runner (in process, local
> > > >>>>>   execution) and Cloud Dataflow runner for batch pipelines (submit job to
> > > >>>>>   Google Dataflow service). The current direct runner is bounded only (batch
> > > >>>>>   execution) but there is work in progress to support unbounded (as in
> > > >>>>>   Java).
> > > >>>>> - Testing: The code base has unit test coverage for all the modules and
> > > >>>>>   several integration and end to end tests (similar in coverage to the Java
> > > >>>>>   SDK). Streaming is not well tested end to end yet since Cloud Dataflow
> > > >>>>>   focused first on batch.
> > > >>>>> - Docs: We have matching Python documentation for the features currently
> > > >>>>>   supported by Cloud Dataflow. The docs are on cloud.google.com (access
> > > >>>>>   only by whitelist due to the alpha stage of the project). Devin is working
> > > >>>>>   on the transition of all docs to Apache.
> > > >>>>>
> > > >>>>> In the next days/weeks we would like to prepare and start migrating the
> > > >>>>> code and you should start seeing some pull requests. We also hope that the
> > > >>>>> Beam community will shape the SDK going forward. In particular, all the
> > > >>>>> model improvements implemented for Java (Runner API, etc.) will have
> > > >>>>> equivalents in Python once they stabilize. If you have any advice before we
> > > >>>>> start the journey please let us know.
> > > >>>>>
> > > >>>>> The team that will join the Beam effort consists of me (Silviu Calinoiu),
> > > >>>>> Charles Chen, Ahmet Altay, Chamikara Jayalath, and last but not least
> > > >>>>> Robert Bradshaw (who is already an Apache Beam committer).
> > > >>>>>
> > > >>>>> So let us know what you think!
> > > >>>>>
> > > >>>>> Best regards,
> > > >>>>>
> > > >>>>> Silviu
> > > >>>>>
> > > >>>> --
> > > >>>> Jean-Baptiste Onofré
> > > >>>> [email protected]
> > > >>>> http://blog.nanthrax.net
> > > >>>> Talend - http://www.talend.com
> > > >>>>
> > > > --
> > > > Jean-Baptiste Onofré
> > > > [email protected]
> > > > http://blog.nanthrax.net
> > > > Talend - http://www.talend.com
