Hey Silviu! I think JB is proposing we create a python directory in the sdks directory in the root repository (and modify the configuration files accordingly):
https://github.com/apache/incubator-beam/tree/master/sdks

This Beam document, titled "Apache Beam (Incubating): Repository Structure", details the proposed repository structure and may be useful:
https://drive.google.com/a/google.com/folderview?id=0B-IhJZh9Ab52OFBVZHpsNjc4eXc

Best,
James

On Fri, Jun 3, 2016 at 6:34 AM, Silviu Calinoiu <[email protected]> wrote:

> Hi JB,
> Thanks for the welcome! I come from the Python land so I am not quite
> familiar with Maven. What do you mean by a Maven module? You mean an
> artifact so you can install things? In Python, people are used to packages
> downloaded from PyPI (pypi.python.org -- which is sort of Maven for
> Python). Whatever is the standard way of doing things in Apache we'll do
> it. Just asking for clarifications.
>
> By the way this discussion is very useful since we will have to iron out
> several details like this.
> Thanks,
> Silviu
>
> On Fri, Jun 3, 2016 at 6:19 AM, Jean-Baptiste Onofré <[email protected]>
> wrote:
>
> > Hi Silviu,
> >
> > thanks for the detailed update and great work!
> >
> > I would advise creating a:
> >
> > sdks/python
> >
> > Maven module to store the Python SDK.
> >
> > WDYT?
> >
> > By the way, welcome aboard and great to have you all guys in the team!
> >
> > Regards
> > JB
> >
> > On 06/03/2016 03:13 PM, Silviu Calinoiu wrote:
> >
> >> Hi all,
> >>
> >> My name is Silviu Calinoiu and I am a member of the Cloud Dataflow team
> >> working on the Python SDK. As the original Beam proposal
> >> (https://wiki.apache.org/incubator/BeamProposal) mentioned, we have been
> >> planning to merge the Python SDK into Beam. The Python SDK is in an early
> >> stage of development (alpha milestone) and so this is a good time to move
> >> the code without causing too much disruption to our customers.
> >> Additionally, this enables the Beam community to contribute as soon as
> >> possible.
> >>
> >> The current state of the SDK is as follows:
> >>
> >> - Open-sourced at
> >>   https://github.com/GoogleCloudPlatform/DataflowPythonSDK/
> >> - Model: All main concepts are present.
> >> - I/O: SDK supports text (Google Cloud Storage) and BigQuery connectors
> >>   and has a framework for adding additional sources and sinks.
> >> - Runners: SDK has two pipeline runners: direct runner (in process,
> >>   local execution) and Cloud Dataflow runner for batch pipelines (submit
> >>   job to Google Dataflow service). The current direct runner is bounded
> >>   only (batch execution) but there is work in progress to support
> >>   unbounded (as in Java).
> >> - Testing: The code base has unit test coverage for all the modules and
> >>   several integration and end to end tests (similar in coverage to the
> >>   Java SDK). Streaming is not well tested end to end yet since Cloud
> >>   Dataflow focused first on batch.
> >> - Docs: We have matching Python documentation for the features currently
> >>   supported by Cloud Dataflow. The docs are on cloud.google.com (access
> >>   only by whitelist due to the alpha stage of the project). Devin is
> >>   working on the transition of all docs to Apache.
> >>
> >> In the next days/weeks we would like to prepare and start migrating the
> >> code and you should start seeing some pull requests. We also hope that the
> >> Beam community will shape the SDK going forward. In particular, all the
> >> model improvements implemented for Java (Runner API, etc.) will have
> >> equivalents in Python once they stabilize. If you have any advice before
> >> we start the journey please let us know.
> >>
> >> The team that will join the Beam effort consists of me (Silviu Calinoiu),
> >> Charles Chen, Ahmet Altay, Chamikara Jayalath, and last but not least
> >> Robert Bradshaw (who is already an Apache Beam committer).
> >>
> >> So let us know what you think!
> >>
> >> Best regards,
> >>
> >> Silviu
> >>
> >
> > --
> > Jean-Baptiste Onofré
> > [email protected]
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
>
