Hi James,

Yes, we fit right into sdks/python. I will send out a doc/proposal about where things go so people can comment. It will closely follow the Beam repository guidelines.

Thanks,
Silviu

On Fri, Jun 3, 2016 at 6:51 AM, James Malone <[email protected]> wrote:

> Hey Silviu!
>
> I think JB is proposing we create a python directory in the sdks directory
> in the root repository (and modify the configuration files accordingly):
>
> https://github.com/apache/incubator-beam/tree/master/sdks
>
> This Beam document here titled "Apache Beam (Incubating): Repository
> Structure" details the proposed repository structure and may be useful:
>
> https://drive.google.com/a/google.com/folderview?id=0B-IhJZh9Ab52OFBVZHpsNjc4eXc
>
> Best,
>
> James
>
> On Fri, Jun 3, 2016 at 6:34 AM, Silviu Calinoiu <[email protected]> wrote:
>
> > Hi JB,
> > Thanks for the welcome! I come from Python land, so I am not quite
> > familiar with Maven. What do you mean by a Maven module? Do you mean an
> > artifact so you can install things? In Python, people are used to
> > packages downloaded from PyPI (pypi.python.org -- which is sort of Maven
> > for Python). Whatever the standard way of doing things in Apache is,
> > we'll do it. Just asking for clarification.
> >
> > By the way, this discussion is very useful since we will have to iron
> > out several details like this.
> > Thanks,
> > Silviu
> >
> > On Fri, Jun 3, 2016 at 6:19 AM, Jean-Baptiste Onofré <[email protected]>
> > wrote:
> >
> > > Hi Silviu,
> > >
> > > Thanks for the detailed update and great work!
> > >
> > > I would advise creating a:
> > >
> > > sdks/python
> > >
> > > Maven module to store the Python SDK.
> > >
> > > WDYT?
> > >
> > > By the way, welcome aboard, and great to have you all on the team!
> > >
> > > Regards
> > > JB
> > >
> > > On 06/03/2016 03:13 PM, Silviu Calinoiu wrote:
> > >
> > >> Hi all,
> > >>
> > >> My name is Silviu Calinoiu and I am a member of the Cloud Dataflow
> > >> team working on the Python SDK. As the original Beam proposal
> > >> (https://wiki.apache.org/incubator/BeamProposal) mentioned, we have
> > >> been planning to merge the Python SDK into Beam. The Python SDK is in
> > >> an early stage of development (alpha milestone), so this is a good
> > >> time to move the code without causing too much disruption to our
> > >> customers. Additionally, this enables the Beam community to
> > >> contribute as soon as possible.
> > >>
> > >> The current state of the SDK is as follows:
> > >>
> > >> - Open-sourced at
> > >>   https://github.com/GoogleCloudPlatform/DataflowPythonSDK/
> > >>
> > >> - Model: All main concepts are present.
> > >>
> > >> - I/O: SDK supports text (Google Cloud Storage) and BigQuery
> > >>   connectors and has a framework for adding additional sources and
> > >>   sinks.
> > >>
> > >> - Runners: SDK has two pipeline runners: direct runner (in process,
> > >>   local execution) and Cloud Dataflow runner for batch pipelines
> > >>   (submit job to Google Dataflow service). The current direct runner
> > >>   is bounded only (batch execution) but there is work in progress to
> > >>   support unbounded (as in Java).
> > >>
> > >> - Testing: The code base has unit test coverage for all the modules
> > >>   and several integration and end-to-end tests (similar in coverage
> > >>   to the Java SDK). Streaming is not well tested end to end yet since
> > >>   Cloud Dataflow focused first on batch.
> > >>
> > >> - Docs: We have matching Python documentation for the features
> > >>   currently supported by Cloud Dataflow. The docs are on
> > >>   cloud.google.com (access only by whitelist due to the alpha stage
> > >>   of the project). Devin is working on the transition of all docs to
> > >>   Apache.
> > >>
> > >> In the coming days/weeks we would like to prepare and start migrating
> > >> the code, and you should start seeing some pull requests. We also
> > >> hope that the Beam community will shape the SDK going forward. In
> > >> particular, all the model improvements implemented for Java (Runner
> > >> API, etc.) will have equivalents in Python once they stabilize. If
> > >> you have any advice before we start the journey, please let us know.
> > >>
> > >> The team that will join the Beam effort consists of me (Silviu
> > >> Calinoiu), Charles Chen, Ahmet Altay, Chamikara Jayalath, and last
> > >> but not least Robert Bradshaw (who is already an Apache Beam
> > >> committer).
> > >>
> > >> So let us know what you think!
> > >>
> > >> Best regards,
> > >>
> > >> Silviu
> > >
> > > --
> > > Jean-Baptiste Onofré
> > > [email protected]
> > > http://blog.nanthrax.net
> > > Talend - http://www.talend.com
