Welcome Python SDK, as well as Silviu, Charles, Ahmet and Chamikara!

On Fri, Jun 3, 2016 at 7:07 AM, Jean-Baptiste Onofré <[email protected]>
wrote:

> Absolutely ;)
>
>
> On 06/03/2016 03:51 PM, James Malone wrote:
>
>> Hey Silviu!
>>
>> I think JB is proposing we create a python directory in the sdks directory
>> in the root repository (and modify the configuration files accordingly):
>>
>>     https://github.com/apache/incubator-beam/tree/master/sdks
>>
>> The Beam document titled "Apache Beam (Incubating): Repository
>> Structure" details the proposed repository structure and may be useful:
>>
>>
>>
>> https://drive.google.com/a/google.com/folderview?id=0B-IhJZh9Ab52OFBVZHpsNjc4eXc
>>
>> Best,
>>
>> James
>>
>>
>>
>> On Fri, Jun 3, 2016 at 6:34 AM, Silviu Calinoiu
>> <[email protected]>
>> wrote:
>>
>>> Hi JB,
>>> Thanks for the welcome! I come from Python land, so I am not quite
>>> familiar with Maven. What do you mean by a Maven module? Do you mean an
>>> artifact so you can install things? In Python, people are used to packages
>>> downloaded from PyPI (pypi.python.org -- which is sort of Maven for
>>> Python). Whatever the standard way of doing things in Apache is, we'll do
>>> it. Just asking for clarification.
>>>
>>> By the way this discussion is very useful since we will have to iron out
>>> several details like this.
>>> Thanks,
>>> Silviu
>>>
>>> On Fri, Jun 3, 2016 at 6:19 AM, Jean-Baptiste Onofré <[email protected]>
>>> wrote:
>>>
>>>> Hi Silviu,
>>>>
>>>> Thanks for the detailed update and great work!
>>>>
>>>> I would advise creating a:
>>>>
>>>> sdks/python
>>>>
>>>> Maven module to store the Python SDK.
>>>>
>>>> WDYT?
>>>>
>>>> By the way, welcome aboard, and great to have you all on the team!
>>>>
>>>> Regards
>>>> JB
>>>>
>>>> On 06/03/2016 03:13 PM, Silviu Calinoiu wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> My name is Silviu Calinoiu and I am a member of the Cloud Dataflow team
>>>>> working on the Python SDK. As the original Beam proposal
>>>>> (https://wiki.apache.org/incubator/BeamProposal) mentioned, we have been
>>>>> planning to merge the Python SDK into Beam. The Python SDK is in an early
>>>>> stage of development (alpha milestone) and so this is a good time to move
>>>>> the code without causing too much disruption to our customers.
>>>>> Additionally, this enables the Beam community to contribute as soon as
>>>>> possible.
>>>>>
>>>>> The current state of the SDK is as follows:
>>>>>
>>>>>      - Open-sourced at
>>>>>        https://github.com/GoogleCloudPlatform/DataflowPythonSDK/
>>>>>
>>>>>      - Model: All main concepts are present.
>>>>>
>>>>>      - I/O: SDK supports text (Google Cloud Storage) and BigQuery
>>>>>        connectors and has a framework for adding additional sources and
>>>>>        sinks.
>>>>>
>>>>>      - Runners: SDK has two pipeline runners: direct runner (in process,
>>>>>        local execution) and Cloud Dataflow runner for batch pipelines
>>>>>        (submit job to Google Dataflow service). The current direct runner
>>>>>        is bounded only (batch execution) but there is work in progress to
>>>>>        support unbounded (as in Java).
>>>>>
>>>>>      - Testing: The code base has unit test coverage for all the modules
>>>>>        and several integration and end to end tests (similar in coverage
>>>>>        to the Java SDK). Streaming is not well tested end to end yet
>>>>>        since Cloud Dataflow focused first on batch.
>>>>>
>>>>>      - Docs: We have matching Python documentation for the features
>>>>>        currently supported by Cloud Dataflow. The docs are on
>>>>>        cloud.google.com (access only by whitelist due to the alpha stage
>>>>>        of the project). Devin is working on the transition of all docs
>>>>>        to Apache.
>>>>>
>>>>>
>>>>> In the next days/weeks we would like to prepare and start migrating the
>>>>> code and you should start seeing some pull requests. We also hope that
>>>>> the Beam community will shape the SDK going forward. In particular, all
>>>>> the model improvements implemented for Java (Runner API, etc.) will have
>>>>> equivalents in Python once they stabilize. If you have any advice before
>>>>> we start the journey please let us know.
>>>>>
>>>>> The team that will join the Beam effort consists of me (Silviu
>>>>> Calinoiu), Charles Chen, Ahmet Altay, Chamikara Jayalath, and last but
>>>>> not least Robert Bradshaw (who is already an Apache Beam committer).
>>>>>
>>>>> So let us know what you think!
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Silviu
>>>>>
>>>>>
>>>> --
>>>> Jean-Baptiste Onofré
>>>> [email protected]
>>>> http://blog.nanthrax.net
>>>> Talend - http://www.talend.com
>>>>
>>>>
>>>
>>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
