Awesome! Thanks!

I agree with Davor's suggestion to create a feature branch.

Regards
JB

On 06/14/2016 09:22 PM, Silviu Calinoiu wrote:
Thanks everybody for the welcome and the feedback. The initial code move was
proposed as pull request #461 [1].

Looking forward to working with everybody in the Beam community and
especially any Pythonistas out there.

Thanks,
Silviu

[1] https://github.com/apache/incubator-beam/pull/461

On Sat, Jun 4, 2016 at 12:35 AM, Ismaël Mejía <[email protected]> wrote:

Excellent, guys. Welcome to Beam!

I am looking for ways to integrate Beam with the standard notebook tools
(Zeppelin / Jupyter [IPython]), so I am really happy to see the Python SDK
arriving in Beam. Awesome.

Ismaël Mejía

On Fri, Jun 3, 2016 at 7:17 PM, Amit Sela <[email protected]> wrote:

Welcome Python people ;)

I know a few people who've been waiting for this one!

On Fri, Jun 3, 2016, 19:53 Davor Bonaci <[email protected]> wrote:

Welcome Python SDK, as well as Silviu, Charles, Ahmet and Chamikara!

On Fri, Jun 3, 2016 at 7:07 AM, Jean-Baptiste Onofré <[email protected]> wrote:

Absolutely ;)


On 06/03/2016 03:51 PM, James Malone wrote:

Hey Silviu!

I think JB is proposing we create a python directory in the sdks directory
in the root repository (and modify the configuration files accordingly):

     https://github.com/apache/incubator-beam/tree/master/sdks

This Beam document here titled "Apache Beam (Incubating): Repository
Structure" details the proposed repository structure and may be useful:

https://drive.google.com/a/google.com/folderview?id=0B-IhJZh9Ab52OFBVZHpsNjc4eXc

Best,

James



On Fri, Jun 3, 2016 at 6:34 AM, Silviu Calinoiu <[email protected]> wrote:

Hi JB,
Thanks for the welcome! I come from Python land, so I am not quite
familiar with Maven. What do you mean by a Maven module? Do you mean an
artifact so you can install things? In Python, people are used to packages
downloaded from PyPI (pypi.python.org -- which is sort of Maven for
Python). Whatever the standard way of doing things in Apache is, we'll do
it. Just asking for clarification.

By the way, this discussion is very useful since we will have to iron out
several details like this.
Thanks,
Silviu

On Fri, Jun 3, 2016 at 6:19 AM, Jean-Baptiste Onofré <[email protected]> wrote:

Hi Silviu,

Thanks for the detailed update and the great work!

I would advise creating a:

sdks/python

Maven module to store the Python SDK.

WDYT?

By the way, welcome aboard, and great to have you all on the team!

Regards
JB

On 06/03/2016 03:13 PM, Silviu Calinoiu wrote:

Hi all,

My name is Silviu Calinoiu and I am a member of the Cloud Dataflow team
working on the Python SDK. As the original Beam proposal
(https://wiki.apache.org/incubator/BeamProposal) mentioned, we have been
planning to merge the Python SDK into Beam. The Python SDK is in an early
stage of development (alpha milestone) and so this is a good time to move
the code without causing too much disruption to our customers.
Additionally, this enables the Beam community to contribute as soon as
possible.

The current state of the SDK is as follows:

  - Open-sourced at
    https://github.com/GoogleCloudPlatform/DataflowPythonSDK/

  - Model: All main concepts are present.

  - I/O: SDK supports text (Google Cloud Storage) and BigQuery connectors
    and has a framework for adding additional sources and sinks.

  - Runners: SDK has two pipeline runners: direct runner (in process,
    local execution) and Cloud Dataflow runner for batch pipelines (submit
    job to Google Dataflow service). The current direct runner is bounded
    only (batch execution) but there is work in progress to support
    unbounded (as in Java).

  - Testing: The code base has unit test coverage for all the modules and
    several integration and end to end tests (similar in coverage to the
    Java SDK). Streaming is not well tested end to end yet since Cloud
    Dataflow focused first on batch.

  - Docs: We have matching Python documentation for the features currently
    supported by Cloud Dataflow. The docs are on cloud.google.com (access
    only by whitelist due to the alpha stage of the project). Devin is
    working on the transition of all docs to Apache.


In the next days/weeks we would like to prepare and start migrating the
code and you should start seeing some pull requests. We also hope that the
Beam community will shape the SDK going forward. In particular, all the
model improvements implemented for Java (Runner API, etc.) will have
equivalents in Python once they stabilize. If you have any advice before we
start the journey please let us know.

The team that will join the Beam effort consists of me (Silviu Calinoiu),
Charles Chen, Ahmet Altay, Chamikara Jayalath, and last but not least
Robert Bradshaw (who is already an Apache Beam committer).

So let us know what you think!

Best regards,

Silviu


--
Jean-Baptiste Onofré
[email protected]
http://blog.nanthrax.net
Talend - http://www.talend.com




