Awesome ! Thanks !
Agree with Davor to create a feature branch.
Regards
JB
On 06/14/2016 09:22 PM, Silviu Calinoiu wrote:
Thanks, everybody, for the welcome and the feedback. The initial code move was
proposed as pull request #461 [1].
Looking forward to working with everybody in the Beam community and
especially any Pythonistas out there.
Thanks,
Silviu
[1] https://github.com/apache/incubator-beam/pull/461
On Sat, Jun 4, 2016 at 12:35 AM, Ismaël Mejía <[email protected]> wrote:
Excellent guys, Welcome to Beam !
I am looking for ways to integrate Beam with the standard notebook tools
(Zeppelin / Jupyter [IPython]), so I am really happy to see the Python SDK
arriving in Beam. Awesome.
Ismaël Mejía
On Fri, Jun 3, 2016 at 7:17 PM, Amit Sela <[email protected]> wrote:
Welcome Python people ;)
I know a few people who've been waiting for this one!
On Fri, Jun 3, 2016, 19:53 Davor Bonaci <[email protected]> wrote:
Welcome Python SDK, as well as Silviu, Charles, Ahmet and Chamikara!
On Fri, Jun 3, 2016 at 7:07 AM, Jean-Baptiste Onofré <[email protected]> wrote:
Absolutely ;)
On 06/03/2016 03:51 PM, James Malone wrote:
Hey Silviu!
I think JB is proposing we create a python directory in the sdks directory
in the root repository (and modify the configuration files accordingly):
https://github.com/apache/incubator-beam/tree/master/sdks
This Beam document here titled "Apache Beam (Incubating): Repository
Structure" details the proposed repository structure and may be useful:
https://drive.google.com/a/google.com/folderview?id=0B-IhJZh9Ab52OFBVZHpsNjc4eXc
Best,
James
On Fri, Jun 3, 2016 at 6:34 AM, Silviu Calinoiu <[email protected]> wrote:
Hi JB,
Thanks for the welcome! I come from Python land, so I am not very
familiar with Maven. What do you mean by a Maven module? Do you mean an
artifact so you can install things? In Python, people are used to packages
downloaded from PyPI (pypi.python.org -- which is sort of Maven for
Python). Whatever the standard way of doing things in Apache is, we'll do
it. Just asking for clarification.
By the way, this discussion is very useful since we will have to iron out
several details like this.
Thanks,
Silviu
On Fri, Jun 3, 2016 at 6:19 AM, Jean-Baptiste Onofré <[email protected]> wrote:
Hi Silviu,
thanks for the detailed update and great work !
I would advise creating a:
sdks/python
Maven module to store the Python SDK.
WDYT ?
By the way, welcome aboard, and great to have all of you on the team !
Regards
JB
On 06/03/2016 03:13 PM, Silviu Calinoiu wrote:
Hi all,
My name is Silviu Calinoiu and I am a member of the Cloud Dataflow team
working on the Python SDK. As the original Beam proposal
(https://wiki.apache.org/incubator/BeamProposal) mentioned, we have been
planning to merge the Python SDK into Beam. The Python SDK is in an early
stage of development (alpha milestone) and so this is a good time to move
the code without causing too much disruption to our customers.
Additionally, this enables the Beam community to contribute as soon as
possible.
The current state of the SDK is as follows:
- Open-sourced at https://github.com/GoogleCloudPlatform/DataflowPythonSDK/
- Model: All main concepts are present.
- I/O: SDK supports text (Google Cloud Storage) and BigQuery connectors
  and has a framework for adding additional sources and sinks.
- Runners: SDK has two pipeline runners: direct runner (in process, local
  execution) and Cloud Dataflow runner for batch pipelines (submit job to
  Google Dataflow service). The current direct runner is bounded only
  (batch execution) but there is work in progress to support unbounded
  (as in Java).
- Testing: The code base has unit test coverage for all the modules and
  several integration and end to end tests (similar in coverage to the
  Java SDK). Streaming is not well tested end to end yet since Cloud
  Dataflow focused first on batch.
- Docs: We have matching Python documentation for the features currently
  supported by Cloud Dataflow. The docs are on cloud.google.com (access
  only by whitelist due to the alpha stage of the project). Devin is
  working on the transition of all docs to Apache.
In the coming days/weeks we would like to prepare and start migrating the
code, and you should start seeing some pull requests. We also hope that the
Beam community will shape the SDK going forward. In particular, all the
model improvements implemented for Java (Runner API, etc.) will have
equivalents in Python once they stabilize. If you have any advice before we
start the journey, please let us know.
The team that will join the Beam effort consists of me (Silviu Calinoiu),
Charles Chen, Ahmet Altay, Chamikara Jayalath, and last but not least
Robert Bradshaw (who is already an Apache Beam committer).
So let us know what you think!
Best regards,
Silviu
--
Jean-Baptiste Onofré
[email protected]
http://blog.nanthrax.net
Talend - http://www.talend.com