Age is the largest factor: the Python SDK was started a few years after
the Java one. Another factor is that, until recently, the Python SDK only
worked on Dataflow; thanks to the portability work, a few other runners can
now execute Python pipelines. Now that there are several runners, the
excitement and development pace around Python have sped up significantly.

Improving the documentation to show examples across multiple languages is a
simple way new contributors can really help the project.

On Wed, Jul 10, 2019 at 6:55 AM Shannon Duncan <[email protected]>
wrote:

> So I know going into this question that there will be varying opinions.
> However, I've noticed some things since starting with Beam full time a few
> weeks ago.
>
> 1. Python is a second-party SDK to Beam and doesn't seem to be at feature
> parity with Java.
> 2. Even on supporting modules like fastavro Python still doesn't match up
> with Java features.
> 3. Almost all tutorials and documentation around Beam and Big Data are
> done in Java making it harder to learn the Python side of things.
>
> So with these observations I'm curious. Is it just the age of the Python
> SDK as the reason behind the lack of feature parity?
>
> I'm also curious, are there any noticeable performance differences with
> using the Python SDK vs the Java SDK in Dataflow?
>
> Thanks,
> Shannon
>
