Has anyone tested the performance difference between the Python SDK and
the Java SDK on Google Dataflow?

We recently dropped our requirement for sequence files in our pipeline,
which opens the door to using the Python SDK instead of the Java SDK. My
concern is a loss of performance.

In Java we control our serialization very carefully between pipeline steps,
and my fear is losing control of that in Python. So I'm curious about the
speed of serialization of generic Python objects like dictionaries, lists,
tuples, etc. in the context of Dataflow.
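
For a rough local baseline, something like the sketch below could time a serialization round trip for a generic Python object. It assumes the standard-library pickle module as a stand-in; the coder the Python SDK actually picks on Dataflow may behave differently. The sample record and the helper names (round_trip, seconds_per_round_trip) are made up for illustration.

```python
import pickle
import timeit

# Hypothetical sample element; real pipeline elements will differ.
record = {"id": 123, "tags": ["a", "b", "c"], "point": (1.5, 2.5), "name": "example"}


def round_trip(obj):
    """Serialize obj with pickle, then deserialize it and return the copy."""
    return pickle.loads(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


def seconds_per_round_trip(obj, number=10000):
    """Average wall-clock seconds for one pickle round trip of obj."""
    return timeit.timeit(lambda: round_trip(obj), number=number) / number


if __name__ == "__main__":
    micros = seconds_per_round_trip(record) * 1e6
    print(f"{micros:.2f} microseconds per round trip")
```

This only measures pickle in isolation on one machine, so it says nothing about shuffle or worker overhead in an actual Dataflow job.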

Thanks!
Shannon Duncan
