Re: Replacing Python DirectRunner apply_* hooks with PTransformOverrides

2018-02-02 Thread Kenneth Knowles
Awesome, nice!
On Fri, Feb 2, 2018 at 11:00 AM, Charles Chen wrote:
> Thanks Kenn. We already do the Runner API roundtripping (I believe Robert
> implemented this). With this change, we would start doing exactly what
> you're suggesting, where we apply overrides to a

Re: Replacing Python DirectRunner apply_* hooks with PTransformOverrides

2018-02-02 Thread Ahmet Altay
+1 to this change. Thank you, Charles, for improving the DirectRunner, sharing your progress, and seeking feedback. This change would allow us to migrate to a faster DirectRunner for Python, a long-requested feature and an important part of the first-use experience for new users trying out

Re: Replacing Python DirectRunner apply_* hooks with PTransformOverrides

2018-02-02 Thread Charles Chen
Thanks Kenn. We already do the Runner API roundtripping (I believe Robert implemented this). With this change, we would start doing exactly what you're suggesting, where we apply overrides to a post-deserialization pipeline.
On Thu, Feb 1, 2018 at 6:45 PM Kenneth Knowles

Re: Replacing Python DirectRunner apply_* hooks with PTransformOverrides

2018-02-01 Thread Kenneth Knowles
+1 for removing apply_*.
For the Java SDK, removing specialized intercepts was an important first step towards the portability framework. I wonder if there is a way for the Python SDK to leapfrog, taking advantage of some of the lessons that Java learned a bit more painfully. Most pertinent I

Replacing Python DirectRunner apply_* hooks with PTransformOverrides

2018-02-01 Thread Charles Chen
In the Python DirectRunner, we currently use apply_* hooks to override the default .expand() behavior of certain transforms. For example, GroupByKey has a special implementation in the DirectRunner, so we use an apply_* override hook to replace the implementation of
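
For readers skimming the thread, here is a rough, self-contained sketch of the two styles being compared. It is a toy model and not Beam's actual API: Transform, DirectGroupByKey, HookRunner, Override, replace_all and OverrideRunner are all hypothetical names made up for illustration, and the "pipeline" is just a list of transforms applied in order. The hook style dispatches on a method name at apply time, while the override style rewrites the pipeline up front and then runs it with no special cases.

```python
from collections import defaultdict


class Transform:
    """Hypothetical stand-in for a pipeline transform (not Beam's PTransform)."""

    def expand(self, values):
        raise NotImplementedError


class GroupByKey(Transform):
    """Default implementation: group (key, value) pairs by key."""

    def expand(self, pairs):
        grouped = {}
        for key, value in pairs:
            grouped.setdefault(key, []).append(value)
        return sorted(grouped.items())


class DirectGroupByKey(Transform):
    """Hypothetical runner-specific replacement producing the same result."""

    def expand(self, pairs):
        grouped = defaultdict(list)
        for key, value in pairs:
            grouped[key].append(value)
        return sorted((key, values) for key, values in grouped.items())


class HookRunner:
    """Hook style: the runner looks up an apply_<ClassName> method per transform."""

    def apply(self, transform, values):
        hook = getattr(self, "apply_" + type(transform).__name__, None)
        if hook is not None:
            return hook(transform, values)
        return transform.expand(values)

    def apply_GroupByKey(self, transform, values):
        # The special case lives inside the runner's apply path.
        return DirectGroupByKey().expand(values)


class Override:
    """Hypothetical override: a matcher plus a replacement factory."""

    def matches(self, transform):
        raise NotImplementedError

    def get_replacement(self, transform):
        raise NotImplementedError


class GroupByKeyOverride(Override):
    def matches(self, transform):
        return type(transform) is GroupByKey

    def get_replacement(self, transform):
        return DirectGroupByKey()


def replace_all(transforms, overrides):
    """Rewrite the pipeline before execution; the first matching override wins."""
    rewritten = []
    for transform in transforms:
        for override in overrides:
            if override.matches(transform):
                transform = override.get_replacement(transform)
                break
        rewritten.append(transform)
    return rewritten


class OverrideRunner:
    """Override style: substitute transforms first, then run without special cases."""

    overrides = [GroupByKeyOverride()]

    def run(self, transforms, values):
        for transform in replace_all(transforms, self.overrides):
            values = transform.expand(values)
        return values
```

In this sketch the override list is plain data attached to the runner, so the substitution can be applied to any pipeline representation, including one rebuilt after a serialization round trip, without the runner intercepting each apply call.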