[
https://issues.apache.org/jira/browse/FLINK-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519308#comment-14519308
]
ASF GitHub Bot commented on FLINK-1927:
---------------------------------------
Github user zentol commented on the pull request:
https://github.com/apache/flink/pull/638#issuecomment-97418740
Oh snap, I just noticed a big flaw... well, let's put this PR on hold for a
bit.
I'm simply re-executing the plan file on each node, but forgot to deal with
arguments that were passed to the file -.-
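
To illustrate the problem: if each node re-runs the plan file, the arguments originally given to it have to be re-supplied as well. A minimal sketch of one way that could look, assuming the arguments have already been shipped to the node somehow. The helper name `rerun_plan` and the demo plan file are hypothetical and are not taken from the PR:

```python
import os
import runpy
import sys
import tempfile


def rerun_plan(plan_path, plan_args):
    """Re-execute a plan file with the arguments it was originally given.

    Hypothetical helper: how plan_args reach the node is out of scope here.
    """
    saved_argv = sys.argv
    # The plan file reads its arguments from sys.argv, so restore them first.
    sys.argv = [plan_path] + list(plan_args)
    try:
        # Runs the file as if it were the main script, returns its globals.
        return runpy.run_path(plan_path, run_name="__main__")
    finally:
        sys.argv = saved_argv


# Demo with a throwaway plan file that only records the arguments it sees.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write("import sys\nseen_args = sys.argv[1:]\n")
    demo_path = f.name

try:
    result = rerun_plan(demo_path, ["--input", "data.txt"])
finally:
    os.remove(demo_path)
```

Without the `sys.argv` step, the re-executed plan would see the worker process's own arguments instead of the user's, which is the flaw described above.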
> [Py] Rework operator distribution
> ---------------------------------
>
> Key: FLINK-1927
> URL: https://issues.apache.org/jira/browse/FLINK-1927
> Project: Flink
> Issue Type: Improvement
> Components: Python API
> Affects Versions: 0.9
> Reporter: Chesnay Schepler
> Assignee: Chesnay Schepler
> Priority: Minor
> Fix For: 0.9
>
>
> Currently, the python operator is created when executing the python plan
> file, serialized using dill and saved as a byte[] in the java function. It is
> then deserialized at runtime on each node.
> The current implementation is fairly hacky, and imposes certain limitations
> that make it hard to work with. Chaining, or generally saving other
> user-code, always requires a separate deserialization step after
> deserializing the operator.
> These issues can be easily circumvented by rebuilding the (python) plan on
> each node, instead of serializing the operator. The plan creation is
> deterministic, and every operator is uniquely identified by an ID that is
> already known to the java function.
> This change will allow us to easily support custom serializers.
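
The rebuild-by-ID idea in the description can be sketched roughly as follows. This is only an illustration under assumptions: the `Plan` class, `add_operator`, `lookup`, and `build_plan` are hypothetical names, not the actual Python API. The point is that a deterministic plan assigns the same ID to the same operator on every run, so a node only needs the ID and not a serialized operator:

```python
class Plan:
    """Toy plan that hands out operator IDs in creation order (hypothetical)."""

    def __init__(self):
        self._operators = {}
        self._next_id = 0

    def add_operator(self, func):
        # IDs depend only on the order of calls, so they are reproducible.
        op_id = self._next_id
        self._next_id += 1
        self._operators[op_id] = func
        return op_id

    def lookup(self, op_id):
        return self._operators[op_id]


def build_plan():
    """Stands in for executing the user's plan file."""
    plan = Plan()
    plan.add_operator(lambda x: x * 2)  # gets ID 0 on every run
    plan.add_operator(lambda x: x + 1)  # gets ID 1 on every run
    return plan


# The "client" builds the plan once and ships only the operator ID.
client_plan = build_plan()
shipped_id = 1

# Each "node" rebuilds the plan and picks its operator by that ID.
node_plan = build_plan()
operator = node_plan.lookup(shipped_id)
```

Because nothing is pickled, user code referenced by an operator (chained functions, custom serializers) is simply recreated along with the plan.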