[ https://issues.apache.org/jira/browse/FLINK-22913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dian Fu closed FLINK-22913.
---------------------------
Resolution: Done
> Support Python UDF chaining in Python DataStream API
> ----------------------------------------------------
>
> Key: FLINK-22913
> URL: https://issues.apache.org/jira/browse/FLINK-22913
> Project: Flink
> Issue Type: Improvement
> Components: API / Python
> Reporter: Dian Fu
> Assignee: Dian Fu
> Priority: Major
> Fix For: 1.14.0
>
>
> Currently, for the following job:
> {code}
> ds = ..
> ds.map(map_func1) \
>   .map(map_func2)
> {code}
> The Python functions `map_func1` and `map_func2` run in separate Python
> workers: the result of `map_func1` is transferred to the JVM and then
> on to `map_func2`, which may reside in another Python worker. This
> introduces redundant communication and serialization/deserialization overhead.
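
To make the overhead described above concrete, here is a minimal pure-Python sketch. It is not Flink code and does not show how the change is implemented. `CountingChannel`, `run_unchained`, and `run_chained` are hypothetical names, the bodies of `map_func1` and `map_func2` are placeholders, and a pickle round trip stands in for the Python worker/JVM boundary.

```python
import pickle


def map_func1(x):
    # Placeholder body; the issue does not say what the function does.
    return x + 1


def map_func2(x):
    # Placeholder body; the issue does not say what the function does.
    return x * 2


class CountingChannel:
    """Hypothetical stand-in for the Python worker <-> JVM boundary."""

    def __init__(self):
        self.round_trips = 0

    def transfer(self, value):
        # Every crossing costs one serialize/deserialize round trip.
        self.round_trips += 1
        return pickle.loads(pickle.dumps(value))


def run_unchained(records, funcs, channel):
    # Each function is a separate operator: its input crosses the
    # boundary on the way in and its result crosses it on the way out.
    out = []
    for value in records:
        for func in funcs:
            value = channel.transfer(func(channel.transfer(value)))
        out.append(value)
    return out


def run_chained(records, funcs, channel):
    # All functions run back to back in one worker: only the first
    # input and the final result cross the boundary.
    out = []
    for record in records:
        value = channel.transfer(record)
        for func in funcs:
            value = func(value)
        out.append(channel.transfer(value))
    return out


if __name__ == "__main__":
    funcs = [map_func1, map_func2]
    unchained, chained = CountingChannel(), CountingChannel()
    print(run_unchained([1, 2, 3], funcs, unchained), unchained.round_trips)
    print(run_chained([1, 2, 3], funcs, chained), chained.round_trips)
```

Both variants produce the same results. In this toy model the chained variant makes two boundary crossings per record regardless of how many functions are chained, while the unchained variant makes two per function per record.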