Dian Fu created FLINK-22913:
-------------------------------

             Summary: Support Python UDF chaining in Python DataStream API
                 Key: FLINK-22913
                 URL: https://issues.apache.org/jira/browse/FLINK-22913
             Project: Flink
          Issue Type: Improvement
          Components: API / Python
            Reporter: Dian Fu
             Fix For: 1.14.0

Currently, for the following job:
{code}
ds = ..
ds.map(map_func1)
  .map(map_func2)
{code}

The Python functions `map_func1` and `map_func2` run in separate Python workers. The result of `map_func1` is transferred to the JVM and then on to `map_func2`, which may reside in another Python worker. This introduces redundant communication and serialization/deserialization overhead. Chaining the two functions would avoid it.
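
To make the overhead described above concrete, here is a minimal pure-Python sketch. It is not PyFlink code and does not reflect how Flink implements chaining. The helpers `run_unchained`, `run_chained`, and `chain` are hypothetical names introduced for illustration, the bodies of `map_func1` and `map_func2` are made up, and `pickle` merely stands in for the Python worker / JVM serialization boundary.

```python
import pickle


def map_func1(value):
    # Hypothetical body; the issue does not say what map_func1 does.
    return value + 1


def map_func2(value):
    # Hypothetical body; the issue does not say what map_func2 does.
    return value * 2


def run_unchained(values, funcs):
    """Treat each function as its own worker.

    Every intermediate result is serialized and deserialized, which stands in
    for the hop from one Python worker to the JVM and on to the next worker.
    """
    round_trips = 0
    results = []
    for value in values:
        for func in funcs:
            value = pickle.loads(pickle.dumps(func(value)))
            round_trips += 1
        results.append(value)
    return results, round_trips


def chain(*funcs):
    """Compose the functions so they can run back to back in one worker."""
    def chained(value):
        for func in funcs:
            value = func(value)
        return value
    return chained


def run_chained(values, funcs):
    """Run the composed function; only the final result crosses the boundary."""
    fused = chain(*funcs)
    round_trips = 0
    results = []
    for value in values:
        value = pickle.loads(pickle.dumps(fused(value)))
        round_trips += 1
        results.append(value)
    return results, round_trips


if __name__ == "__main__":
    data = [1, 2, 3]
    print(run_unchained(data, [map_func1, map_func2]))  # ([4, 6, 8], 6)
    print(run_chained(data, [map_func1, map_func2]))    # ([4, 6, 8], 3)
```

Both variants produce the same results, but under these assumptions the chained version crosses the simulated boundary once per record instead of once per function per record.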