[ https://issues.apache.org/jira/browse/FLINK-23754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dian Fu updated FLINK-23754:
----------------------------
Description:
The newly introduced Python DataStream API chaining optimization allows
multiple Python DataStream API operators to be chained together, which avoids
serialization and deserialization and improves performance.
To test this new feature, I recommend following the documentation:
[https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/python/datastream/operators/overview/#operator-chaining|https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/python/datastream/intro_to_datastream_api/]
Note that the documentation PR was just merged, so the page may not be
available yet. If so, you can take a look at the documentation PR instead:
[https://github.com/apache/flink/pull/16953]
The testing should cover, but is not limited to, the following items:
* Chaining can be enabled/disabled as described in the documentation
* Chaining works well for operators with multiple inputs / multiple outputs
* Chaining works well in pure Python DataStream API jobs and in jobs that mix
the Python Table API and the Python DataStream API
was:
The newly introduced Python DataStream API chaining optimization allows
multiple Python DataStream API operators to be chained together, which avoids
serialization and deserialization and improves performance.
To test this new feature, I recommend following the documentation:
[https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/python/datastream/operators/overview/#operator-chaining|https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/python/datastream/intro_to_datastream_api/]
Note that the documentation PR was just merged, so the page may not be
available yet. If so, you can take a look at the documentation PR instead:
https://github.com/apache/flink/pull/16953
> Testing Python DataStream API chaining functionality
> ----------------------------------------------------
>
> Key: FLINK-23754
> URL: https://issues.apache.org/jira/browse/FLINK-23754
> Project: Flink
> Issue Type: Sub-task
> Components: API / Python
> Reporter: Dian Fu
> Priority: Blocker
> Labels: release-testing
> Fix For: 1.14.0
>
>
> The newly introduced Python DataStream API chaining optimization allows
> multiple Python DataStream API operators to be chained together, which
> avoids serialization and deserialization and improves performance.
> To test this new feature, I recommend following the documentation:
> [https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/python/datastream/operators/overview/#operator-chaining|https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/python/datastream/intro_to_datastream_api/]
> Note that the documentation PR was just merged, so the page may not be
> available yet. If so, you can take a look at the documentation PR instead:
> [https://github.com/apache/flink/pull/16953]
>
> The testing should cover, but is not limited to, the following items:
> * Chaining can be enabled/disabled as described in the documentation
> * Chaining works well for operators with multiple inputs / multiple outputs
> * Chaining works well in pure Python DataStream API jobs and in jobs that mix
> the Python Table API and the Python DataStream API
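
For testers who want a mental model of what the chaining optimization saves,
the following is a purely conceptual sketch in plain Python. It is not Flink's
implementation and does not use PyFlink: the helpers `run_unchained` and
`run_chained` are hypothetical names, and `pickle` only stands in for whatever
serialization happens at an operator boundary. It shows that running the same
functions in one chain gives the same results with fewer serialization round
trips.

```python
import pickle


def run_unchained(functions, records):
    """Run each function as its own simulated operator.

    Every hand-off into an operator is modelled as a pickle round trip.
    (Hypothetical helper, for illustration only.)
    """
    round_trips = 0
    for fn in functions:
        next_records = []
        for record in records:
            # Simulated operator boundary: serialize, then deserialize.
            record = pickle.loads(pickle.dumps(record))
            round_trips += 1
            next_records.append(fn(record))
        records = next_records
    return records, round_trips


def run_chained(functions, records):
    """Run all functions inside a single simulated operator.

    (Hypothetical helper, for illustration only.)
    """
    round_trips = 0
    results = []
    for record in records:
        # One round trip on entry to the chain, none between the functions.
        record = pickle.loads(pickle.dumps(record))
        round_trips += 1
        for fn in functions:
            record = fn(record)
        results.append(record)
    return results, round_trips


functions = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
records = [1, 2, 3]

unchained_result, unchained_trips = run_unchained(functions, records)
chained_result, chained_trips = run_chained(functions, records)

print(unchained_result, unchained_trips)
print(chained_result, chained_trips)
```

Under these assumptions, the number of round trips in the unchained case grows
with the number of operators, while the chained case pays once per record. A
real test of this issue should still compare the job graph and results of an
actual PyFlink job with chaining enabled and disabled.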