[
https://issues.apache.org/jira/browse/FLINK-15875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dian Fu updated FLINK-15875:
----------------------------
Description:
Currently PyFlink depends on Beam's portability framework for Python UDF
execution. The version currently depended on is 2.15.0. We should bump it to
2.19.0 (the latest version), as it includes several critical features and
fixes, e.g.:
1) BEAM-7951: It allows skipping serialization of the window/timestamp/pane
info between the Java operator and the Python worker, which removes that
per-element overhead and should improve performance.
2) BEAM-8935: It allows failing fast if the Python worker fails to start.
Currently it takes 2 minutes to detect that the Python worker failed to
start.
3) BEAM-7948: It supports periodically flushing the data between the Java
operator and the Python worker. This feature is especially useful for
streaming jobs and could improve latency.
was:
Currently PyFlink depends on Beam's portability framework for Python UDF
execution. The version currently depended on is 2.15.0. We should bump it to
2.19.0, as it includes several critical features and fixes, e.g.:
1) BEAM-7951: It allows skipping serialization of the window/timestamp/pane
info between the Java operator and the Python worker, which removes that
per-element overhead and should improve performance.
2) BEAM-8935: It allows failing fast if the Python worker fails to start.
Currently it takes 2 minutes to detect that the Python worker failed to
start.
3) BEAM-7948: It supports periodically flushing the data between the Java
operator and the Python worker. This feature is especially useful for
streaming jobs and could improve latency.
> Bump Beam to 2.19.0
> -------------------
>
> Key: FLINK-15875
> URL: https://issues.apache.org/jira/browse/FLINK-15875
> Project: Flink
> Issue Type: Improvement
> Components: API / Python
> Reporter: Dian Fu
> Priority: Major
> Fix For: 1.11.0
>
>
> Currently PyFlink depends on Beam's portability framework for Python UDF
> execution. The version currently depended on is 2.15.0. We should bump it to
> 2.19.0 (the latest version), as it includes several critical features and
> fixes, e.g.:
> 1) BEAM-7951: It allows skipping serialization of the window/timestamp/pane
> info between the Java operator and the Python worker, which removes that
> per-element overhead and should improve performance.
> 2) BEAM-8935: It allows failing fast if the Python worker fails to start.
> Currently it takes 2 minutes to detect that the Python worker failed to
> start.
> 3) BEAM-7948: It supports periodically flushing the data between the Java
> operator and the Python worker. This feature is especially useful for
> streaming jobs and could improve latency.
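
To illustrate the periodic-flush idea in item 3 (BEAM-7948), here is a minimal
sketch assuming a simple "flush on size or on elapsed time" policy. The class
name PeriodicFlushBuffer, its parameters, and the sink callback are
hypothetical names chosen for illustration; this is not Beam's or PyFlink's
actual implementation.

```python
import time


class PeriodicFlushBuffer:
    """Buffers elements and flushes on a size limit or a time limit.

    Hypothetical illustration only; not part of Beam or PyFlink.
    """

    def __init__(self, sink, max_size=1000, max_delay_s=1.0,
                 clock=time.monotonic):
        self._sink = sink                # callable receiving a list of items
        self._max_size = max_size        # flush once this many items are buffered
        self._max_delay_s = max_delay_s  # flush once this much time has passed
        self._clock = clock              # injectable clock, eases testing
        self._items = []
        self._last_flush = clock()

    def add(self, item):
        """Buffer one element, then flush if a limit has been reached."""
        self._items.append(item)
        self.maybe_flush()

    def maybe_flush(self):
        """Flush if the buffer is full or the flush interval has elapsed."""
        too_big = len(self._items) >= self._max_size
        too_old = self._clock() - self._last_flush >= self._max_delay_s
        if self._items and (too_big or too_old):
            self.flush()

    def flush(self):
        """Send all buffered items to the sink and reset the timer."""
        if self._items:
            self._sink(list(self._items))
            self._items.clear()
        self._last_flush = self._clock()
```

Without the time limit, a slow stream could leave elements sitting in the
buffer until the size limit is reached, which is why a time-based flush helps
latency for streaming jobs. In a real system a timer would also need to call
maybe_flush() while no new elements arrive; that part is omitted here.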