[
https://issues.apache.org/jira/browse/FLINK-15875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dian Fu updated FLINK-15875:
----------------------------
Description:
Currently PyFlink depends on Beam's portability framework for Python UDF
execution. The current dependency version is 2.15.0. We should bump it to 2.19.0
as it includes several critical features/fixes, e.g.
1) BEAM-7951: It allows skipping serialization of the window/timestamp/pane
info between the Java operator and the Python worker, which could improve
performance significantly.
2) BEAM-8935: It allows failing fast if the Python worker fails to start.
Currently it takes 2 minutes to detect that the Python worker failed to start.
3) BEAM-7948: It supports periodically flushing the data between the Java
operator and the Python worker. This feature is especially useful for streaming
jobs and could improve latency.
was:
Currently PyFlink depends on Beam's portability framework for Python UDF
execution. The current dependency version is 2.15.0. We should bump it to 2.19.0
as it includes several critical features/fixes needed, e.g.
1) BEAM-7951: It allows skipping serialization of the window/timestamp/pane
info between the Java operator and the Python worker, which could improve
performance significantly.
2) BEAM-8935: It allows failing fast if the Python worker fails to start.
Currently it takes 2 minutes to detect that the Python worker failed to start.
3) BEAM-7948: It supports periodically flushing the data between the Java
operator and the Python worker. This feature is especially useful for streaming
jobs and could improve latency.
> Bump Beam to 2.19.0
> -------------------
>
> Key: FLINK-15875
> URL: https://issues.apache.org/jira/browse/FLINK-15875
> Project: Flink
> Issue Type: Improvement
> Components: API / Python
> Reporter: Dian Fu
> Priority: Major
> Fix For: 1.11.0
>
>
> Currently PyFlink depends on Beam's portability framework for Python UDF
> execution. The current dependency version is 2.15.0. We should bump it to
> 2.19.0 as it includes several critical features/fixes, e.g.
> 1) BEAM-7951: It allows skipping serialization of the window/timestamp/pane
> info between the Java operator and the Python worker, which could improve
> performance significantly.
> 2) BEAM-8935: It allows failing fast if the Python worker fails to start.
> Currently it takes 2 minutes to detect that the Python worker failed to
> start.
> 3) BEAM-7948: It supports periodically flushing the data between the Java
> operator and the Python worker. This feature is especially useful for
> streaming jobs and could improve latency.
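
To illustrate the idea behind item 3 (BEAM-7948), here is a minimal sketch of a
buffer that flushes either when it is full or when enough time has passed since
the last flush. This is not Beam's implementation: the class name
PeriodicFlushBuffer, its parameters (max_size, max_delay_s, clock) and the sink
callback are all hypothetical names chosen for illustration. For simplicity the
time check only runs when an element is added; a real data channel would
presumably also need a timer so that idle buffers are flushed.

```python
import time
from typing import Any, Callable, List


class PeriodicFlushBuffer:
    """Hypothetical helper: buffers elements and flushes on size or elapsed time."""

    def __init__(
        self,
        sink: Callable[[List[Any]], None],
        max_size: int = 1000,
        max_delay_s: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._sink = sink                # receives each flushed batch
        self._max_size = max_size        # flush when this many elements are buffered
        self._max_delay_s = max_delay_s  # flush when this much time has elapsed
        self._clock = clock              # injectable so the timing can be tested
        self._items: List[Any] = []
        self._last_flush = clock()

    def add(self, item: Any) -> None:
        """Buffer one element, flushing if the size or time limit is reached."""
        self._items.append(item)
        too_big = len(self._items) >= self._max_size
        too_old = self._clock() - self._last_flush >= self._max_delay_s
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        """Send buffered elements to the sink (if any) and reset the timer."""
        if self._items:
            self._sink(self._items)
            self._items = []
        self._last_flush = self._clock()
```

With only a size limit, a slow stream could leave elements sitting in the
buffer indefinitely; the time limit bounds how long an element waits, which is
why this kind of flushing matters for streaming latency.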