Shammon created FLINK-25055:
-------------------------------
Summary: Support listen and notify mechanism for PartitionRequest
Key: FLINK-25055
URL: https://issues.apache.org/jira/browse/FLINK-25055
Project: Flink
Issue Type: Improvement
Components: Runtime / Network
Affects Versions: 1.13.3, 1.12.5, 1.14.0
Reporter: Shammon
We submit batch jobs to flink session cluster with eager scheduler for olap. JM
deploys subtasks to TaskManager independently, and the downstream subtasks may
start before the upstream ones are not ready. The downstream subtask sends
PartitionRequest to upstream ones, and may receive PartitionNotFoundException
from them. Then it will retry to send PartitionRequest after a few ms until
timeout.
The current approach raises two problems. First, there will be too many retry
PartitionRequest messages. Each downstream subtask will send PartitionRequest
to all its upstream subtasks and the total number of messages will be O(N*N),
where N is the parallelism of subtasks. Secondly, the interval between polling
retries will increase the delay for upstream and downstream tasks to confirm
PartitionRequest.
We want to support listen and notify mechanism for PartitionRequest when the
job needs no failover. Upstream TaskManager will add the PartitionRequest to a
listen list with a timeout checker, and notify the request when the task
register its partition in the TaskManager.
[~nkubicek] I noticed that your scenario of using flink is similar to ours.
What do you think? And hope to hear from you [~trohrmann] THX
--
This message was sent by Atlassian Jira
(v8.20.1#820001)