Julien Tournay created FLINK-31144:
--------------------------------------
Summary: Slow scheduling on large-scale batch jobs
Key: FLINK-31144
URL: https://issues.apache.org/jira/browse/FLINK-31144
Project: Flink
Issue Type: Bug
Components: Runtime / Coordination
Reporter: Julien Tournay
Attachments: flink-1.17-snapshot-1676473798013.nps
When executing a complex job graph at high parallelism
`DefaultPreferredLocationsRetriever.getPreferredLocationsBasedOnInputs` can get
slow and cause long pauses where the JobManager becomes unresponsive and all
the taskmanagers just wait. I've attached a VisualVM snapshot to illustrate the
problem.[^flink-1.17-snapshot-1676473798013.nps]
At Spotify we have complex jobs where this issue can cause batch "pause" of 40+
minutes and make the overall execution 30% slower or more.
More importantly this prevent us from running said jobs on larger cluster as
adding resources to the cluster worsen the issue.
We have successfully tested a test where
`DefaultPreferredLocationsRetriever.getPreferredLocationsBasedOnInputs` was
completely commented and simply returns an empty collection and confirmed it
solves the issue.
In the same spirit as a recent change
([https://github.com/apache/flink/blob/43f419d0eccba86ecc8040fa6f521148f1e358ff/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/DefaultPreferredLocationsRetriever.java#L98-L102)]
there could be a mechanism in place to detect when Flink run into this
specific issue and just skip the call to `getInputLocationFutures`
[https://github.com/apache/flink/blob/43f419d0eccba86ecc8040fa6f521148f1e358ff/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/DefaultPreferredLocationsRetriever.java#L105-L108.]
I'm not familiar enough with the internals of Flink to propose a more advanced
fix, however it seems like a configurable threshold on the number of consumer
vertices above which the preferred location is not computed would do. If this
solution is good enough, I'd be happy to submit a PR.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)