[
https://issues.apache.org/jira/browse/FLINK-32098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722684#comment-17722684
]
Matthias Pohl commented on FLINK-32098:
---------------------------------------
I'm linking FLINK-27204 which covers moving all the JobResultStore method calls
into async calls.
> Dispatcher#submitJob calls Dispatcher#isInGloballyTerminalState up to three
> times which might be expensive due to IO
> --------------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-32098
> URL: https://issues.apache.org/jira/browse/FLINK-32098
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.17.0, 1.16.1, 1.18.0
> Reporter: Matthias Pohl
> Priority: Major
>
> {{Dispatcher#submitJob}} calls {{Dispatcher#isInGloballyTerminalState}} up to
> three times (1x through {{Dispatcher#isDuplicateJob}} and 2x directly), which
> in turn calls {{JobResultStore#hasJobResultEntry}}. {{hasJobResultEntry}} calls
> {{hasDirtyJobResultEntry}} and {{hasCleanJobResultEntry}} if the underlying
> job hasn't completed globally yet. Both calls run {{FileSystem#exists}} on
> a non-existing file, which can be quite an expensive operation (depending on
> the {{FileSystem}} implementation for object storage) since it might require
> a full table scan.
> tbh, so far, nobody complained. But we might want to either reconsider the
> {{FileSystemJobResultStore}}/{{JobResultStore#hasJobResultEntry}}
> implementation or, at least, reduce the number of
> {{isInGloballyTerminalState}} calls in the {{Dispatcher}} and document the
> performance issue in the JavaDoc.
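
One way to read the "reduce the number of calls" suggestion is to query the store once per submission and pass the answer on to every later check. The following is only a minimal, self-contained sketch of that idea, not Flink's actual code: CountingJobResultStore, SketchDispatcher, the String job IDs and the returned status strings are all hypothetical stand-ins, and the real Dispatcher works with JobID and asynchronous results.

```java
import java.util.HashSet;
import java.util.Set;

public class TerminalStateCheckSketch {

    /** Hypothetical stand-in for a JobResultStore whose lookups are expensive. */
    static class CountingJobResultStore {
        private final Set<String> finishedJobs = new HashSet<>();
        int lookups = 0;

        void markFinished(String jobId) {
            finishedJobs.add(jobId);
        }

        boolean hasJobResultEntry(String jobId) {
            // In the scenario from the issue, each call would trigger
            // FileSystem#exists probes; here we only count the calls.
            lookups++;
            return finishedJobs.contains(jobId);
        }
    }

    /** Hypothetical, heavily simplified dispatcher (not Flink's Dispatcher). */
    static class SketchDispatcher {
        private final CountingJobResultStore store;
        private final Set<String> runningJobs = new HashSet<>();

        SketchDispatcher(CountingJobResultStore store) {
            this.store = store;
        }

        String submitJob(String jobId) {
            // Query the store exactly once and reuse the answer below.
            final boolean globallyTerminal = store.hasJobResultEntry(jobId);

            if (isDuplicateJob(jobId, globallyTerminal)) {
                return globallyTerminal ? "DUPLICATE_FINISHED" : "DUPLICATE_RUNNING";
            }
            runningJobs.add(jobId);
            return "ACCEPTED";
        }

        // Takes the cached flag as a parameter instead of asking the store again.
        private boolean isDuplicateJob(String jobId, boolean globallyTerminal) {
            return globallyTerminal || runningJobs.contains(jobId);
        }
    }

    public static void main(String[] args) {
        CountingJobResultStore store = new CountingJobResultStore();
        SketchDispatcher dispatcher = new SketchDispatcher(store);

        System.out.println(dispatcher.submitJob("job-a"));
        System.out.println("store lookups so far: " + store.lookups);
    }
}
```

In this sketch every submission costs one store lookup regardless of which branch is taken. Whether the same shortcut is safe in the real Dispatcher depends on whether the terminal state can change between the individual checks, which the sketch does not model.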