[
https://issues.apache.org/jira/browse/FLINK-32098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722712#comment-17722712
]
Weijie Guo commented on FLINK-32098:
------------------------------------
Perhaps it is possible to maintain a simple cache in the
{{FileSystemJobResultStore}}, but it is uncertain whether this would have
sufficient benefits, given that it would introduce some additional complexity.
I will check whether reducing the number of {{isInGloballyTerminalState}} calls
in the {{Dispatcher}} is enough to solve the problem.
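
A minimal sketch of the second option (fewer calls), not the actual Flink
code: the class, the string return values and the "finished-" naming rule
below are hypothetical stand-ins. The idea is only that submitJob evaluates
the terminal-state check once and passes the boolean on to the duplicate
check instead of re-querying the store.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the submission path; not Flink's Dispatcher. */
class SubmitJobSketch {

    /** Counts how often the (potentially IO-heavy) check is invoked. */
    static final AtomicInteger terminalStateChecks = new AtomicInteger();

    /** Stand-in for Dispatcher#isInGloballyTerminalState; a real one would query the JobResultStore. */
    static boolean isInGloballyTerminalState(String jobId) {
        terminalStateChecks.incrementAndGet();
        // Assumption for the demo only: finished jobs are marked by their id prefix.
        return jobId.startsWith("finished-");
    }

    /** Stand-in for Dispatcher#isDuplicateJob that reuses an already computed result. */
    static boolean isDuplicateJob(boolean globallyTerminal, boolean alreadyRunning) {
        return globallyTerminal || alreadyRunning;
    }

    static String submitJob(String jobId, boolean alreadyRunning) {
        // Evaluate the expensive check exactly once and pass the result along.
        final boolean globallyTerminal = isInGloballyTerminalState(jobId);
        if (isDuplicateJob(globallyTerminal, alreadyRunning)) {
            return globallyTerminal ? "REJECTED_ALREADY_FINISHED" : "REJECTED_STILL_RUNNING";
        }
        return "ACCEPTED";
    }

    public static void main(String[] args) {
        System.out.println(submitJob("job-1", false));
        System.out.println("terminal-state checks: " + terminalStateChecks.get());
    }
}
```

With this shape each submission costs one store lookup regardless of which
branch is taken, so no cache (and no cache invalidation) is needed.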
> Dispatcher#submitJob calls Dispatcher#isInGloballyTerminalState up to three
> times which might be expensive due to IO
> --------------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-32098
> URL: https://issues.apache.org/jira/browse/FLINK-32098
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.17.0, 1.16.1, 1.18.0
> Reporter: Matthias Pohl
> Priority: Major
>
> {{Dispatcher#submitJob}} calls {{Dispatcher#isInGloballyTerminalState}} up to
> three times (1x through {{Dispatcher#isDuplicateJob}} and 2x directly) which
> calls {{JobResultStore#hasJobResultEntry}}. {{hasJobResultEntry}} calls
> {{hasDirtyJobResultEntry}} and {{hasCleanJobResultEntry}} if the underlying
> job hasn't completed globally yet. Both calls run {{FileSystem#exists}} on
> a non-existent file, which can be quite an expensive operation (depending on
> the {{FileSystem}} implementation for object storage) since it might require
> a full table scan.
> To be honest, nobody has complained so far. But we might want to either
> reconsider the {{FileSystemJobResultStore}}/{{JobResultStore#hasJobResultEntry}}
> implementation or, at least, reduce the number of
> {{isInGloballyTerminalState}} calls in the {{Dispatcher}} and document the
> performance issue in the JavaDoc.
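
To make the cost described in the issue concrete, here is a small counting
sketch. It is a model under assumptions, not the FileSystemJobResultStore
itself: the in-memory file system and the file names are hypothetical. For a
job without any result entry, each check probes both the dirty and the clean
path, so three checks per submission add up to six exists() calls.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the lookup chain described in the issue. */
class ExistsCallCountSketch {

    /** In-memory stand-in for a file system that counts exists() probes. */
    static final class CountingFileSystem {
        final Set<String> files = new HashSet<>();
        int existsCalls = 0;

        boolean exists(String path) {
            existsCalls++;
            return files.contains(path);
        }
    }

    final CountingFileSystem fs = new CountingFileSystem();

    // File names are assumptions made for this sketch only.
    boolean hasDirtyJobResultEntry(String jobId) {
        return fs.exists(jobId + "_DIRTY.json");
    }

    boolean hasCleanJobResultEntry(String jobId) {
        return fs.exists(jobId + ".json");
    }

    /** Probes the dirty entry first; the clean entry is only probed if the dirty one is missing. */
    boolean hasJobResultEntry(String jobId) {
        return hasDirtyJobResultEntry(jobId) || hasCleanJobResultEntry(jobId);
    }

    /** Returns how many exists() probes the given number of terminal-state checks causes. */
    int probesForSubmission(String jobId, int terminalStateChecks) {
        final int before = fs.existsCalls;
        for (int i = 0; i < terminalStateChecks; i++) {
            hasJobResultEntry(jobId);
        }
        return fs.existsCalls - before;
    }

    public static void main(String[] args) {
        ExistsCallCountSketch store = new ExistsCallCountSketch();
        System.out.println("three checks: " + store.probesForSubmission("job-1", 3));
        System.out.println("one check: " + store.probesForSubmission("job-1", 1));
    }
}
```

The common case of a fresh job is the worst case in this model, because
neither probe can short-circuit.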
--
This message was sent by Atlassian Jira
(v8.20.10#820010)