XComp commented on code in PR #19275:
URL: https://github.com/apache/flink/pull/19275#discussion_r845831177
##########
flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java:
##########
@@ -1103,23 +1096,64 @@ private void archiveExecutionGraph(ExecutionGraphInfo
executionGraphInfo) {
executionGraphInfo.getArchivedExecutionGraph().getJobID(),
e);
}
+ }
- // do not create an archive for suspended jobs, as this would
eventually lead to multiple
- // archive attempts which we currently do not support
- if
(executionGraphInfo.getArchivedExecutionGraph().getState().isGloballyTerminalState())
{
- final CompletableFuture<Acknowledge> executionGraphFuture =
-
historyServerArchivist.archiveExecutionGraph(executionGraphInfo);
-
- executionGraphFuture.whenComplete(
- (Acknowledge ignored, Throwable throwable) -> {
- if (throwable != null) {
- log.info(
- "Could not archive completed job {}({}) to
the history server.",
-
executionGraphInfo.getArchivedExecutionGraph().getJobName(),
-
executionGraphInfo.getArchivedExecutionGraph().getJobID(),
- throwable);
- }
- });
+ private CompletableFuture<Acknowledge>
archiveExecutionGraphToHistoryServer(
+ ExecutionGraphInfo executionGraphInfo) {
+
+ final CompletableFuture<Acknowledge> executionGraphFuture =
+ FutureUtils.orTimeout(
+
historyServerArchivist.archiveExecutionGraph(executionGraphInfo),
+
configuration.get(ClusterOptions.CLUSTER_SERVICES_SHUTDOWN_TIMEOUT),
+ TimeUnit.MILLISECONDS,
+ getMainThreadExecutor());
Review Comment:
But wouldn't that be an archiving issue? Looking into
`FsJobArchivist.archiveJob` I don't see why we need to list the files of the
bucket? Where is this expensive operation added? `archiveJob` defines a path
that it writes the JSON file to. No iteration over file objects or anything.
That shouldn't cause any problems in object stores, should it? Even the [delete
in
FsJobArchivist:87](https://github.com/apache/flink/blob/c6997c97c575d334679915c328792b8a3067cfb5/flink-runtime/src/main/java/org/apache/flink/runtime/history/FsJobArchivist.java#L87)
is not recursive and shouldn't be too expensive since it's operating on a
distinct object key (i.e. the path).
To me, it sounds like an issue of the FileSystem implementation. Or am I
missing something here?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]