Thesharing edited a comment on pull request #19275:
URL: https://github.com/apache/flink/pull/19275#issuecomment-1084449612
> Thanks @Thesharing for your contribution. I looked into it and was wondering whether you also considered utilizing the chaining of the `CompletableFutures` within `handleJobManagerRunnerResult` as a possible solution. Right now (on `master`), `jobReachedTerminalState` archives the `ExecutionGraph` on the main thread, triggers the archiving of the `ExecutionGraph` in the history server if terminated globally, and adds the job to the `JobResultEntry` afterwards (in case of a globally terminated state). In your solution you're passing the result future of the history server archiving through this new class `JobTerminalState` and chaining the history server archiving result later on.
>
> What about making `handleJobManagerRunnerResult` and `jobManagerRunnerFailed` return a `CompletableFuture<CleanupJobState>` that, in the case of a globally terminal job state, completes only after the history server archiving took place and the JobResultStore entry was written. WDYT?

Thank you so much for your review and suggestions, @XComp! 😄

I drew an illustration of the two options. Option 1 chains the result future of the archiving with the result future of the resource cleanup. Option 2 makes `handleJobManagerRunnerResult` and `jobManagerRunnerFailed` return a `CompletableFuture<CleanupJobState>`.

Option 1 can parallelize the two I/O operations. Furthermore, if the archiving takes a long time in the worst case, the job may be terminated by users or external resource providers; in this situation, the job still gets cleaned up. Therefore, I think option 1 is the better choice.
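To make the difference concrete, here is a minimal sketch of how the futures could be wired up in each option. It assumes Option 1 means the archiving future and the cleanup future are only joined at the end (so the two I/O operations can overlap), while Option 2 means the `CleanupJobState` future completes only after archiving and the `JobResultStore` write have finished. The helper methods `archiveToHistoryServer`, `writeJobResultEntry`, and `cleanUpJobData` are placeholders for the corresponding I/O steps, not the actual `Dispatcher` API.

```java
import java.util.concurrent.CompletableFuture;

public class CleanupChainingSketch {

    enum CleanupJobState { GLOBAL, LOCAL }

    // Placeholder for the history-server archiving I/O (hypothetical helper).
    static CompletableFuture<Void> archiveToHistoryServer(String jobId) {
        return CompletableFuture.runAsync(
                () -> System.out.println("archiving " + jobId + " to history server"));
    }

    // Placeholder for writing the JobResultStore entry (hypothetical helper).
    static CompletableFuture<Void> writeJobResultEntry(String jobId) {
        return CompletableFuture.runAsync(
                () -> System.out.println("writing JobResultStore entry for " + jobId));
    }

    // Placeholder for the resource cleanup of the job (hypothetical helper).
    static CompletableFuture<Void> cleanUpJobData(String jobId) {
        return CompletableFuture.runAsync(
                () -> System.out.println("cleaning up resources of " + jobId));
    }

    // Option 1: start archiving and cleanup independently and only join them at
    // the end, so the two I/O operations can run in parallel; if archiving is
    // slow, cleanup still makes progress.
    static CompletableFuture<Void> option1(String jobId) {
        CompletableFuture<Void> archiving = archiveToHistoryServer(jobId);
        CompletableFuture<Void> cleanup =
                writeJobResultEntry(jobId).thenCompose(ignored -> cleanUpJobData(jobId));
        return CompletableFuture.allOf(archiving, cleanup);
    }

    // Option 2: the CleanupJobState future completes only after archiving and
    // the JobResultStore write, and cleanup is chained afterwards.
    static CompletableFuture<Void> option2(String jobId) {
        CompletableFuture<CleanupJobState> cleanupState =
                archiveToHistoryServer(jobId)
                        .thenCompose(ignored -> writeJobResultEntry(jobId))
                        .thenApply(ignored -> CleanupJobState.GLOBAL);
        return cleanupState.thenCompose(state -> cleanUpJobData(jobId));
    }

    public static void main(String[] args) {
        option1("job-1").join();
        option2("job-2").join();
    }
}
```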
