Matthias Pohl created FLINK-31709:
-------------------------------------

             Summary: JobResultStore and ExecutionGraphInfoStore could be merged
                 Key: FLINK-31709
                 URL: https://issues.apache.org/jira/browse/FLINK-31709
             Project: Flink
          Issue Type: New Feature
          Components: Runtime / Coordination
            Reporter: Matthias Pohl

This is an initial proposal for an improvement in the coordination layer:

The {{JobResultStore}} (JRS) was introduced as part of [FLIP-194|https://cwiki.apache.org/confluence/display/FLINK/FLIP-194%3A+introduce+the+jobresultstore]. For now, it only stores the JobResult. Through the JRS, jobs can be marked as finished even when the JobManager fails and the information from the {{ExecutionGraphInfoStore}} is lost (see FLINK-11813).

While implementing {{FLIP-194}}, it became apparent that we have some redundancy between the JRS and the {{ExecutionGraphInfoStore}}. Both components store metadata about a finished job. The {{ExecutionGraphInfoStore}} is used to make information about the finished job available in user-facing APIs (REST, web UI). The JRS is used to expose the job's state to the cleanup logic and stores only limited data.

This proposal is about merging the two and making the {{ArchivedExecutionGraph}} information available even after a JobManager is restarted. That way, completed jobs could still be listed in the job overview after a Flink cluster restart.

Additionally, we could provide the last checkpoint information. The JRS would be a way to access this information even after the Flink cluster is shut down. This would also be a way to improve the Flink Kubernetes Operator's latest-state handling.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)