[
https://issues.apache.org/jira/browse/FLINK-13633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Till Rohrmann resolved FLINK-13633.
-----------------------------------
Fix Version/s: 1.10.0
Release Note: All highly available artifacts stored by Apache Flink will
now be stored under `HA_STORAGE_DIR/HA_CLUSTER_ID` with `HA_STORAGE_DIR`
configured by `high-availability.storageDir` and `HA_CLUSTER_ID` configured by
`high-availability.cluster-id`.
Resolution: Done
Done via
8393c9670246c28adc4a254d3d486c8a9857a182
96563401b9924cd8800360bdbce93230b921e1ac
> Move submittedJobGraph and completedCheckpoint to cluster-id subdirectory of
> high-availability storage
> -------------------------------------------------------------------------------------------------------
>
> Key: FLINK-13633
> URL: https://issues.apache.org/jira/browse/FLINK-13633
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Reporter: Yang Wang
> Assignee: Yang Wang
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.10.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently, when high availability is enabled, the HA storage directory
> is structured as shown below. The submittedJobGraph and completedCheckpoint
> directories are stored directly under the HA storage path. This is fine when
> the Flink cluster finishes normally. However, when the YARN application fails
> or is killed, the submittedJobGraph and completedCheckpoint files remain there
> forever, and we cannot even tell which Flink cluster (YARN application) they
> belong to. I therefore suggest moving them into a per-application
> subdirectory, so that external tools can be used to clean up these residual
> files.
> We also need to do a best-effort clean-up before the Flink cluster finishes.
> Current HA storage directory structure:
> {code:java}
> └── <high-availability.storageDir>
>     ├── submittedJobGraph
>     │   ├── <jobgraph1> (randomly named)
>     │   └── <jobgraph2> (randomly named)
>     ├── completedCheckpoint
>     │   ├── <checkpoint1> (randomly named)
>     │   ├── <checkpoint2> (randomly named)
>     │   └── <checkpoint3> (randomly named)
>     └── <high-availability.cluster-id>
>         └── blob
>             └── <blob1> (named as [no_job|job_<job-id>]/blob_<blob-key>)
> {code}
>
> New HA storage directory structure:
> {code:java}
> └── <high-availability.storageDir>
>     └── <high-availability.cluster-id>
>         ├── submittedJobGraph
>         │   ├── <jobgraph1> (randomly named)
>         │   └── <jobgraph2> (randomly named)
>         ├── completedCheckpoint
>         │   ├── <checkpoint1> (randomly named)
>         │   ├── <checkpoint2> (randomly named)
>         │   └── <checkpoint3> (randomly named)
>         └── blob
>             └── <blob1> (named as [no_job|job_<job-id>]/blob_<blob-key>)
> {code}
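The description mentions that external tools could clean up residual files once everything lives under a per-cluster subdirectory. A minimal sketch of what such a tool might look like follows. It is not part of Flink: the function names `cluster_ha_dir` and `find_residual_cluster_dirs` are hypothetical, the list of active cluster IDs is assumed to come from elsewhere (for example, from the YARN ResourceManager), and it assumes the HA storage directory is reachable as a local path, whereas real deployments typically use HDFS or an object store.

```python
from pathlib import Path
from typing import Iterable, List

# Subdirectory names as they appear in the directory listing of the issue.
HA_SUBDIRS = ("submittedJobGraph", "completedCheckpoint", "blob")


def cluster_ha_dir(storage_dir: str, cluster_id: str) -> Path:
    """Return HA_STORAGE_DIR/HA_CLUSTER_ID, the per-cluster root described in the release note."""
    return Path(storage_dir) / cluster_id


def find_residual_cluster_dirs(
    storage_dir: str, active_cluster_ids: Iterable[str]
) -> List[Path]:
    """List per-cluster directories whose cluster ID is not in the active set.

    Hypothetical helper: only reports candidates and deletes nothing.
    Top-level directories from the old layout (e.g. submittedJobGraph) are
    skipped, because they cannot be attributed to a single cluster.
    """
    root = Path(storage_dir)
    if not root.is_dir():
        return []
    active = set(active_cluster_ids)
    return sorted(
        child
        for child in root.iterdir()
        if child.is_dir()
        and child.name not in active
        and child.name not in HA_SUBDIRS
    )
```

A caller would pass the value of `high-availability.storageDir` together with the IDs of the clusters still running, then review the returned directories before removing anything.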
--
This message was sent by Atlassian Jira
(v8.3.2#803003)