[
https://issues.apache.org/jira/browse/FLINK-11196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16744975#comment-16744975
]
Stephan Ewen commented on FLINK-11196:
--------------------------------------
I am not sure I understand the problem fully.
The current design was made explicitly to have predictable {{_metadata}} file
locations, so that checkpoints can be manually resumed. That is why the
{{_metadata}} file paths have no entropy.
When you configure {{state.checkpoints.dir}} to
"s3://bucket/checkpoints/ENTROPY_KEY/", you should get files as outlined below,
with predictable paths for the {{_metadata}} files.
- for checkpoint 1
  - {{s3://bucket/checkpoints/chk-1/_metadata}}
  - {{s3://bucket/checkpoints/RANDOM_STUFF/chk-1/state-file-x}}
  - {{s3://bucket/checkpoints/RANDOM_STUFF/chk-1/state-file-y}}
  - ...
- for checkpoint 2
  - {{s3://bucket/checkpoints/chk-2/_metadata}}
  - {{s3://bucket/checkpoints/RANDOM_STUFF/chk-2/state-file-x}}
  - {{s3://bucket/checkpoints/RANDOM_STUFF/chk-2/state-file-y}}
  - ...
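
To make the layout above concrete, here is a rough Python sketch of the path
resolution as I understand it from the listing. It is only an illustration, not
Flink's actual EntropyInjector code: the function name {{resolve_path}}, the
4-character entropy length, and the alphabet are assumptions made for the
example.

```python
import random
import string

ENTROPY_KEY = "ENTROPY_KEY"  # marker as written in the example path above


def resolve_path(base: str, relative: str, inject_entropy: bool, length: int = 4) -> str:
    """Resolve a file path below a checkpoint directory that contains the marker.

    Hypothetical helper for illustration only.
    """
    if inject_entropy:
        # State files: the marker segment becomes a random string.
        alphabet = string.ascii_lowercase + string.digits
        segment = "".join(random.choice(alphabet) for _ in range(length)) + "/"
    else:
        # _metadata files: the marker segment is removed entirely.
        segment = ""
    resolved = base.replace(ENTROPY_KEY + "/", segment, 1)
    return resolved.rstrip("/") + "/" + relative


base = "s3://bucket/checkpoints/ENTROPY_KEY/"
print(resolve_path(base, "chk-1/_metadata", inject_entropy=False))
print(resolve_path(base, "chk-1/state-file-x", inject_entropy=True))
```

The first call always yields {{s3://bucket/checkpoints/chk-1/_metadata}}, which
is what makes manual resume possible; the second yields a different random
segment on each call.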
> Extend S3 EntropyInjector to use key replacement (instead of key removal)
> when creating checkpoint metadata files
> -----------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-11196
> URL: https://issues.apache.org/jira/browse/FLINK-11196
> Project: Flink
> Issue Type: Improvement
> Components: FileSystem
> Affects Versions: 1.7.0
> Reporter: Mark Cho
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> We currently use S3 entropy injection when writing out checkpoint data.
> We also use external checkpoints so that we can resume from a checkpoint
> metadata file later.
> The current implementation of S3 entropy injector makes it difficult to
> locate the checkpoint metadata files since in the newer versions of Flink,
> `state.checkpoints.dir` configuration controls where the metadata and state
> files are written, instead of having two separate paths (one for metadata,
> one for state files).
> With entropy injection, we replace the entropy marker in the path specified
> by `state.checkpoints.dir` with entropy (for state files) or we strip out the
> marker (for metadata files).
>
> We need to extend the entropy injection so that we can replace the entropy
> marker with a predictable path (instead of removing it) so that we can do a
> prefix query for just the metadata files.
> If the entropy key replacement is not set (it defaults to an empty string),
> the behavior stays the same as today (the entropy marker is removed).
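
A minimal Python sketch of the proposal in the issue description, for
illustration only: the marker is replaced with a fixed, predictable segment for
metadata files instead of being stripped. The function name
{{resolve_metadata_path}}, the {{replacement}} parameter, and the value
"metadata" are hypothetical; the issue does not name the configuration option.

```python
def resolve_metadata_path(base: str, relative: str,
                          marker: str = "ENTROPY_KEY",
                          replacement: str = "") -> str:
    """Resolve a _metadata path, substituting the entropy marker.

    Hypothetical helper: an empty replacement (the default) removes the
    marker, matching the behavior described as current in the issue.
    """
    segment = replacement + "/" if replacement else ""
    resolved = base.replace(marker + "/", segment, 1)
    return resolved.rstrip("/") + "/" + relative


base = "s3://bucket/checkpoints/ENTROPY_KEY/"
# Default: marker removed, same as today.
print(resolve_metadata_path(base, "chk-1/_metadata"))
# Proposed: marker replaced by a fixed segment that a prefix query can target.
print(resolve_metadata_path(base, "chk-1/_metadata", replacement="metadata"))
```

With the assumed replacement value, all metadata files would sit under the
single prefix {{s3://bucket/checkpoints/metadata/}}, while state files would
still be spread across random segments.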
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)