[ 
https://issues.apache.org/jira/browse/FLINK-17583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106402#comment-17106402
 ] 

Stephan Ewen commented on FLINK-17583:
--------------------------------------

Thank you for digging into this and the thorough suggestion.

I need to think a bit about this - it also has some tricky implications with 
another ongoing effort to make savepoints (and non-incremental checkpoint) 
paths relative so that one can copy them around: [FLINK-5763]. The current 
design for that issue assumes that all "exclusive" data lives under the 
same parent path and can thus be addressed relative to the metadata 
location.


> Allow option to store a savepoint's _metadata file separate from its data 
> files
> -------------------------------------------------------------------------------
>
>                 Key: FLINK-17583
>                 URL: https://issues.apache.org/jira/browse/FLINK-17583
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.9.1
>            Reporter: Steve Bairos
>            Priority: Minor
>
> (In the description I mainly talk about savepoints, but the plan would 
> apply to checkpoints as well.)
> We have a deployment framework that often needs to be able to return a list 
> of valid savepoints in S3 with a certain prefix. Our assertion is that if an 
> S3 object ends with '_metadata', then it is a valid savepoint. In order to 
> generate the list of valid savepoints, we need to locate all of the _metadata 
> files that start with a given prefix.
> For example, if our S3 bucket's paths look like this:
>  
> {code:java}
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-1a2b3c4d5e/_metadata
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-1a2b3c4d5e/9c165546-c326-43c0-9f47-f9a2cfd000ed
> ... thousands of other savepoint data files
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-1a2b3c4d5e/9c757e5b-92b7-47b8-bfe8-cfe70eb28702
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-9999999999/_metadata
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-9999999999/41297fd5-40df-4683-bfb6-534bfddae92a
> ... thousands of other savepoint data files
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-9999999999/acbe839a-1ec7-4b41-9d87-595d557c2ac6
> s3://bucket/savepoints/my-job1/2020-04-02/savepoint-987654-1100110011/_metadata
> s3://bucket/savepoints/my-job1/2020-04-02/savepoint-987654-1100110011/2d2f5551-56a7-4fea-b25b-b0156660c650
> ... thousands of other savepoint data files
> s3://bucket/savepoints/my-job1/2020-04-02/savepoint-987654-1100110011/c8c410df-5fb0-46a0-84c5-43e1575e8dc5
> ... dozens of other savepoint dirs
> {code}
>  
> To get a list of all savepoints that my-job1 could be restored from, we 
> would want every savepoint under the prefix:
> {code:java}
> s3://bucket/savepoints/my-job1 {code}
> Ideally, we would want to have the ability to get a list like this from S3:
> {code:java}
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-1a2b3c4d5e/_metadata
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-9999999999/_metadata
> s3://bucket/savepoints/my-job1/2020-04-02/savepoint-987654-1100110011/_metadata{code}
> Unfortunately there is no easy way to get this list because S3's API only 
> allows searching by prefix, not by suffix. Listing all objects 
> with the prefix 's3://bucket/savepoints/my-job1' and then filtering the list 
> to only include the files that contain _metadata will also not work because 
> there are thousands of savepoint data files that have the same prefix such as:
> {code:java}
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-1a2b3c4d5e/9c165546-c326-43c0-9f47-f9a2cfd000ed
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-1a2b3c4d5e/9c757e5b-92b7-47b8-bfe8-cfe70eb28702
> s3://bucket/savepoints/my-job1/2020-04-01/savepoint-123456-9999999999/acbe839a-1ec7-4b41-9d87-595d557c2ac6
> etc.{code}
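The cost of the prefix-only listing above can be sketched as follows. This is an illustrative Python snippet (paths and the helper function are hypothetical, not Flink or AWS code): with S3, every data file under the job prefix must be paged through on the client just to keep the handful of keys ending in `_metadata`.

```python
# Sketch of the client-side filtering the issue describes. S3 can only
# list by prefix, so every object under savepoints/my-job1/ -- thousands
# of data files per savepoint -- must be scanned to find the metadata.

def find_metadata_keys(all_keys):
    """Return only the keys that end with '/_metadata'."""
    return [k for k in all_keys if k.endswith("/_metadata")]

# Abbreviated stand-in for a full prefix listing (illustrative keys).
keys = [
    "savepoints/my-job1/2020-04-01/savepoint-123456-1a2b3c4d5e/_metadata",
    "savepoints/my-job1/2020-04-01/savepoint-123456-1a2b3c4d5e/9c165546-c326-43c0-9f47-f9a2cfd000ed",
    "savepoints/my-job1/2020-04-01/savepoint-123456-9999999999/_metadata",
    "savepoints/my-job1/2020-04-01/savepoint-123456-9999999999/41297fd5-40df-4683-bfb6-534bfddae92a",
]
print(find_metadata_keys(keys))
```

In the real bucket the non-metadata keys number in the thousands per savepoint, so the filter discards almost everything it pages through.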
>  
> I propose adding a configuration option, in a similar vein to the S3 entropy 
> injector, that lets us store the _metadata file in a separate path from the 
> savepoint's data files. For example, with this hypothetical configuration:
> {code:java}
> state.checkpoints.split.key: _datasplit_
> state.checkpoints.split.metadata.dir: metadata
> state.checkpoints.split.data.dir: data{code}
> When a user triggers a savepoint with the path
> {code:java}
> s3://bucket/savepoints/_datasplit_/my-job1/2020-05-07/ {code}
> the resulting savepoint looks like this:
> {code:java}
> s3://bucket/savepoints/metadata/my-job1/2020-05-07/savepoint-654321-abcdef9876/_metadata
> s3://bucket/savepoints/data/my-job1/2020-05-07/savepoint-654321-abcdef9876/a50fc483-3581-4b55-a37e-b7c61b3ee47f
> s3://bucket/savepoints/data/my-job1/2020-05-07/savepoint-654321-abcdef9876/b0c6b7c0-6b94-43ae-8678-2f7640af1523
> s3://bucket/savepoints/data/my-job1/2020-05-07/savepoint-654321-abcdef9876/c1855b35-c0b7-4347-9352-88423998e5ec{code}
> Notice that the metadata's prefix is 
> {code:java}
>  s3://bucket/savepoints/metadata/my-job1/2020-05-07/{code}
> and the data files' prefix is
> {code:java}
>  s3://bucket/savepoints/data/my-job1/2020-05-07/{code}
> That way, if I want to list all the savepoints for my-job1, I can simply 
> list the objects under the metadata prefix:
> {code:java}
>  aws s3 ls --recursive s3://bucket/savepoints/metadata/my-job1/{code}
> And I can get a clean list of just the _metadata files easily.
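The intended substitution can be sketched as below. The `state.checkpoints.split.*` keys and the `_datasplit_` placeholder come from the proposal above; the resolver function itself is purely illustrative, not Flink code.

```python
# Hypothetical resolution of the proposed split key: the placeholder in
# the configured savepoint path is replaced with either the metadata
# directory or the data directory, giving the two prefixes shown above.

SPLIT_KEY = "_datasplit_"   # state.checkpoints.split.key (proposed)
METADATA_DIR = "metadata"   # state.checkpoints.split.metadata.dir (proposed)
DATA_DIR = "data"           # state.checkpoints.split.data.dir (proposed)

def resolve(base_path, for_metadata):
    """Replace the split key with the metadata or data directory."""
    target = METADATA_DIR if for_metadata else DATA_DIR
    return base_path.replace(SPLIT_KEY, target)

base = "s3://bucket/savepoints/_datasplit_/my-job1/2020-05-07/"
print(resolve(base, True))   # s3://bucket/savepoints/metadata/my-job1/2020-05-07/
print(resolve(base, False))  # s3://bucket/savepoints/data/my-job1/2020-05-07/
```

A simple string substitution like this is also how Flink's entropy injection resolves its placeholder, which is why the proposal describes itself as being "in a similar vein".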
>  
> One alternative we've considered is entropy injection. It does technically 
> separate the _metadata file from the rest of the data, but it makes a mess 
> of entropy dirs in S3, so it's not our ideal choice. 
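For context on why entropy injection separates the files but scatters the data, here is a sketch of its behavior as documented for Flink's S3 file systems (`s3.entropy.key`, `s3.entropy.length`): the placeholder is replaced with random characters for data files but stripped for the metadata file. The function below is an illustration of that rule, not Flink's implementation.

```python
# Sketch of Flink's S3 entropy injection: data files get a random
# entropy segment (spreading load across S3 prefixes), while the
# _metadata file's path has the placeholder removed entirely. The
# result: metadata is discoverable, but data is scattered across
# many entropy dirs rather than grouped under one "data" prefix.
import random
import string

ENTROPY_KEY = "_entropy_"  # s3.entropy.key
ENTROPY_LENGTH = 4         # s3.entropy.length

def inject_entropy(path, is_metadata):
    if is_metadata:
        # Metadata keeps a stable, listable path: drop the placeholder.
        return path.replace(ENTROPY_KEY + "/", "")
    # Data files get a random segment in place of the placeholder.
    chars = "".join(
        random.choices(string.ascii_lowercase + string.digits, k=ENTROPY_LENGTH)
    )
    return path.replace(ENTROPY_KEY, chars)

base = "s3://bucket/savepoints/_entropy_/my-job1/chk-42/"
print(inject_entropy(base, True))   # s3://bucket/savepoints/my-job1/chk-42/
print(inject_entropy(base, False))  # random segment where _entropy_ was
```

Each savepoint's data thus lands under a different random prefix, which is the "mess of entropy dirs" the issue mentions; the split-key proposal keeps all data under one predictable prefix instead.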
>  
> I'm happy to take a shot at implementing the solution I suggested if it is an 
> acceptable solution for Flink. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
