dawidwys commented on pull request #17136:
URL: https://github.com/apache/flink/pull/17136#issuecomment-988789338


   Hey @zoltar9264,
   I am afraid this solution does not work. As far as I understand the change, 
when creating the state handles you make the paths relative to the top-level 
directory of all checkpoints. So, for a path like 
`<checkpoints-dir>/<job-id>/chk-42`, you make paths relative to 
`<checkpoints-dir>/<job-id>` (or `<checkpoints-dir>`). On restore, however, we 
resolve all relative paths against the location of the checkpoint's metadata. 
When restoring from `chk-42`, that base would be 
`<checkpoints-dir>/<job-id>/chk-42`, which is not the base the paths were 
written against.
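
   To make the mismatch concrete, here is a minimal sketch in plain Python (not 
Flink code). The directory names mirror the placeholders above, `state-file-1` 
is a made-up file name, and it assumes the metadata file sits inside `chk-42`:

```python
from pathlib import PurePosixPath

# Hypothetical layout; the names are placeholders, not real Flink output.
job_dir = PurePosixPath("/checkpoints-dir/job-id")
chk_dir = job_dir / "chk-42"             # assumed location of chk-42's metadata
state_file = chk_dir / "state-file-1"    # made-up state file name

# Write side, as I understand the change: path made relative to the job dir.
stored = state_file.relative_to(job_dir)

# Restore side: relative paths are resolved against the metadata's directory.
resolved = chk_dir / stored

print(stored)    # chk-42/state-file-1
print(resolved)  # /checkpoints-dir/job-id/chk-42/chk-42/state-file-1
print(resolved == state_file)  # False
```

   The stored path only resolves correctly if both sides agree on the base 
directory.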
   
   There is one more caveat: checkpoints might share files with checkpoints 
from previous runs. So e.g. `checkpoints-2/job-2/chk-43` might depend on files 
in a directory `checkpoints-1/job-1`. Files in the first directory would have 
a different base path than files in the other one.
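
   A rough sketch of why a single base path cannot cover this case, again in 
plain Python rather than Flink code. The file names `new-file` and `old-file` 
and the helper `is_under` are made up for illustration:

```python
from pathlib import PurePosixPath

# Hypothetical files referenced by chk-43; the names are illustrative only.
new_chk = PurePosixPath("/checkpoints-2/job-2/chk-43")
referenced = [
    PurePosixPath("/checkpoints-2/job-2/chk-43/new-file"),
    PurePosixPath("/checkpoints-1/job-1/old-file"),  # shared with an earlier run
]

def is_under(base: PurePosixPath, path: PurePosixPath) -> bool:
    """Return True if `path` can be expressed relative to `base`."""
    try:
        path.relative_to(base)
        return True
    except ValueError:
        return False

# Files that cannot be written relative to the new checkpoint's directory.
outside = [p for p in referenced if not is_under(new_chk, p)]
print(outside)
```

   Any file that ends up in `outside` would need its own base path, and that 
is the part that makes relocation hard to get right.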
   
   IMO, as long as a snapshot is not self-contained, treating it as 
relocatable is very fishy. It is really hard to know which files need to be 
relocated, as they might be scattered across different places.
   
   I think most of these problems could be solved by the incremental 
savepoints we would like to introduce in the near future. Those would be 
self-contained and relocatable.



