pvary commented on issue #1617:
URL: https://github.com/apache/iceberg/issues/1617#issuecomment-878800783


   Thanks for the detailed answer @flyrain! This really helps to have a clear 
understanding of the tasks at hand. 
   
   I would like to share my thoughts about 2 points:
   
   > The hard part is to enable incremental sync-up between them and 
bidirectional replication, which are quite common DR(Disaster recovery) use 
cases.
   
   I think that if we choose the relative path approach, replication becomes 
quite straightforward, since we do not have to handle a mix of different 
paths (source absolute paths for the new metadata files, destination absolute 
paths for the old metadata files). We just need to copy the metadata files for 
one-directional replication. Bidirectional replication is a different 
kettle of fish because of the commit resolution complexity, but I think it is 
also easier since we do not have to care about manifests and manifest lists.
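
   To illustrate why one-directional replication reduces to a plain copy, here 
is a minimal sketch. It assumes metadata entries store paths relative to the 
table root. The entry names, the bucket URIs, and the `resolve` and `replicate` 
helpers are all hypothetical and are not part of Iceberg.

```python
# Hypothetical entries as they might appear in metadata under a
# relative-path scheme: no bucket or root prefix is stored.
RELATIVE_ENTRIES = [
    "metadata/snap-1.avro",
    "data/part-00000.parquet",
]


def resolve(table_root: str, relative_path: str) -> str:
    """Join a table root with a relative path taken from metadata."""
    return table_root.rstrip("/") + "/" + relative_path.lstrip("/")


def replicate(entries, source_root: str, dest_root: str):
    """Build (source, destination) copy pairs.

    The entries themselves are never rewritten; only the root they are
    resolved against differs between the two sites.
    """
    return [(resolve(source_root, e), resolve(dest_root, e)) for e in entries]


if __name__ == "__main__":
    for src, dst in replicate(
        RELATIVE_ENTRIES, "s3://primary/db/tbl", "s3://dr/db/tbl"
    ):
        print(f"copy {src} -> {dst}")
```

   With absolute paths, the same copy would also need every stored path 
rewritten for the destination, which is the mix of old and new paths mentioned 
above.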
   
   > However, the relative-path approach requires the minimal metadata file 
rewrite, probably only metadata.json per our discussion.
   
   What do we want to change in the json? Is it only the path of the table, or 
do we have to rewrite something else as well? Could we use something like the 
LocationProvider (which generates new data file locations) to make the path 
resolution pluggable and store only the config in the table? 
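
   A rough sketch of what I mean by pluggable resolution, assuming the table 
stores only a config key that selects the resolver. The property name 
`path-resolver.impl`, the `location` key, and the resolver functions are 
made up for illustration; this is not an existing Iceberg API.

```python
from typing import Callable, Dict

# A resolver maps a path as stored in metadata to a usable location.
Resolver = Callable[[str], str]


def relative_resolver(props: Dict[str, str]) -> Resolver:
    """Resolve stored paths against the table location from the config."""
    root = props["location"].rstrip("/")
    return lambda stored: f"{root}/{stored.lstrip('/')}"


def absolute_resolver(props: Dict[str, str]) -> Resolver:
    """Return stored paths unchanged (absolute-path behaviour)."""
    return lambda stored: stored


# Hypothetical registry; a real implementation might load a class by name.
RESOLVERS = {
    "relative": relative_resolver,
    "absolute": absolute_resolver,
}


def load_resolver(props: Dict[str, str]) -> Resolver:
    """Pick a resolver from table properties, defaulting to absolute paths."""
    impl = props.get("path-resolver.impl", "absolute")
    return RESOLVERS[impl](props)


if __name__ == "__main__":
    props = {"location": "s3://dr/db/tbl", "path-resolver.impl": "relative"}
    print(load_resolver(props)("data/part-00000.parquet"))
```

   In this sketch, pointing a replica at a different root means changing only 
the `location` value in the config.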
   
   Thanks, Peter 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


