anuragmantri commented on issue #4159:
URL: https://github.com/apache/iceberg/issues/4159#issuecomment-1081241359


   Thanks for this thread. 
   
   > One valuable thing to add in the Iceberg spec is the list (or set?) of all 
the table locations used.
   
   This makes sense, but we should define what this list means. A few 
questions come to mind:
   - Do these locations mean a table can have multiple root locations 
simultaneously? 
   - How does this work when an object store path is used, since users may 
share the same object store bucket across multiple tables?
   - In the case of DR, the table is replicated to a different location. Are 
all of these replicated locations part of this list? 
   
   Incidentally, most of these questions also came up in the [dev discussion](https://lists.apache.org/[email protected]:2021-8:Iceberg%20disaster%20recovery%20and%20relative%20path%20sync-up)
 on relative paths/DR strategy. Supporting multiple roots is mentioned in 
`Phase 2` of [this design doc](https://docs.google.com/document/d/1RDEjJAVEXg1csRzyzTuM634L88vvI0iDHNQQK3kOVR0/edit#heading=h.last3sm2iqbr).
 The implementation is yet to be discussed. It may be a good idea to discuss it here.
   
   Coming back to the problem: assuming we have such a list of owned table 
locations, how would we make sure that all the files in these locations are 
indeed owned by the Iceberg table? In other words, can we safely delete `all` 
files in these locations?

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


