gallushi commented on issue #1655: URL: https://github.com/apache/iceberg/issues/1655#issuecomment-732855802
Hi @rdblue

> can you describe what you're suggesting in a bit more detail? How would you use atomic write instead?

Currently, when using Hadoop tables, `HadoopTableOperations` assumes atomic rename guarantees and, when committing, renames the temporary snapshot object from its temp name to the final snapshot object name. However, atomic write can be used instead: rather than going through a temp object, we can directly (and atomically) write the snapshot object, so that the write succeeds iff the object did not already exist.

> And how would you detect whether to use atomic write or atomic rename?

I think a config at the scheme level makes sense (that way one can control which filesystems use atomic write and which stay with atomic rename), since atomic write and atomic rename are filesystem-level guarantees.

> In general, I would not recommend using a file system for this guarantee. It's better to use a database transaction for the atomic update operation. That's why we want to have support for a variety of catalog plugins in addition to Hive, like JDBC, Nessie, and Glue.

Yes; however, adding support for atomic write is relatively low-hanging fruit, and storage systems such as IBM Cloud Object Storage could then be used even without an external catalog.

While we discuss this, I'll open a PR with the changes I have in mind.
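To make the commit semantics concrete, here is a minimal sketch (not Iceberg's actual code, and the class name, file name, and `commit` helper are hypothetical) of a commit based on atomic create-if-not-exists. It uses the local filesystem via `java.nio.file` with `CREATE_NEW`, which fails atomically if the target already exists, mirroring the "write succeeds iff the object did not exist" guarantee that an object store like IBM Cloud Object Storage would provide:

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class AtomicWriteCommit {

    // Commit succeeds iff no other writer has already created `target`.
    static boolean commit(Path target, byte[] metadata) throws IOException {
        try {
            // CREATE_NEW makes the write fail atomically if the file exists,
            // replacing the temp-file + atomic-rename commit path.
            Files.write(target, metadata, StandardOpenOption.CREATE_NEW);
            return true;
        } catch (FileAlreadyExistsException e) {
            // Another committer won the race; the caller must refresh and retry.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("iceberg-demo");
        Path v2 = dir.resolve("v2.metadata.json");

        System.out.println(commit(v2, "snapshot-a".getBytes()));  // true: first writer wins
        System.out.println(commit(v2, "snapshot-b".getBytes()));  // false: commit conflict
    }
}
```

The key design point is that the conflict check and the write happen in a single filesystem operation, so no rename guarantee is needed; losing writers simply see the create fail and can retry against the next version.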
