gallushi edited a comment on issue #1655:
URL: https://github.com/apache/iceberg/issues/1655#issuecomment-732855802


   Hi @rdblue 
   >  can you describe what you're suggesting in a bit more detail? How would 
you use atomic write instead? 
   
   Currently, when using Hadoop tables, `HadoopTableOperations` assumes atomic 
rename guarantees: when committing, it renames the temp snapshot object from 
its temp name to the snapshot object name.
   However, atomic write can be used instead: rather than writing a temp object 
and renaming it, we can write the snapshot object directly (and atomically). 
The write succeeds iff the object did not already exist.
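   
   To make the write-if-absent idea concrete, here is a minimal sketch. It is 
not the `HadoopTableOperations` code: it uses `java.nio.file` on a local 
directory as a stand-in for the target file system, and the class name 
`AtomicWriteCommit`, the method `tryCommit`, and the object name 
`snapshot-object` are hypothetical. On a local file system only the 
create-if-absent check is atomic; an object store with atomic write would be 
expected to make the whole object appear at once.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class AtomicWriteCommit {

  /**
   * Tries to commit by creating the target object directly.
   * Returns true if this caller created it, false if it already existed
   * (that is, a concurrent commit won). Hypothetical helper for illustration.
   */
  public static boolean tryCommit(Path target, byte[] content) throws IOException {
    // CREATE_NEW fails if the file exists; the existence check and the
    // creation are atomic with respect to other file system operations.
    try (OutputStream out = Files.newOutputStream(
        target, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
      out.write(content);
      return true;
    } catch (FileAlreadyExistsException e) {
      return false;
    }
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("commit-demo");
    Path target = dir.resolve("snapshot-object"); // illustrative name only

    byte[] first = "writer-1".getBytes(StandardCharsets.UTF_8);
    byte[] second = "writer-2".getBytes(StandardCharsets.UTF_8);

    System.out.println(tryCommit(target, first));  // prints "true"
    System.out.println(tryCommit(target, second)); // prints "false"
  }
}
```

   The second caller loses without overwriting anything, which is the same 
outcome a failed rename gives a losing committer today.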
   
   > And how would you detect whether to use atomic write or atomic rename?
   
   I think a scheme-level config makes sense, since atomic write and atomic 
rename are FS-level guarantees. This way one can control which file systems 
use atomic write and which stay with atomic rename.
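   
   As a rough sketch of what a scheme-level switch could look like (the key 
pattern `iceberg.hadoop.<scheme>.atomic-write`, the class 
`CommitStrategyConfig`, and the `cos` example location are all hypothetical, 
not existing Iceberg or Hadoop settings), the strategy can be looked up from 
the scheme of the table location, defaulting to atomic rename:

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

public class CommitStrategyConfig {

  public enum Strategy { ATOMIC_RENAME, ATOMIC_WRITE }

  // Hypothetical key pattern; not an existing configuration property.
  public static String keyFor(String scheme) {
    return "iceberg.hadoop." + scheme + ".atomic-write";
  }

  /** Picks the commit strategy from the location's URI scheme. */
  public static Strategy strategyFor(URI location, Map<String, String> conf) {
    String scheme = location.getScheme();
    if (scheme == null) {
      return Strategy.ATOMIC_RENAME; // no scheme: keep current behavior
    }
    // parseBoolean(null) is false, so unset schemes stay on atomic rename.
    boolean atomicWrite = Boolean.parseBoolean(conf.get(keyFor(scheme)));
    return atomicWrite ? Strategy.ATOMIC_WRITE : Strategy.ATOMIC_RENAME;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(keyFor("cos"), "true"); // opt in for one scheme only

    System.out.println(strategyFor(URI.create("cos://bucket/warehouse/tbl"), conf));
    System.out.println(strategyFor(URI.create("hdfs://nn:8020/warehouse/tbl"), conf));
  }
}
```

   Defaulting to atomic rename keeps existing deployments unchanged unless a 
scheme is explicitly opted in.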
   
   > In general, I would not recommend using a file system for this guarantee. 
It's better to use a database transaction for the atomic update operation. 
That's why we want to have support for a variety of catalog plugins in addition 
to Hive, like JDBC, Nessie, and Glue.
   
   Yes; however, keeping in mind that this is all in the context of Hadoop 
tables, which rely on file system guarantees in any case, adding support for 
atomic write is relatively low-hanging fruit, and storage systems such as IBM 
Cloud Object Storage can then be used even without an external catalog.
   
   While we discuss this, I'll open a PR with the changes I have in mind.

