HeartSaVioR edited a comment on issue #1286:
URL: https://github.com/apache/iceberg/issues/1286#issuecomment-670816890


   From a quick look at the codebase, it looks like the changes are re-applied 
to the current base snapshot on every retry. The manifests/snapshot always seem 
to be rebuilt from the operation itself, even though Iceberg could also 
leverage the delta between the previous base snapshot and the new base snapshot 
when retrying.
   
   We could handle the retry differently when all snapshots committed after the 
base snapshot came from "fast append" operations. 
   (I'm assuming we can trust the "operation" recorded in the snapshot 
information. If we had to read through the manifest list files & manifest files 
instead, that would probably introduce more latency.)
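
   A rough sketch of that check, using a simplified in-memory model rather than 
Iceberg's actual classes. `Snapshot`, `snapshots_after` and 
`only_appends_since` are hypothetical names, and treating an operation value of 
"append" as "fast append" is an assumption on my side:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Snapshot:
    """Simplified stand-in for a table snapshot (hypothetical, not Iceberg's class)."""
    snapshot_id: int
    parent_id: Optional[int]
    operation: str  # value taken from the snapshot summary, e.g. "append"


def snapshots_after(base_id: int, current_id: int,
                    by_id: Dict[int, Snapshot]) -> Optional[List[Snapshot]]:
    """Return the snapshots committed after base_id, oldest first.

    Walks parent links back from current_id. Returns None when base_id is
    not an ancestor of current_id, since the delta is then undefined.
    """
    chain: List[Snapshot] = []
    cursor: Optional[int] = current_id
    while cursor is not None and cursor != base_id:
        snap = by_id.get(cursor)
        if snap is None:
            return None  # broken lineage, give up
        chain.append(snap)
        cursor = snap.parent_id
    if cursor != base_id:
        return None
    chain.reverse()
    return chain


def only_appends_since(base_id: int, current_id: int,
                       by_id: Dict[int, Snapshot]) -> bool:
    """True when every snapshot after base_id reports the "append" operation.

    Only the recorded operation is inspected; no manifest list or manifest
    file is read.
    """
    newer = snapshots_after(base_id, current_id, by_id)
    return newer is not None and all(s.operation == "append" for s in newer)
```

   If this returns False, the retry would fall back to what it does today.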
   
   Probably the cheapest approach would be to allow reordering of snapshots: 
insert the new snapshot between the base snapshot and the snapshot that has the 
base snapshot as its parent. Since we currently seem to add a new snapshot only 
at the tail (append), this is viable only if we are OK with breaking that 
policy.
   
   Alternatively, we can list the manifests written by the snapshots committed 
after the base snapshot, and add only those files to the manifest list file of 
the snapshot created in the previous trial (it would be nice if we could simply 
append; if that is not possible, read and merge into a new file), then write 
the metadata for the modified snapshot and commit.
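
   To illustrate the idea, here is a minimal sketch that treats a manifest list 
as a plain list of manifest paths. `rebase_manifest_list` and the file names 
are hypothetical, the resulting order is arbitrary, and it assumes every 
snapshot after the old base only added manifests:

```python
from typing import List


def rebase_manifest_list(previous_attempt: List[str],
                         old_base: List[str],
                         new_base: List[str]) -> List[str]:
    """Build the manifest list for a retry without redoing the operation.

    previous_attempt: manifests of the snapshot written by the failed trial
                      (old base manifests plus the ones this commit added).
    old_base:         manifests of the base snapshot used by that trial.
    new_base:         manifests of the table's current snapshot.

    Only manifests that appeared between old_base and new_base are added.
    """
    already_listed = set(previous_attempt)
    in_old_base = set(old_base)
    added_since_base = [
        m for m in new_base
        if m not in in_old_base and m not in already_listed
    ]
    # "Read and merge into a new file": keep our entries, then add the delta.
    return list(previous_attempt) + added_since_base


old_base = ["m1.avro", "m2.avro"]
previous_attempt = ["mine.avro", "m1.avro", "m2.avro"]
new_base = ["m4.avro", "m3.avro", "m1.avro", "m2.avro"]
retry_list = rebase_manifest_list(previous_attempt, old_base, new_base)
```

   The new manifest written by the previous trial ("mine.avro" here) is reused 
as is; only the manifest list and the snapshot metadata are rewritten.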
   
   Does this make sense? If so, I'll try it out with a POC.




