szehon-ho commented on pull request #3844:
URL: https://github.com/apache/iceberg/pull/3844#issuecomment-1069766220


   I'm a bit conflicted. It makes sense to have a fast truncate, but I think the 
presence of DELETED entries in the manifests is also used in other places to check 
whether data has been deleted, for example:
   
    1. Checking serializable isolation of concurrent operations (they must fail if 
data they rely on has been deleted)
    2. The CDC design (to mark rows as deleted)
   
   If we truncate this way from Spark/Flink, then any system relying on those 
entries won't work. Is that a concern? Or is it more like a drop-table operation, 
where we no longer care about the table? cc @aokolnychyi 
   
   The other thought is that we can achieve the same result with 
DeleteFiles.deleteFromRowFilter(Expressions.alwaysTrue()). It is a bit slower 
because it has to read each manifest file, but it is still faster than having to 
read the data files. Not sure what others think.
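   
   For illustration, a minimal sketch of that alternative against the Java API 
(the `truncateViaDelete` helper name and the `table` handle are just for the 
example, assuming the table was loaded from a catalog elsewhere):
   
   ```java
   import org.apache.iceberg.Table;
   import org.apache.iceberg.expressions.Expressions;
   
   // Metadata-only "truncate": mark every data file as deleted by using a row
   // filter that matches all rows, so only manifests are read/rewritten, not
   // the data files themselves.
   static void truncateViaDelete(Table table) {
       table.newDelete()
           .deleteFromRowFilter(Expressions.alwaysTrue())
           .commit();
   }
   ```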


