amogh-jahagirdar commented on code in PR #8525:
URL: https://github.com/apache/iceberg/pull/8525#discussion_r1330548802


##########
api/src/main/java/org/apache/iceberg/DeleteFiles.java:
##########
@@ -81,4 +81,8 @@ default DeleteFiles deleteFile(DataFile file) {
    * @return this for method chaining
    */
   DeleteFiles caseSensitive(boolean caseSensitive);
+
+  DeleteFiles validateFilesExist(boolean validateFilesToDeleteExist);
+
+  DeleteFiles validateFromSnapshot(long snapshot);

Review Comment:
   Oh I maybe see what you're getting at, do you mean that it's sufficient to 
simply just check the file existence (just a fileIO existence check) instead of 
leveraging the validateDataFilesExist method? Sounds reasonable. I was 
originally thinking we'd want to go through the manifests, because those would 
be legitimate operations on Iceberg tables. As opposed to someone not going 
through Iceberg and removing files and corrupting their table through some 
other process.
   
   In the end file existence is the only thing we care about in terms of 
validation for the delete to succeed. I *think* going through the manifests via 
validateDataFilesExist would be more efficient for larger groups of files we 
would want to validate. We end up going through entries in manifests vs 
multiple HEAD requests for existence. Maybe this doesn't matter in practice 
though and it's better just to do the simple existence check



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to