amogh-jahagirdar commented on code in PR #8525:
URL: https://github.com/apache/iceberg/pull/8525#discussion_r1330548802
##########
api/src/main/java/org/apache/iceberg/DeleteFiles.java:
##########
@@ -81,4 +81,8 @@ default DeleteFiles deleteFile(DataFile file) {
* @return this for method chaining
*/
DeleteFiles caseSensitive(boolean caseSensitive);
+
+ DeleteFiles validateFilesExist(boolean validateFilesToDeleteExist);
+
+ DeleteFiles validateFromSnapshot(long snapshot);
Review Comment:
Oh I maybe see what you're getting at, do you mean that it's sufficient to
simply just check the file existence (just a fileIO existence check) instead of
leveraging the validateDataFilesExist method? Sounds reasonable. I was
originally thinking we'd want to go through the manifests, because those would
be legitimate operations on Iceberg tables. As opposed to someone not going
through Iceberg and removing files and corrupting their table through some
other process.
In the end file existence is the only thing we care about in terms of
validation for the delete to succeed. I *think* going through the manifests via
validateDataFilesExist would be more efficient for larger groups of files we
would want to validate. We end up going through entries in manifests vs
multiple HEAD requests for existence. Maybe this doesn't matter in practice
though and it's better just to do the simple existence check
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]