openinx commented on a change in pull request #2294:
URL: https://github.com/apache/iceberg/pull/2294#discussion_r596629785
##########
File path: core/src/main/java/org/apache/iceberg/BaseRewriteFiles.java
##########
@@ -40,19 +40,55 @@ protected String operation() {
return DataOperations.REPLACE;
}
+ private void verifyInputAndOutputFiles(Set<DataFile> dataFilesToDelete,
Set<DeleteFile> deleteFilesToDelete,
+ Set<DataFile> dataFilesToAdd,
Set<DeleteFile> deleteFilesToAdd) {
+ int filesToDelete = 0;
+ if (dataFilesToDelete != null) {
+ filesToDelete += dataFilesToDelete.size();
+ }
+
+ if (deleteFilesToDelete != null) {
+ filesToDelete += deleteFilesToDelete.size();
+ }
+
+ Preconditions.checkArgument(filesToDelete > 0, "Files to delete cannot be
null or empty");
+
+ if (deleteFilesToDelete == null || deleteFilesToDelete.isEmpty()) {
+ // When there is no delete files in the rewrite action, data files to
add cannot be null or empty.
+ Preconditions.checkArgument(dataFilesToAdd != null &&
dataFilesToAdd.size() > 0,
+ "Data files to add can not be null or empty because there's no
delete file to rewrite");
+ Preconditions.checkArgument(deleteFilesToAdd == null ||
deleteFilesToAdd.isEmpty(),
+ "Delete files to add must be null or empty because there's no delete
file to rewrite");
+ }
+ }
+
@Override
- public RewriteFiles rewriteFiles(Set<DataFile> filesToDelete, Set<DataFile>
filesToAdd) {
- Preconditions.checkArgument(filesToDelete != null &&
!filesToDelete.isEmpty(),
- "Files to delete cannot be null or empty");
- Preconditions.checkArgument(filesToAdd != null && !filesToAdd.isEmpty(),
Review comment:
> Looks like we are changing the logic now to not enforce input sets to
be non-nullable?
Yes, in current version we are required to pass non-empty and non-null
`Set` because we must ensure the input files and output files have some rows to
rewrite & generate. After introducing the format v2, the data input files can
be empty (if we plan to convert eq-deletes to pos-deletes), then I think we
also don't have to require the user to pass a non-null empty set. Here passing
a `null` or `ImmutableMap.of()` , both of them looks good to me.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]