rdblue commented on a change in pull request #3369:
URL: https://github.com/apache/iceberg/pull/3369#discussion_r735789280



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -212,33 +220,44 @@ private String getRowLevelOperationMode(String operation) {
 
   @Override
   public boolean canDeleteWhere(Filter[] filters) {
-    if (table().specs().size() > 1) {
-      // cannot guarantee a metadata delete will be successful if we have multiple specs
-      return false;
-    }
-
-    Set<Integer> identitySourceIds = table().spec().identitySourceIds();
-    Schema schema = table().schema();
+    Expression deleteExpr = Expressions.alwaysTrue();
 
     for (Filter filter : filters) {
-      // return false if the filter requires rewrite or if we cannot translate the filter
-      if (requiresRewrite(filter, schema, identitySourceIds) || SparkFilters.convert(filter) == null) {
+      Expression expr = SparkFilters.convert(filter);
+      if (expr != null) {
+        deleteExpr = Expressions.and(deleteExpr, expr);
+      } else {
         return false;
       }
     }
 
-    return true;
+    return deleteExpr == Expressions.alwaysTrue() || canDeleteUsingMetadata(deleteExpr);
   }
 
-  private boolean requiresRewrite(Filter filter, Schema schema, Set<Integer> identitySourceIds) {
-    // TODO: handle dots correctly via v2references
-    // TODO: detect more cases that don't require rewrites
-    Set<String> filterRefs = Sets.newHashSet(filter.references());
-    return filterRefs.stream().anyMatch(ref -> {
-      Types.NestedField field = schema.findField(ref);
-      ValidationException.check(field != null, "Cannot find field %s in schema", ref);
-      return !identitySourceIds.contains(field.fieldId());
-    });
+  // a metadata delete is possible iff matching files can be deleted entirely
+  private boolean canDeleteUsingMetadata(Expression deleteExpr) {

Review comment:
       Yeah, we would need the expressions to be equal for each spec that has a 
matching manifest. So we could filter manifests using the manifest list, then 
get the specs and do the projection. That way you could take advantage of some 
partition filtering to eliminate specs.
   
   I actually like what you have here. It shouldn't be a big problem. Let's see 
how it goes with this and we can always introduce a lighter-weight version 
later. Luckily, this should fail fast.
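
   To make the lighter-weight idea concrete, here is a rough sketch of how I 
read it. It is a toy model in plain Java, not the Iceberg API and not the code 
in this PR: `Spec`, `Manifest`, `inclusiveUpper`, and `strictUpper` are all 
hypothetical stand-ins. It assumes a single int column `ts`, specs that 
partition by `floorDiv(ts, width)`, and a delete filter of the form 
`ts < bound`. I am also assuming that "expressions to be equal" means the 
strict and inclusive projections of the filter agree for a spec.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

final class MetadataDeleteSketch {

  // Hypothetical stand-in for a partition spec: partition = floorDiv(ts, width).
  static final class Spec {
    final int specId;
    final int width;

    Spec(int specId, int width) {
      this.specId = specId;
      this.width = width;
    }

    // Inclusive projection of "ts < bound": largest partition that MAY hold a match.
    int inclusiveUpper(int bound) {
      return Math.floorDiv(bound - 1, width);
    }

    // Strict projection of "ts < bound": largest partition where ALL rows match.
    int strictUpper(int bound) {
      return Math.floorDiv(bound, width) - 1;
    }
  }

  // Hypothetical stand-in for a manifest list entry with a partition summary.
  static final class Manifest {
    final int specId;
    final int minPartition; // lower bound of partition values in this manifest

    Manifest(int specId, int minPartition) {
      this.specId = specId;
      this.minPartition = minPartition;
    }
  }

  static boolean canDeleteUsingMetadata(
      int bound, Map<Integer, Spec> specs, List<Manifest> manifests) {
    // Step 1: use the manifest summaries to drop manifests that cannot match,
    // which also eliminates specs that only appear in those manifests.
    Set<Integer> matchingSpecIds = manifests.stream()
        .filter(m -> m.minPartition <= specs.get(m.specId).inclusiveUpper(bound))
        .map(m -> m.specId)
        .collect(Collectors.toSet());

    // Step 2: for every remaining spec, the strict and inclusive projections
    // must agree, so any file that may match is matched entirely.
    return matchingSpecIds.stream()
        .map(specs::get)
        .allMatch(s -> s.strictUpper(bound) == s.inclusiveUpper(bound));
  }

  public static void main(String[] args) {
    Map<Integer, Spec> specs = Map.of(1, new Spec(1, 10), 2, new Spec(2, 7));
    List<Manifest> manifests = List.of(new Manifest(1, 0), new Manifest(2, 5));

    System.out.println(canDeleteUsingMetadata(20, specs, manifests)); // true
    System.out.println(canDeleteUsingMetadata(25, specs, manifests)); // false
    System.out.println(canDeleteUsingMetadata(40, specs, manifests)); // false
  }
}
```

   With `bound = 20` the spec 2 manifest is pruned by its summary, so only spec 
1 is checked, and 20 falls on a partition boundary there. With `bound = 40` the 
spec 2 manifest may match and 40 is not a multiple of 7, so the check fails. 
In Iceberg itself the projections would presumably come from 
`Projections.inclusive(spec)` and `Projections.strict(spec)`, and comparing the 
resulting expressions for equality is more involved than comparing two ints.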




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


