abhishekrb19 commented on code in PR #18953:
URL: https://github.com/apache/druid/pull/18953#discussion_r2740090401


##########
extensions-contrib/druid-iceberg-extensions/src/main/java/org/apache/druid/iceberg/input/IcebergCatalog.java:
##########
@@ -100,12 +107,44 @@ public List<String> extractSnapshotDataFiles(
 
       tableScan = tableScan.caseSensitive(isCaseSensitive());
       CloseableIterable<FileScanTask> tasks = tableScan.planFiles();
-      CloseableIterable.transform(tasks, FileScanTask::file)
-                       .forEach(dataFile -> 
dataFilePaths.add(dataFile.path().toString()));
+
+      Expression detectedResidual = null;
+      for (FileScanTask task : tasks) {
+        dataFilePaths.add(task.file().path().toString());
+
+        // Check for residual filters
+        if (detectedResidual == null) {
+          Expression residual = task.residual();
+          if (residual != null && !residual.equals(Expressions.alwaysTrue())) {
+            detectedResidual = residual;
+          }
+        }
+      }
+
+      // Handle residual filter based on mode
+      if (detectedResidual != null) {
+        String message = StringUtils.format(
+            "Iceberg filter produced residual expression that requires 
row-level filtering. "
+            + "This typically means the filter is on a non-partition column. "
+            + "Residual rows may be ingested unless filtered by transformSpec. 
"
+            + "Residual filter: [%s]",
+            detectedResidual
+        );
+
+        if (residualFilterMode == ResidualFilterMode.FAIL) {
+          throw DruidException.forPersona(DruidException.Persona.DEVELOPER)

Review Comment:
   Should this be `Admin` persona rather than `Developer`? (developer is for 
unexpected things / bugs)



##########
extensions-contrib/druid-iceberg-extensions/src/test/java/org/apache/druid/iceberg/input/IcebergInputSourceTest.java:
##########
@@ -215,6 +222,104 @@ public void testCaseInsensitiveFiltering() throws 
IOException
     Assert.assertEquals(1, localInputSourceList.size());
   }
 
+  @Test
+  public void testResidualFilterModeIgnore() throws IOException
+  {
+    // Filter on non-partition column with IGNORE mode should succeed
+    IcebergInputSource inputSource = new IcebergInputSource(
+        TABLENAME,
+        NAMESPACE,
+        new IcebergEqualsFilter("id", "123988"),
+        testCatalog,
+        new LocalInputSourceFactory(),
+        null,
+        ResidualFilterMode.IGNORE
+    );
+    Stream<InputSplit<List<String>>> splits = inputSource.createSplits(null, 
new MaxSizeSplitHintSpec(null, null));
+    Assert.assertEquals(1, splits.count());
+  }
+
+  @Test
+  public void testResidualFilterModeFail() throws IOException
+  {
+    // Filter on non-partition column with FAIL mode should throw exception
+    IcebergInputSource inputSource = new IcebergInputSource(
+        TABLENAME,
+        NAMESPACE,
+        new IcebergEqualsFilter("id", "123988"),
+        testCatalog,
+        new LocalInputSourceFactory(),
+        null,
+        ResidualFilterMode.FAIL
+    );
+    DruidException exception = Assert.assertThrows(
+        DruidException.class,
+        () -> inputSource.createSplits(null, new MaxSizeSplitHintSpec(null, 
null))
+    );
+    Assert.assertTrue(
+        "Expect residual error to be thrown",
+        exception.getMessage().contains("residual")
+    );
+  }
+
+  @Test
+  public void testResidualFilterModeFailWithPartitionedTable() throws 
IOException
+  {
+    // Create a partitioned table and filter on the partition column
+    TableIdentifier partitionedTableId = 
TableIdentifier.of(Namespace.of(NAMESPACE), "partitionedTable");

Review Comment:
   Could assign this to `tableIdentifier` and I think we can remove the try - 
finally block along with cleanup `dropTableFromCatalog(tableIdentifier)` since 
it's already part of  `teardown()`?



##########
extensions-contrib/druid-iceberg-extensions/src/main/java/org/apache/druid/iceberg/input/IcebergInputSource.java:
##########
@@ -177,6 +183,13 @@ public DateTime getSnapshotTime()
     return snapshotTime;
   }
 
+  @Nullable

Review Comment:
   no longer Nullable:
   ```suggestion
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to