Re: [PR] Allow failing on residual for Iceberg filters on non-partition cols (druid)

via GitHub Tue, 27 Jan 2026 19:30:12 -0800


jtuglu1 commented on code in PR #18953:
URL: https://github.com/apache/druid/pull/18953#discussion_r2734702058



##########
docs/ingestion/input-sources.md:
##########
@@ -1063,6 +1063,7 @@ The following is a sample spec for a S3 warehouse source:
 |icebergCatalog|The JSON Object used to define the catalog that manages the configured Iceberg table.|yes|
 |warehouseSource|The JSON Object that defines the native input source for reading the data files from the warehouse.|yes|
 |snapshotTime|Timestamp in ISO8601 DateTime format that will be used to fetch the most recent snapshot as of this time.|no|
+|residualFilterMode|Controls how residual filters are handled when filtering on non-partition columns. When an Iceberg filter targets a non-partition column, files may contain rows that don't match the filter (residual rows). Valid values are: `ignore` (default, ingest all rows), `warn` (log a warning but continue), `fail` (fail the ingestion job). Use `fail` to ensure filters only target partition columns.|no|

Review Comment:
   Sure – I think this is already clear in the iceberg.md changes:
   
   ```
   When an Iceberg filter is applied on a non-partition column, the filtering happens at the file metadata level only (using column statistics). Files that might contain matching rows are returned, but these files may include "residual" rows that don't actually match the filter. These residual rows would be ingested unless filtered by a `transformSpec` filter on the Druid side.
   
   To control this behavior, you can set the `residualFilterMode` property on the Iceberg input source:
   ```
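   
   For illustration, here is a rough sketch of where the property might sit in an input source spec. It assumes `residualFilterMode` is a top-level field of the Iceberg input source, as the table row above suggests. The table name, namespace, catalog paths, and filter column are placeholders and are not taken from this PR:
   
   ```json
   {
     "type": "iceberg",
     "tableName": "example_table",
     "namespace": "example_namespace",
     "icebergCatalog": {
       "type": "hive",
       "warehousePath": "s3://example-bucket/warehouse",
       "catalogUri": "thrift://example-metastore:9083"
     },
     "icebergFilter": {
       "type": "equals",
       "filterColumn": "example_non_partition_column",
       "filterValue": "example_value"
     },
     "warehouseSource": {
       "type": "s3"
     },
     "residualFilterMode": "fail"
   }
   ```
   
   With `fail`, a filter on a non-partition column like the one above would presumably stop the ingestion job rather than let residual rows through. With `ignore` or `warn`, those rows would still need a `transformSpec` filter on the Druid side to be dropped.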



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

