Re: [PR] Allow failing on residual for Iceberg filters on non-partition cols (druid)

via GitHub Tue, 27 Jan 2026 19:02:01 -0800


abhishekrb19 commented on code in PR #18953:
URL: https://github.com/apache/druid/pull/18953#discussion_r2734649261



##########
docs/ingestion/input-sources.md:
##########
@@ -1063,6 +1063,7 @@ The following is a sample spec for a S3 warehouse source:
 |icebergCatalog|The JSON Object used to define the catalog that manages the 
configured Iceberg table.|yes|
 |warehouseSource|The JSON Object that defines the native input source for 
reading the data files from the warehouse.|yes|
 |snapshotTime|Timestamp in ISO8601 DateTime format that will be used to fetch 
the most recent snapshot as of this time.|no|
+|residualFilterMode|Controls how residual filters are handled when filtering 
on non-partition columns. When an Iceberg filter targets a non-partition 
column, files may contain rows that don't match the filter (residual rows). 
Valid values are: `ignore` (default, ingest all rows), `warn` (log a warning 
but continue), `fail` (fail the ingestion job). Use `fail` to ensure filters 
only target partition columns.|no|

Review Comment:
   FWIW, Delta Lake filtering behaves the same way: filter predicates are 
pushed down to partition columns, while filtering on non-partition columns is 
best-effort.
   
   ---
   
   It might also make sense to update the `icebergFilter` docs to clarify how 
filtering on partition columns differs from filtering on non-partition 
columns, and perhaps point to the new `residualFilterMode` property.
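
To make the three documented modes concrete, here is a minimal Python sketch of the semantics described in the table row above. It is not Druid's implementation: `ResidualFilterMode`, `ResidualFilterError`, and `handle_residual` are hypothetical names, and the `has_residual` flag stands in for whatever check the extension actually performs.

```python
import logging
from enum import Enum

logger = logging.getLogger("residual_filter_sketch")


class ResidualFilterMode(Enum):
    """The three values listed for `residualFilterMode` in the docs diff."""
    IGNORE = "ignore"
    WARN = "warn"
    FAIL = "fail"


class ResidualFilterError(RuntimeError):
    """Hypothetical error type raised when mode is `fail`."""


def handle_residual(mode: ResidualFilterMode, has_residual: bool) -> bool:
    """Return True if ingestion should proceed.

    `has_residual` is an assumed input: True when the filter targets a
    non-partition column, so selected files may hold non-matching rows.
    """
    if not has_residual or mode is ResidualFilterMode.IGNORE:
        # No residual, or the default mode: ingest all rows silently.
        return True

    message = "Filter targets a non-partition column; residual rows may be ingested"
    if mode is ResidualFilterMode.WARN:
        # Log a warning but continue with ingestion.
        logger.warning(message)
        return True

    # Remaining case is FAIL: abort the ingestion job.
    raise ResidualFilterError(message)
```

With this sketch, `handle_residual(ResidualFilterMode("fail"), True)` raises, while the same call with `"warn"` or `"ignore"` returns `True`, which matches the behavior the new table row describes.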



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
