abhishekrb19 commented on code in PR #16288:
URL: https://github.com/apache/druid/pull/16288#discussion_r1571388932


##########
docs/ingestion/input-sources.md:
##########
@@ -1141,7 +1141,85 @@ To use the Delta Lake input source, load the extension [`druid-deltalake-extensi
 You can use the Delta input source to read data stored in a Delta Lake table. For a given table, the input source scans the latest snapshot from the configured table. Druid ingests the underlying delta files from the table.
 
-The following is a sample spec:
+|Property|Description|Required|
+|---------|-----------|--------|
+|`type`|Set this value to `delta`.|yes|
+|`tablePath`|The location of the Delta table.|yes|
+|`filter`|The JSON object that filters data files within a snapshot.|no|
+
+### Delta filter object
+
+You can use these filters to prune data files from a snapshot, reducing the number of files Druid has to ingest from a Delta table. This input source supports the following filters: `and`, `or`, `not`, `=`, `>`, `>=`, `<`, `<=`.
+
+When a filter is applied to non-partitioned columns, the filtering is best-effort: the Delta Kernel relies solely on statistics collected when the non-partitioned table is created, so the Druid connector may ingest data that doesn't match the filter. For guaranteed filtering behavior, apply filters only to partitioned columns.
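
As an illustration of the properties in the table above, here is a minimal sketch of a `delta` input source with a compound filter. The `tablePath` value and the column name `age` are hypothetical, and the exact shape of the filter object (`column`, `value`, and nested `filters` fields) is an assumption for this sketch rather than a confirmed schema:

```json
{
  "type": "delta",
  "tablePath": "/path/to/delta-table",
  "filter": {
    "type": "and",
    "filters": [
      { "type": ">=", "column": "age", "value": "20" },
      { "type": "<", "column": "age", "value": "30" }
    ]
  }
}
```

Because `age` is assumed to be a partition column here, the filter would deterministically exclude non-matching data files; on a non-partitioned column the same filter would only be best-effort, as described above.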

Review Comment:
   Yes, reads much better. Thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
