[GitHub] [arrow-datafusion] avantgardnerio commented on pull request #5362: Add index interface method

via GitHub Thu, 23 Feb 2023 13:19:17 -0800


avantgardnerio commented on PR #5362:
URL: 
https://github.com/apache/arrow-datafusion/pull/5362#issuecomment-1442445977


   > Is the idea here that we bubble information about global sort indexes up 
to the logical TableProvider so we avoid splitting those two predicates when 
pushing down to the scan?
   
   Yes, I was hoping this API would provide enough info for the planner to see 
that there is no index on `f_name` or `[f_name, l_name]`, but that there is one 
on `[l_name, f_name]`. It could then call `supports_filter_pushdown()` on the 
latter and, if it gets `Exact`, use that index rather than always breaking the 
predicate down into individual BinaryExpressions.
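
A minimal sketch of that selection step, under stated assumptions: `Index`, `PushDown`, `supports`, and `pick_exact_index` are hypothetical names invented for illustration and are not the interface proposed in this PR. `PushDown` only mirrors the three answers (`Unsupported`, `Inexact`, `Exact`) a filter-pushdown check can give. The sketch treats the predicate as a conjunction of equality filters and checks it as a whole against each index's leading columns.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the answer of a filter-pushdown check.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PushDown {
    Unsupported,
    Inexact,
    Exact,
}

/// Hypothetical index description: an ordered list of distinct key columns.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Index {
    pub columns: Vec<String>,
}

impl Index {
    pub fn new(columns: &[&str]) -> Self {
        Index {
            columns: columns.iter().map(|c| c.to_string()).collect(),
        }
    }

    /// How well this index serves a conjunction of equality predicates
    /// on `eq_columns` (the order of the predicates does not matter).
    pub fn supports(&self, eq_columns: &[&str]) -> PushDown {
        let wanted: HashSet<&str> = eq_columns.iter().copied().collect();
        if wanted.is_empty() {
            return PushDown::Unsupported;
        }
        // Number of leading index columns constrained by the predicate.
        let prefix_len = self
            .columns
            .iter()
            .take_while(|c| wanted.contains(c.as_str()))
            .count();
        if prefix_len == wanted.len() {
            // Every predicate column sits in the leading prefix of the index.
            PushDown::Exact
        } else if prefix_len > 0 {
            // The index narrows the scan, but a residual filter is still needed.
            PushDown::Inexact
        } else {
            PushDown::Unsupported
        }
    }
}

/// Return the first index that can answer the whole conjunction exactly.
pub fn pick_exact_index<'a>(indexes: &'a [Index], eq_columns: &[&str]) -> Option<&'a Index> {
    indexes
        .iter()
        .find(|idx| idx.supports(eq_columns) == PushDown::Exact)
}
```

With a single index on `[l_name, f_name]`, the conjunction on `f_name` and `l_name` resolves to that index. A lone predicate on `f_name` finds nothing, which is why splitting the conjunction before asking would lose the match.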
   
   Additionally, later, my hope was that the planner could realize:
   
   1. There is an orders table with a PK `(merch_id, order_id)`
   2. There is a line items table with keys `(merch_id, order_id, 
lineitem_id)` and `(merch_id, order_id)`
   3. It can sort-merge-join the (pre-sorted) tables as long as it 
picks the indexes on `(merch_id, order_id)`
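
To illustrate why matching sort orders matter, here is a rough sketch of a merge join over two inputs that are already sorted on `(merch_id, order_id)`. The `Order` and `LineItem` structs and the `merge_join` function are hypothetical and use plain slices, not DataFusion's execution plans. The sketch relies on the orders key being unique, as the PK in item 1 implies.

```rust
use std::cmp::Ordering;

/// Hypothetical row of the orders table, PK `(merch_id, order_id)`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Order {
    pub merch_id: u32,
    pub order_id: u32,
}

/// Hypothetical row of the line items table.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LineItem {
    pub merch_id: u32,
    pub order_id: u32,
    pub lineitem_id: u32,
}

/// Join two inputs that are both sorted on `(merch_id, order_id)`.
/// Returns `(order index, line item index)` pairs. No sort step is needed
/// because the chosen indexes already deliver rows in key order.
pub fn merge_join(orders: &[Order], items: &[LineItem]) -> Vec<(usize, usize)> {
    let mut out = Vec::new();
    let (mut i, mut j) = (0, 0);
    while i < orders.len() && j < items.len() {
        let order_key = (orders[i].merch_id, orders[i].order_id);
        let item_key = (items[j].merch_id, items[j].order_id);
        match order_key.cmp(&item_key) {
            // The order has no (more) line items: advance the orders side.
            Ordering::Less => i += 1,
            // The line item has no matching order: skip it.
            Ordering::Greater => j += 1,
            // Match. The orders key is unique, so only the line item
            // cursor advances; the same order may match the next item too.
            Ordering::Equal => {
                out.push((i, j));
                j += 1;
            }
        }
    }
    out
}
```

Each input is read once, front to back, so the cost is linear in the two table sizes. If either side were sorted on a different key, a sort would have to be inserted first.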


