GitHub user PierreZ added a comment to the discussion: Indexing Support in 
DataFusion?

Hey everyone! 👋 

Quick update: I've finally completed the initial implementation of the index 
provider we discussed: 
https://github.com/datafusion-contrib/datafusion-index-provider/pull/2

It implements the "Option 2" approach (APIs to pass additional knowledge about 
indexes) that @alamb mentioned above. The crate provides:
- Index-based query acceleration for `TableProvider` implementations
- Automatic handling of complex predicates (AND/OR/multiple indexes)
- Clean trait-based API (`Index`, `RecordFetcher`, `IndexedTableProvider`)
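
To make the AND/OR handling more concrete, here is a minimal sketch of the general idea: each leaf predicate is resolved against an index to a set of row ids, AND becomes a set intersection, and OR becomes a set union. Everything in it (`RowId`, `Predicate`, `ToyIndexes`, `eq`) is a hypothetical name made up for illustration. It is not the crate's actual `Index`, `RecordFetcher`, or `IndexedTableProvider` API, and the real implementation may combine index results quite differently.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical row identifier, used only for this illustration.
type RowId = u64;

/// Hypothetical predicate tree over indexed columns.
enum Predicate {
    /// Equality lookup on one indexed column: `column = value`.
    Eq { column: String, value: String },
    /// Both sides must match.
    And(Box<Predicate>, Box<Predicate>),
    /// Either side may match.
    Or(Box<Predicate>, Box<Predicate>),
}

/// Convenience constructor for an equality predicate.
fn eq(column: &str, value: &str) -> Predicate {
    Predicate::Eq {
        column: column.to_string(),
        value: value.to_string(),
    }
}

/// Toy in-memory stand-in for a set of indexes:
/// column name -> indexed value -> ids of the rows holding that value.
struct ToyIndexes {
    by_column: HashMap<String, HashMap<String, BTreeSet<RowId>>>,
}

impl ToyIndexes {
    fn new() -> Self {
        ToyIndexes {
            by_column: HashMap::new(),
        }
    }

    /// Record that `row` has `value` in `column`.
    fn insert(&mut self, column: &str, value: &str, row: RowId) {
        self.by_column
            .entry(column.to_string())
            .or_default()
            .entry(value.to_string())
            .or_default()
            .insert(row);
    }

    /// Row ids matching `column = value`; empty if nothing is indexed.
    fn lookup(&self, column: &str, value: &str) -> BTreeSet<RowId> {
        self.by_column
            .get(column)
            .and_then(|values| values.get(value))
            .cloned()
            .unwrap_or_default()
    }

    /// Resolve a predicate tree to row ids: AND intersects, OR unions.
    fn resolve(&self, predicate: &Predicate) -> BTreeSet<RowId> {
        match predicate {
            Predicate::Eq { column, value } => self.lookup(column, value),
            Predicate::And(left, right) => {
                let (l, r) = (self.resolve(left), self.resolve(right));
                l.intersection(&r).copied().collect()
            }
            Predicate::Or(left, right) => {
                let (l, r) = (self.resolve(left), self.resolve(right));
                l.union(&r).copied().collect()
            }
        }
    }
}
```

In this sketch, a filter such as `city = 'Paris' AND status = 'active'` would be built as `Predicate::And(Box::new(eq("city", "Paris")), Box::new(eq("status", "active")))` and passed to `resolve`. The resulting row ids are what a fetch step would then load from the underlying store.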

This has been running at my company for a few months without issues on top of 
FoundationDB. The design is somewhat oriented toward small queries and low data 
volumes because of FoundationDB's 5-second transaction timeout and 10 MB 
transaction size limit. That said, I'd love feedback, especially on whether the 
approach makes sense for larger-scale scenarios. I don't work with query 
planners often, and there are probably better ways to structure some of this.

**Since this is landing in the datafusion-contrib organization:**
- Who would be the right person(s) to review this PR?
- What are the general contribution/review guidelines for datafusion-contrib 
repos?

GitHub link: 
https://github.com/apache/datafusion/discussions/9963#discussioncomment-15011862
