Andy Grove created ARROW-10781:
----------------------------------

    Summary: [Rust] [DataFusion] TableProvider should provide row count statistics
    Key: ARROW-10781
    URL: https://issues.apache.org/jira/browse/ARROW-10781
    Project: Apache Arrow
    Issue Type: New Feature
    Components: Rust - DataFusion
    Reporter: Andy Grove

In order to start building a cost-based optimizer, we need some statistics about data sources. The most basic statistic is the number of rows.

I propose that we add a Statistics struct that initially exposes only a total row count, but that we can later extend to support more advanced statistics.

{code:java}
struct Statistics {
    row_count: Option<usize>,
}
{code}

We can then add a method to TableProvider:

{code:java}
trait TableProvider {
    fn statistics(&self) -> Option<Statistics>;
}
{code}

Statistics should be optional because not all data sources can provide statistics.
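
As a rough illustration of how an optimizer might consume this, here is a minimal self-contained sketch. The TableProvider trait below is a simplified stand-in that shows only the proposed method, and InMemoryTable, StreamingTable, known_row_count, and left_is_smaller are hypothetical names made up for this example, not existing DataFusion code.

```rust
/// Statistics that a data source may be able to report.
#[derive(Debug, Clone, PartialEq)]
struct Statistics {
    /// Total number of rows, if the source knows it.
    row_count: Option<usize>,
}

/// Simplified stand-in for the TableProvider trait (assumption: only the
/// proposed method is shown; the real trait has other methods).
trait TableProvider {
    fn statistics(&self) -> Option<Statistics>;
}

/// Hypothetical in-memory source that always knows its row count.
struct InMemoryTable {
    rows: Vec<i64>,
}

impl TableProvider for InMemoryTable {
    fn statistics(&self) -> Option<Statistics> {
        Some(Statistics {
            row_count: Some(self.rows.len()),
        })
    }
}

/// Hypothetical source that cannot provide any statistics.
struct StreamingTable;

impl TableProvider for StreamingTable {
    fn statistics(&self) -> Option<Statistics> {
        None
    }
}

/// Flattens the two levels of optionality: no statistics at all, or
/// statistics without a row count, both yield None.
fn known_row_count(table: &dyn TableProvider) -> Option<usize> {
    table.statistics().and_then(|s| s.row_count)
}

/// Hypothetical optimizer helper: reports whether the left input has no more
/// rows than the right one, or None when either row count is unknown.
fn left_is_smaller(left: &dyn TableProvider, right: &dyn TableProvider) -> Option<bool> {
    match (known_row_count(left), known_row_count(right)) {
        (Some(l), Some(r)) => Some(l <= r),
        _ => None,
    }
}
```

A cost-based rule could call left_is_smaller to decide, for example, which join input to treat as the smaller side, and fall back to the existing behaviour whenever it returns None.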