Andy Grove created ARROW-10781:
----------------------------------
Summary: [Rust] [DataFusion] TableProvider should provide row count statistics
Key: ARROW-10781
URL: https://issues.apache.org/jira/browse/ARROW-10781
Project: Apache Arrow
Issue Type: New Feature
Components: Rust - DataFusion
Reporter: Andy Grove
To start building a cost-based optimizer, we need some statistics
about data sources. The most basic statistic would be the number of rows.
I propose that we add a Statistics struct that initially just makes a total row
count available but that we can later extend to support more advanced
statistics.
{code:java}
struct Statistics {
    row_count: Option<usize>,
}
{code}
We can then add a method to TableProvider:
{code:java}
trait TableProvider {
    // Takes &self so each provider instance reports its own statistics
    fn statistics(&self) -> Option<Statistics>;
}
{code}
Statistics should be optional because not all data sources can provide
statistics.
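To illustrate how the two levels of optionality might be consumed, here is a minimal sketch, not the actual DataFusion implementation. Only Statistics, row_count, TableProvider and statistics come from the proposal above. MemTable, UnscannedFile, known_row_count and build_on_left are hypothetical names made up for this example, and the trait is reduced to the single proposed method, taking &self so that it can be called on a trait object.

```rust
// Sketch only: MemTable, UnscannedFile, known_row_count and build_on_left
// are hypothetical names for illustration, not DataFusion APIs.

#[derive(Debug, Clone, PartialEq)]
struct Statistics {
    /// Total number of rows, if the source knows it.
    row_count: Option<usize>,
}

trait TableProvider {
    /// `None` means the source cannot provide any statistics at all.
    fn statistics(&self) -> Option<Statistics>;
}

/// In-memory table: the row count is known exactly.
struct MemTable {
    rows: Vec<Vec<i64>>,
}

impl TableProvider for MemTable {
    fn statistics(&self) -> Option<Statistics> {
        Some(Statistics {
            row_count: Some(self.rows.len()),
        })
    }
}

/// A source that would need a full scan to count rows, so it reports nothing.
struct UnscannedFile;

impl TableProvider for UnscannedFile {
    fn statistics(&self) -> Option<Statistics> {
        None
    }
}

/// Row count if available, flattening both levels of optionality.
fn known_row_count(table: &dyn TableProvider) -> Option<usize> {
    table.statistics().and_then(|s| s.row_count)
}

/// Toy cost-based decision: use the left input as the hash-join build side
/// when it is known to be no larger than the right input.
fn build_on_left(left: &dyn TableProvider, right: &dyn TableProvider) -> bool {
    match (known_row_count(left), known_row_count(right)) {
        (Some(l), Some(r)) => l <= r,
        // Missing statistics on either side: keep the plan as written.
        _ => true,
    }
}

fn main() {
    let small = MemTable { rows: vec![vec![1], vec![2]] };
    let large = MemTable { rows: vec![vec![0]; 1000] };

    println!("small: {:?}", known_row_count(&small));
    println!("unscanned: {:?}", known_row_count(&UnscannedFile));
    println!("build on left (large, small): {}", build_on_left(&large, &small));
}
```

Because a missing Statistics value and a missing row_count both collapse to None in known_row_count, an optimizer rule written this way can fall back to the unoptimized plan whenever a source has nothing to report.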
--
This message was sent by Atlassian Jira
(v8.3.4#803005)