Andy Grove created ARROW-10781:
----------------------------------

             Summary: [Rust] [DataFusion] TableProvider should provide row 
count statistics
                 Key: ARROW-10781
                 URL: https://issues.apache.org/jira/browse/ARROW-10781
             Project: Apache Arrow
          Issue Type: New Feature
          Components: Rust - DataFusion
            Reporter: Andy Grove


To start building a cost-based optimizer, we need some statistics 
about data sources. The most basic statistic would be the number of rows.

I propose that we add a Statistics struct that initially just makes a total row 
count available but that we can later extend to support more advanced 
statistics.
{code:java}
struct Statistics {
    // Total number of rows, if known
    row_count: Option<usize>,
}
{code}
We can then add a method to TableProvider:
{code:java}
trait TableProvider {
    // Returns None when the data source cannot provide statistics
    fn statistics(&self) -> Option<Statistics>;
}
{code}
Statistics should be optional because not all data sources can provide 
statistics.
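
To make the proposal concrete, here is a minimal, self-contained sketch of how it could look in practice. It is an illustration under assumptions, not the DataFusion implementation: the trait is reduced to the one proposed method (taking &self so it can be called on a trait object), and InMemoryTable, StreamingTable, and estimated_rows are hypothetical names introduced only for this example.

```rust
/// Table-level statistics. Fields are optional so that a source can
/// report only what it actually knows.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Statistics {
    /// Total number of rows, if known.
    pub row_count: Option<usize>,
}

/// Reduced stand-in for the TableProvider trait, showing only the
/// proposed method (the real trait has other methods not shown here).
pub trait TableProvider {
    /// Returns None when the source cannot provide any statistics.
    fn statistics(&self) -> Option<Statistics>;
}

/// Hypothetical source that holds its rows in memory and therefore
/// knows its exact row count.
pub struct InMemoryTable {
    pub rows: Vec<Vec<i64>>,
}

impl TableProvider for InMemoryTable {
    fn statistics(&self) -> Option<Statistics> {
        Some(Statistics {
            row_count: Some(self.rows.len()),
        })
    }
}

/// Hypothetical source that cannot know its size up front.
pub struct StreamingTable;

impl TableProvider for StreamingTable {
    fn statistics(&self) -> Option<Statistics> {
        None
    }
}

/// Hypothetical helper an optimizer might use: flattens the two levels
/// of optionality into a single "row count, if available" value.
pub fn estimated_rows(table: &dyn TableProvider) -> Option<usize> {
    table.statistics().and_then(|s| s.row_count)
}
```

With this shape, a cost-based rule could call estimated_rows on each input and fall back to a default plan whenever it returns None, for example when choosing which side of a join to build from.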


