kou commented on PR #43553:
URL: https://github.com/apache/arrow/pull/43553#issuecomment-2534647955

   > I think the C Data Interface framing came from the original use case
   > (which I would like to not lose sight of); allowing engines like DuckDB to
   > have a way to get statistics when given a C Data Interface stream, so that
   > they can properly do query planning. Even if we reframe this as just a
   > definition for the format of the statistics it might be good to mention the
   > original use case so there's proper context, to help commenters who aren't
   > familiar with query planning.
   
   I've added the original DuckDB use case.
   
   DuckDB may be able to get statistics without `ArrowArrayStream` (it may be
able to call a separate API to get statistics) because `duckdb::TableFunction`
has `table_function_cardinality_t cardinality` and `table_statistics_t
statistics` members.
   See also: 
https://github.com/duckdb/duckdb/blob/v1.1.3/src/function/table/arrow.cpp#L525-L527
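
   A minimal sketch of the idea, not DuckDB's actual API: a table function carries optional callbacks that the planner can call to get a cardinality estimate and per-column statistics, separately from the data stream. Every type and function name below (`TableFunctionSketch`, `BindDataSketch`, `ArrowCardinalitySketch`, and so on) is a hypothetical stand-in; the real `table_function_cardinality_t` and `table_statistics_t` signatures are defined in DuckDB's headers and may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <optional>

// Hypothetical stand-ins for illustration only; these are NOT DuckDB's types.
struct NodeStatsSketch {
  std::optional<std::uint64_t> estimated_cardinality;
};
struct ColumnStatsSketch {
  std::optional<std::uint64_t> null_count;
};

// Hypothetical bind data: statistics the producer supplied out of band,
// i.e. not through the array stream itself.
struct BindDataSketch {
  std::optional<std::uint64_t> row_count;
  std::optional<std::uint64_t> null_count;  // known for column 0 only here
};

// Callback types modeled loosely on the two members named in the comment.
using CardinalityFn =
    std::unique_ptr<NodeStatsSketch> (*)(const BindDataSketch &);
using StatisticsFn =
    std::unique_ptr<ColumnStatsSketch> (*)(const BindDataSketch &, std::size_t);

struct TableFunctionSketch {
  CardinalityFn cardinality = nullptr;  // optional: planner skips if null
  StatisticsFn statistics = nullptr;    // optional: planner skips if null
};

// Reports the row count if the producer provided one.
std::unique_ptr<NodeStatsSketch> ArrowCardinalitySketch(
    const BindDataSketch &bind) {
  auto stats = std::make_unique<NodeStatsSketch>();
  stats->estimated_cardinality = bind.row_count;
  return stats;
}

// Returns nullptr when nothing is known about the requested column.
std::unique_ptr<ColumnStatsSketch> ArrowStatisticsSketch(
    const BindDataSketch &bind, std::size_t column_index) {
  if (column_index != 0 || !bind.null_count) {
    return nullptr;
  }
  auto stats = std::make_unique<ColumnStatsSketch>();
  stats->null_count = bind.null_count;
  return stats;
}

// Wires both callbacks into the (hypothetical) table function.
TableFunctionSketch MakeArrowScanSketch() {
  TableFunctionSketch fn;
  fn.cardinality = ArrowCardinalitySketch;
  fn.statistics = ArrowStatisticsSketch;
  return fn;
}
```

   In this sketch the statistics travel through the bind data rather than through the stream, which is the "separate API" route mentioned above.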
   
   I'll start a discussion on the mailing list tomorrow.
   (I wanted to do it today...)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
