pdet commented on issue #38837:
URL: https://github.com/apache/arrow/issues/38837#issuecomment-2092786273

   Hey guys,
   
   Thank you very much for starting the design of Arrow statistics! That's 
exciting!
   
   We are currently interested in up-front full-column statistics. Specifically: 
   * Count Distinct (Approximate)
   * Cardinality of the table
   * Min-Max
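
   As a rough illustration of what these three statistics mean for a chunked 
column, here is a minimal pure-Python sketch. The names `ColumnStats` and 
`full_column_stats` are hypothetical, the distinct count is exact rather than 
approximate (an engine would more likely use a sketch such as HyperLogLog), and 
nothing here reflects an actual Arrow or DuckDB API.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Optional, Sequence


@dataclass
class ColumnStats:
    """Hypothetical container for up-front, full-column statistics."""
    row_count: int             # cardinality of the table (rows, nulls included)
    distinct_count: int        # exact here; an engine would approximate it
    min_value: Optional[Any]   # None if the column has no non-null values
    max_value: Optional[Any]   # None if the column has no non-null values


def full_column_stats(chunks: Iterable[Sequence[Any]]) -> ColumnStats:
    """Aggregate statistics over every chunk of one column; None marks a null."""
    row_count = 0
    distinct = set()
    min_value = max_value = None
    for chunk in chunks:
        row_count += len(chunk)
        for value in chunk:
            if value is None:
                continue  # nulls count as rows but not as values
            distinct.add(value)
            if min_value is None or value < min_value:
                min_value = value
            if max_value is None or value > max_value:
                max_value = value
    return ColumnStats(row_count, len(distinct), min_value, max_value)


# One column split into three chunks, with a single null in the first chunk.
stats = full_column_stats([[3, 1, None], [7, 3], [2]])
```

   The point of the sketch is that all three values describe the whole column 
rather than any single chunk, which is what makes them usable up front.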
   
   As a clarification, we also use row-group Min-Max for filtering 
optimizations in DuckDB tables, but these cannot benefit Arrow. In Arrow, we 
either push down filters to an Arrow Scanner or create a filter node on top of 
the scanner, and we do not use the Min-Max of chunks for filter optimization.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
