pdet commented on issue #38837: URL: https://github.com/apache/arrow/issues/38837#issuecomment-2092786273
Hey guys,

Thank you very much for starting the design of Arrow statistics! That's exciting!

We are currently interested in up-front full-column statistics. Specifically:

* Count Distinct (Approximate)
* Cardinality of the table
* Min-Max

As a clarification, we also utilize row-group min-max for filtering optimizations in DuckDB tables, but these cannot benefit Arrow. In Arrow, we either push down filters to an Arrow Scanner or create a filter node on top of the scanner, and we do not utilize Min-Max of chunks for filter optimization.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org