Joris Van den Bossche created ARROW-9459:
--------------------------------------------
Summary: [C++][Dataset] Make collecting/parsing statistics optional for ParquetFragment
Key: ARROW-9459
URL: https://issues.apache.org/jira/browse/ARROW-9459
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Joris Van den Bossche
See some timing checks here:
https://github.com/dask/dask/pull/6346#issuecomment-656548675
Parsing all statistics, even from a centralized {{_metadata}} file, can be quite
expensive. If you know in advance that you are not going to use them (e.g. you
are only going to filter on the partition fields and otherwise read all the
data), it would be useful to have an option to disable parsing statistics.
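
To make the request concrete, here is a minimal, self-contained sketch of the idea in plain Python. It is not the Arrow Dataset API: the classes {{RowGroup}} and {{Fragment}}, the {{parse_statistics}} flag, and the helper functions are all hypothetical names chosen for illustration. The point is only that filtering on partition fields never needs the row-group statistics, so decoding them could be skipped when the caller opts out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# NOTE: every name here is hypothetical and only illustrates the request;
# none of it is the Arrow Dataset API.


@dataclass
class RowGroup:
    num_rows: int
    # column name -> (min, max), still in encoded form
    raw_stats: Dict[str, Tuple[bytes, bytes]]


@dataclass
class Fragment:
    partition: Dict[str, str]
    row_groups: List[RowGroup]
    parse_statistics: bool = True  # hypothetical opt-out flag
    decode_calls: int = 0          # counts how many statistics were decoded

    def statistics(self) -> Optional[List[Dict[str, Tuple[int, int]]]]:
        """Decode per-row-group statistics, or return None when disabled."""
        if not self.parse_statistics:
            return None
        decoded = []
        for rg in self.row_groups:
            stats = {}
            for column, (lo, hi) in rg.raw_stats.items():
                stats[column] = (
                    int.from_bytes(lo, "little"),
                    int.from_bytes(hi, "little"),
                )
                self.decode_calls += 1
            decoded.append(stats)
        return decoded


def filter_by_partition(fragments: List[Fragment], key: str, value: str) -> List[Fragment]:
    """Select fragments using partition fields only; statistics are never touched."""
    return [f for f in fragments if f.partition.get(key) == value]


def prune_row_groups(fragment: Fragment, column: str, value: int) -> List[RowGroup]:
    """Drop row groups whose [min, max] range excludes `value`.

    Without statistics nothing can be pruned, so every row group is kept.
    """
    stats = fragment.statistics()
    if stats is None:
        return list(fragment.row_groups)
    return [
        rg
        for rg, s in zip(fragment.row_groups, stats)
        if column not in s or s[column][0] <= value <= s[column][1]
    ]
```

With {{parse_statistics=False}} in this sketch, the partition filter still works and all row groups are read, which matches the use case described above; the cost is that no row-group pruning is possible.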
cc [~rjzamora] [~bkietz] [~fsaintjacques]