This is an automated email from the ASF dual-hosted git repository.
jorisvandenbossche pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git.
from 7e503af ARROW-10389: [Rust] [DataFusion] Make the custom source implementation API more explicit
 add 868777d ARROW-10131: [C++][Dataset][Python] Lazily parse parquet metadata
No new revisions were added by this update.
Summary of changes:
cpp/src/arrow/dataset/file_parquet.cc        | 750 ++++++++++-----------------
cpp/src/arrow/dataset/file_parquet.h         | 172 +++---
cpp/src/arrow/dataset/file_parquet_test.cc   |  28 +-
cpp/src/arrow/dataset/filter.cc              |  36 +-
cpp/src/arrow/dataset/type_fwd.h             |   2 +-
cpp/src/arrow/result.h                       |  18 +-
cpp/src/parquet/exception.h                  |  29 +-
cpp/src/parquet/metadata.cc                  | 168 ++++--
cpp/src/parquet/metadata.h                   |  16 +-
cpp/src/parquet/metadata_test.cc             |   9 +
cpp/src/parquet/statistics.cc                |  28 +
cpp/src/parquet/statistics.h                 |   3 +
python/CMakeLists.txt                        |   1 +
python/pyarrow/_dataset.pyx                  | 135 ++---
python/pyarrow/_parquet.pxd                  |  48 ++
python/pyarrow/_parquet.pyx                  |  82 +--
python/pyarrow/includes/libarrow_dataset.pxd |  22 +-
python/pyarrow/tests/test_dataset.py         |  72 ++-
18 files changed, 747 insertions(+), 872 deletions(-)