etseidl opened a new pull request, #6639:
URL: https://github.com/apache/arrow-rs/pull/6639
# Which issue does this PR close?
Part of #6447. Also see #6582.
# Rationale for this change
The behavior of `ParquetMetaDataReader` when requesting page indexes differs
between synchronous and asynchronous implementations. For historical reasons,
the synchronous methods currently return empty vectors for the `ColumnIndex`
and `OffsetIndex` when page indexes are requested but not present in the file.
The asynchronous methods instead return `None` in that case.
# What changes are included in this PR?
This PR changes the behavior of `ParquetMetaDataReader` to always return
`None` when page indexes are requested but not present. It also changes the
behavior and signatures of the legacy functions `read_columns_indexes` and
`read_offset_indexes`. Both now return the vector wrapped in an `Option`, which
is `None` (rather than an empty vector) when page indexes are not present.
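
To illustrate the difference for callers, here is a minimal sketch that does not use the `parquet` crate. `RowGroupIndex`, `read_index_old`, `read_index_new`, and `describe` are hypothetical stand-ins invented for this example, not the actual types or signatures touched by this PR; they only model "empty vector" versus `None` for a file with no page index.

```rust
/// Hypothetical stand-in for the page index entries of one row group.
/// The real types in the parquet crate are richer than this.
type RowGroupIndex = Vec<u64>;

/// Models the old synchronous behavior: a missing page index is
/// reported as an empty vector.
fn read_index_old(stored: Option<&[RowGroupIndex]>) -> Vec<RowGroupIndex> {
    stored.map(|s| s.to_vec()).unwrap_or_default()
}

/// Models the behavior described in this PR: a missing page index is
/// reported as `None`, matching the asynchronous code path.
fn read_index_new(stored: Option<&[RowGroupIndex]>) -> Option<Vec<RowGroupIndex>> {
    stored.map(|s| s.to_vec())
}

/// Caller-side check under the new behavior: absence is explicit in the type.
fn describe(index: &Option<Vec<RowGroupIndex>>) -> &'static str {
    match index {
        Some(_) => "page index present",
        None => "no page index in file",
    }
}
```

Under the old behavior a caller had to test for an empty vector to detect a missing index; with an `Option` the missing case is a distinct variant that can be matched on directly.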
# Are there any user-facing changes?
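Yes. The synchronous `ParquetMetaDataReader` methods now return `None` instead
of empty vectors when page indexes are requested but not present, and the
signatures of `read_columns_indexes` and `read_offset_indexes` change, so
callers of those functions must be updated.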