This is an automated email from the ASF dual-hosted git repository.
alamb pushed a change to branch revert-5932-revert-5860-interleave-bloom
in repository https://gitbox.apache.org/repos/asf/arrow-rs.git
discard 5f78af7062 Revert "Revert "Write Bloom filters between row groups
instead of the end (#…"
add 7ef6be4cd9 Preallocate for `FixedSizeList` in `concat` (#5862)
add 13c9e9083e Add eq benchmark for StringArray/StringViewArray (#5924)
add 9413cd3ffd Add the ability for Maps to cast to another case where the
field names are different (#5703)
add 86eb191fb2 fix(ipc): set correct row count when reading struct arrays
with zero fields (#5918)
add 02fb714142 Update zstd-sys requirement from >=2.0.0, <2.0.10 to
>=2.0.0, <2.0.12 (#5913)
add 0ea074af77 Add `MultipartUpload` blanket implementation for `Box<W>`
(#5919)
add a35214f92a Fix typo in benchmarks (#5935)
add 063ac13af0 row format benches for bool & nullable int (#5943)
add 0c3a24d2a4 Implement arrow-row encoding/decoding for view types (#5922)
add c084342aef Better document support for nested comparison (#5942)
add 3139a08baf Update quick-xml requirement from 0.32.0 to 0.33.0 in
/object_store (#5946)
add 66bada54cf Implement like/ilike etc for StringViewArray (#5931)
add 460fd5506c test: Add unit test for extending slice of list array
(#5948)
add 2323c74ce2 Update quick-xml requirement from 0.33.0 to 0.34.0 in
/object_store (#5954)
add 901fbe877f Minor: fixup contribution guide (#5952)
add 0e56fd5c4d chore(5797): change default data_page_row_limit to 20k
(#5957)
add 4b326f6d05 Improve error message for unsupported nested comparison
(#5961)
add 45190ab528 feat: add max_bytes and min_bytes on PageIndex (#5950)
add 6b031629e1 Faster primitive arrays encoding into row format (#5858)
add e5604aae63 Document process for PRs with breaking changes (#5953)
add 1ef22e5a9b `like` benchmark for StringView (#5936)
add ee5572163a Expose `IntervalMonthDayNano` and `IntervalDayTime` and
update docs (#5928)
add 6bc9514aef implement sort for view types (#5963)
add 0a4d8a14b5 Fix FFI array offset handling (#5964)
add c5b5eda77b Add benchmark for reading binary/binary view from parquet
(#5968)
add a7b4a3b10b Add view buffer for parquet reader (#5970)
add 871c999601 Handle flight dictionary ID assignment automatically (#5971)
add a4d21679d4 Make ObjectStoreScheme public (#5912)
add 6230435b4f Add operation in ArrowNativeTypeOp::neg_check error message
(#5944) (#5980)
add 62c1615e8f feat: support reading OPTIONAL column in parquet_derive
(#5717)
add 8e9bdceb44 Update quick-xml requirement from 0.34.0 to 0.35.0 in
/object_store (#5983)
add 8284e5f4ff Reduce repo size by removing accumulative commits in CI job
(#5982)
add bb1250ceb8 Minor: fix clippy complaint in parquet_derive (#5984)
add cad573571b Add user defined metadata (#5915)
add 63516742e7 Provide Arrow Schema Hint to Parquet Reader - Alternative 2
(#5939)
add 3b93a4b062 WriteMultipart Abort on MultipartUpload::complete Error
(#5974)
add 859c4ad486 Implement directly build byte view array on top of parquet
buffer (#5972)
add ebc1cb1e5b fix: error in case of invalid interval expression (#5987)
add e61fb621d1 Add ParquetMetadata::memory_size size estimation (#5965)
add 5c6f857d9a feat(5851): ArrowWriter memory usage (#5967)
add 035b5899f3 Prepare arrow `52.1.0` (#5992)
add e7a0008e59 Implement dictionary support for reading ByteView from
parquet (#5973)
add 1f0b000958 implement `DataType::try_form(&str)` (#5994)
add bed37466af Add additional documentation and examples to DataType
(#5997)
add fd5e67df9f Automatically cleanup empty dirs in LocalFileSystem (#5978)
add a85768db9f Add FlightSqlServiceClient::new_from_inner (#6003)
add b9562b9550 fix doc ci in latest rust nightly version (#6012)
add 2b986dfd5d Deduplicate strings/binarys when building view types (#6005)
add af4d6b624e Fast utf8 validation when loading string view from parquet
(#6009)
add b9e4497258 Rename `Schema::all_fields` to `flattened_fields` (#6001)
add 8355823f74 Complete `StringViewArray` and `BinaryViewArray` parquet
decoder: implement delta byte array and delta length byte array encoding
(#6004)
add 76fbdbc060 Update zstd-sys requirement from >=2.0.0, <2.0.12 to
>=2.0.0, <2.0.13 (#6019)
add c47f230c9c Update clap test (#6028)
add 3ce8e842af Unsafe improvements: core `parquet` crate. (#6024)
add cb3babc9d1 Improve performance reading `ByteViewArray` from parquet by
removing an implicit copy (#6031)
add 826577a764 Update quick-xml requirement from 0.35.0 to 0.36.0 in
/object_store (#6032)
add 2424da25dd Fix `hashbrown` version in `arrow-array`, remove from
`arrow-row` (#6035)
add 50b1e30aa6 Additional tests for parquet reader utf8 validation (#6023)
add e70c16d67d Clean up unused code for view types in offset buffer (#6040)
add 920a94470d Move avoid using copy-based buffer creation (#6039)
add 199ce9190c Fix 5592: Colon (:) in in object_store::path::{Path} is not
handled on Windows (#5830)
add 9acc9fa0b8 Minor API adjustments for StringViewBuilder (#6047)
add 0002b4ded7 Fix typo in GenericByteViewArray documentation (#6054)
add 074bcb5793 Directly decode String/BinaryView types from arrow-row
format (#6044)
add 31b8ba023e Add begin/end_transaction methods in FlightSqlServiceClient
(#6026)
add 6d4e2f2cea Implement min max support for string/binary view types
(#6053)
add 66390ff8ec Add parquet `StatisticsConverter` for arrow reader (#6046)
add 741bbf6854 bump `tonic` to 0.12 and `prost` to 0.13 for `arrow-flight`
(#6041)
add 8f76248222 Remove `impl<T: AsRef<[u8]>> From<T> for Buffer` that
easily accidentally copies data (#6043)
add bb5f12bd78 Make display of interval types more pretty (#6006)
add 756b1fb26d Update snafu (#5930)
add 42c663dfc8 Revert "Revert "Write Bloom filters between row groups
instead of the end (#…"
This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version. This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:
 * -- * -- B -- O -- O -- O   (5f78af7062)
            \
             N -- N -- N   refs/heads/revert-5932-revert-5860-interleave-bloom (42c663dfc8)
You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.
Any revisions marked "omit" are not gone; other references still
refer to them. Any revisions marked "discard" are gone forever.
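
The O and N revisions described above can be listed locally with standard git commands. The following is a minimal sketch, not part of the notification itself: it builds a throwaway repository (the temporary directory, the `commit` helper, and the commit messages are all hypothetical) with the same shape as the diagram, then uses `git merge-base` to find the common base B and `git rev-list` to count the revisions on each side.

```shell
#!/bin/sh
set -eu

# Hypothetical throwaway repository, used only to reproduce the diagram's shape.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.invalid"
git config user.name "Demo"

# Helper (an assumption for this sketch): create an empty commit with a message.
commit() { git commit -q --allow-empty -m "$1"; }

commit "B"                        # common base
base=$(git rev-parse HEAD)
commit "O1"; commit "O2"          # old revisions, later left behind
old=$(git rev-parse HEAD)

git reset -q --hard "$base"       # rewind the branch, as a force push effectively does
commit "N1"; commit "N2"; commit "N3"
new=$(git rev-parse HEAD)

# B is the merge base of the old and new branch tips.
merge_base=$(git merge-base "$old" "$new")

# "$new..$old" selects commits reachable from old but not from new (the O's);
# "$old..$new" selects the reverse (the N's).
discarded=$(git rev-list --count "$new..$old")
added=$(git rev-list --count "$old..$new")

echo "base is B: $([ "$merge_base" = "$base" ] && echo yes || echo no)"
echo "old-only revisions (O): $discarded"
echo "new-only revisions (N): $added"
```

Against a real clone, the same two `git rev-list` ranges could be run with the old and new tips reported in this email (5f78af7062 and 42c663dfc8), provided the old objects are still present locally.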
No new revisions were added by this update.
Summary of changes:
.github/workflows/docs.yml | 2 +
.github/workflows/object_store.yml | 13 +
CHANGELOG-old.md | 182 ++
CHANGELOG.md | 267 +--
CONTRIBUTING.md | 21 +-
Cargo.toml | 32 +-
README.md | 6 +
arrow-arith/src/aggregate.rs | 188 +-
arrow-arith/src/numeric.rs | 8 +-
arrow-array/Cargo.toml | 2 +-
arrow-array/src/arithmetic.rs | 2 +-
arrow-array/src/array/byte_view_array.rs | 73 +-
arrow-array/src/array/dictionary_array.rs | 4 +-
arrow-array/src/array/fixed_size_list_array.rs | 2 +-
arrow-array/src/array/map_array.rs | 18 +-
arrow-array/src/array/primitive_array.rs | 38 +-
arrow-array/src/array/struct_array.rs | 4 +-
.../src/builder/generic_bytes_view_builder.rs | 161 +-
arrow-array/src/ffi.rs | 63 +-
arrow-array/src/types.rs | 76 +-
arrow-buffer/src/buffer/immutable.rs | 35 +-
arrow-buffer/src/builder/null.rs | 8 +
arrow-buffer/src/interval.rs | 70 +
arrow-buffer/src/util/bit_chunk_iterator.rs | 2 +-
arrow-cast/src/base64.rs | 8 +-
arrow-cast/src/cast/map.rs | 74 +
arrow-cast/src/cast/mod.rs | 317 ++-
arrow-cast/src/display.rs | 196 +-
arrow-cast/src/parse.rs | 4 +-
arrow-cast/src/pretty.rs | 54 +-
arrow-data/src/data.rs | 4 +-
arrow-data/src/ffi.rs | 55 +-
arrow-data/src/transform/mod.rs | 20 +-
arrow-flight/Cargo.toml | 12 +-
arrow-flight/examples/flight_sql_server.rs | 6 +-
arrow-flight/gen/Cargo.toml | 4 +-
arrow-flight/src/arrow.flight.protocol.rs | 36 +-
arrow-flight/src/encode.rs | 373 ++-
arrow-flight/src/sql/arrow.flight.protocol.sql.rs | 12 +-
arrow-flight/src/sql/client.rs | 76 +-
arrow-flight/src/sql/metadata/sql_info.rs | 4 +-
arrow-flight/src/sql/mod.rs | 2 +
arrow-flight/src/utils.rs | 8 +-
arrow-flight/tests/common/trailers_layer.rs | 32 +-
arrow-flight/tests/flight_sql_client_cli.rs | 376 +--
arrow-integration-test/src/lib.rs | 6 +-
arrow-integration-testing/Cargo.toml | 4 +-
.../flight_client_scenarios/integration_test.rs | 4 +-
.../flight_server_scenarios/integration_test.rs | 7 +-
arrow-ipc/src/compression.rs | 4 +-
arrow-ipc/src/reader.rs | 35 +-
arrow-ipc/src/writer.rs | 134 +-
arrow-json/src/reader/mod.rs | 8 +-
arrow-json/src/writer.rs | 18 +-
arrow-ord/src/cmp.rs | 212 +-
arrow-ord/src/ord.rs | 17 +
arrow-ord/src/sort.rs | 70 +-
arrow-row/Cargo.toml | 1 -
arrow-row/src/fixed.rs | 95 +-
arrow-row/src/lib.rs | 70 +-
arrow-row/src/variable.rs | 80 +
arrow-schema/src/datatype.rs | 112 +-
arrow-schema/src/datatype_parse.rs | 783 ++++++
arrow-schema/src/lib.rs | 1 +
arrow-schema/src/schema.rs | 44 +-
arrow-select/src/concat.rs | 89 +-
arrow-select/src/dictionary.rs | 2 +-
arrow-string/src/like.rs | 646 +++--
arrow-string/src/predicate.rs | 11 +-
arrow-string/src/substring.rs | 4 +-
arrow/benches/comparison_kernels.rs | 102 +-
arrow/benches/concatenate_kernel.rs | 20 +
arrow/benches/row_format.rs | 28 +-
arrow/examples/builders.rs | 4 +-
arrow/examples/tensor_builder.rs | 2 +-
arrow/tests/array_equal.rs | 24 +-
arrow/tests/array_transform.rs | 92 +-
arrow/tests/array_validation.rs | 2 +-
dev/release/update_change_log.sh | 4 +-
object_store/Cargo.toml | 4 +-
object_store/src/attributes.rs | 17 +-
object_store/src/aws/client.rs | 6 +
object_store/src/azure/client.rs | 6 +
object_store/src/buffered.rs | 7 +-
object_store/src/client/get.rs | 64 +-
object_store/src/client/header.rs | 4 +
object_store/src/gcp/client.rs | 6 +
object_store/src/http/client.rs | 3 +
object_store/src/integration.rs | 1 +
object_store/src/lib.rs | 2 +-
object_store/src/local.rs | 127 +-
object_store/src/parse.rs | 48 +-
object_store/src/upload.rs | 25 +-
parquet/Cargo.toml | 16 +-
parquet/benches/arrow_reader.rs | 229 +-
parquet/benches/arrow_statistics.rs | 269 +++
parquet/src/arrow/array_reader/builder.rs | 2 +-
parquet/src/arrow/array_reader/byte_array.rs | 177 +-
parquet/src/arrow/array_reader/byte_view_array.rs | 751 ++++++
parquet/src/arrow/array_reader/list_array.rs | 2 +-
parquet/src/arrow/array_reader/mod.rs | 5 +-
parquet/src/arrow/arrow_reader/mod.rs | 636 ++++-
parquet/src/arrow/arrow_reader/statistics.rs | 2536 ++++++++++++++++++++
parquet/src/arrow/arrow_writer/byte_array.rs | 38 +
parquet/src/arrow/arrow_writer/levels.rs | 4 +-
parquet/src/arrow/arrow_writer/mod.rs | 111 +-
parquet/src/arrow/async_reader/mod.rs | 2 +-
parquet/src/arrow/async_writer/mod.rs | 25 +-
parquet/src/arrow/buffer/mod.rs | 1 +
parquet/src/arrow/buffer/offset_buffer.rs | 100 +-
parquet/src/arrow/buffer/view_buffer.rs | 193 ++
parquet/src/arrow/mod.rs | 26 +
parquet/src/bin/parquet-fromcsv.rs | 7 +-
parquet/src/bloom_filter/mod.rs | 8 +-
parquet/src/column/writer/encoder.rs | 29 +-
parquet/src/column/writer/mod.rs | 22 +-
parquet/src/data_type.rs | 58 +-
.../encoding/byte_stream_split_encoder.rs | 5 +
parquet/src/encodings/encoding/dict_encoder.rs | 17 +
parquet/src/encodings/encoding/mod.rs | 40 +
parquet/src/encodings/rle.rs | 7 +
parquet/src/file/metadata/memory.rs | 228 ++
parquet/src/file/{metadata.rs => metadata/mod.rs} | 90 +-
parquet/src/file/page_index/index.rs | 50 +-
parquet/src/file/properties.rs | 6 +-
parquet/src/schema/types.rs | 36 +
parquet/src/util/bit_util.rs | 50 +-
parquet/src/util/interner.rs | 12 +
parquet/tests/arrow_reader/mod.rs | 1003 ++++++++
parquet/tests/arrow_reader/statistics.rs | 2143 +++++++++++++++++
parquet_derive/src/parquet_field.rs | 59 +-
parquet_derive_test/src/lib.rs | 60 +
132 files changed, 13085 insertions(+), 1973 deletions(-)
create mode 100644 arrow-cast/src/cast/map.rs
create mode 100644 arrow-schema/src/datatype_parse.rs
create mode 100644 parquet/benches/arrow_statistics.rs
create mode 100644 parquet/src/arrow/array_reader/byte_view_array.rs
create mode 100644 parquet/src/arrow/arrow_reader/statistics.rs
create mode 100644 parquet/src/arrow/buffer/view_buffer.rs
create mode 100644 parquet/src/file/metadata/memory.rs
rename parquet/src/file/{metadata.rs => metadata/mod.rs} (92%)
create mode 100644 parquet/tests/arrow_reader/mod.rs
create mode 100644 parquet/tests/arrow_reader/statistics.rs