This is an automated email from the ASF dual-hosted git repository.
nevime pushed a change to branch rust-parquet-arrow-writer
in repository https://gitbox.apache.org/repos/asf/arrow.git.
omit 16f95e9 ARROW-8426: [Rust] [Parquet] Add support for writing dictionary types
omit 3fb8bfa ARROW-10095: [Rust] Update rust-parquet-arrow-writer branch's encode_arrow_schema with ipc changes
omit 28b075d ARROW-8423: [Rust] [Parquet] Serialize Arrow schema metadata
omit 32d328b ARROW-8289: [Rust] Parquet Arrow writer with nested support
add a2beceb ARROW-10059: [R][Doc] Give more advice on how to set up C++ build
add 3fc37f4 ARROW-9869: [R] Implement full S3FileSystem/S3Options constructor
add 5f8a792 ARROW-9752: [Rust] [DataFusion] Add support for User-Defined Aggregate Functions.
add 536a1d9 ARROW-10096: [Rust] [DataFusion] Removed unused code
add d806160 ARROW-10086: [Rust] Renamed min/max_large_string kernels
add fe862a4 ARROW-9981: [Rust] [Flight] Expose IpcWriteOptions on utils
add 97ade81 ARROW-8601: [Go][FOLLOWUP] Fix RAT violations related to Flight in Go
add c646738 ARROW-10098: [R][Doc] Fix copy_files doc mismatch
add 87dd7e9 ARROW-9992: [C++][Python] Refactor python to arrow conversions based on a reusable conversion API
add a0c2e81 ARROW-9769: [Python] Un-skip tests with fsspec in-memory filesystems
add cac5e62 ARROW-10071: [R] segfault with ArrowObject from previous session, or saved
add ae1d24e ARROW-9839: [Rust] [DataFusion] Implement ExecutionPlan.as_any
add 75cdad4 ARROW-9754: [Rust] [DataFusion] Implement async in ExecutionPlan trait
add d3786a1 ARROW-10069: [Java] Support running Java benchmarks from command line
add c4b0d0e ARROW-10019: [Rust] Add substring kernel
add 848c225 ARROW-9965: [Java] Improve performance of BaseFixedWidthVector.setSafe by optimizing capacity calculations
add 515daab ARROW-8618: [C++] Clean up some redundant std::move()s
add 477c102 ARROW-9924: [C++][Dataset] Enable per-column parallelism for single ParquetFileFragment scans
add 4fd0664 ARROW-10090: [C++][Compute] Improve mode kernel
add 09dc0cc ARROW-10104: [Python] Separate tests into its own conda package
add 057a87f ARROW-10125: [R] Int64 downcast check doesn't consider all chunks
add 571d48e ARROW-10119: [C++] Fix Parquet crashes on invalid input
add 4b0448b ARROW-10124: [C++] Don't restrict permissions when creating files
add 25c736d ARROW-10116: [Python][Packaging] Fix gRPC linking error in macOS wheels builds
add 3f3b604 ARROW-9641: [C++][Gandiva] Implement round() for floating point and double floating point numbers
add 7984cf7 ARROW-10130: [C++][Dataset] Ensure ParquetFileFragment::SplitByRowGroup preserves the 'has_complete_metadata' status
add c5fa23e ARROW-9640: [C++][Gandiva] Implement round() for integers and long integers
add ffb6e28 ARROW-10070: [C++][Compute] Implement var and std aggregate kernel
add 3fae71b ARROW-10050: [C++][Gandiva] Implement concat() in Gandiva for up to 10 arguments
add 991a55f ARROW-10137: [C++][R] Move nameof.h into R subproject
add 97879eb ARROW-9761: [C/C++] Add experimental C stream inferface
add 4e563bf ARROW-7372: [C++] Allow creating dictionary array from simple JSON
add 424bcc6 ARROW-10102: [C++] Refactor BasicDecimal128 Multiplication to use unsigned helper
add fa44134 ARROW-10150: [C++] Fix crashes on invalid Parquet file
add a1157b7 ARROW-10136: [Rust]: Fix null handling in StringArray and BinaryArray filtering, add BinaryArray::from_opt_vec
add 9bff7c4 ARROW-10054: [Python] don't crash when slice offset > length
add 1b70f65 ARROW-10148: [Rust] Improved rust/lib.rs that is shown in docs.rs
add 55028a8 ARROW-10084: [Rust] [DataFusion] Added length of LargeStringArray and fixed undefined behavior.
add 0597f48 ARROW-10046: [Rust] [DataFusion] Made `RecordBatchReader` implement Iterator
add ad712e5 ARROW-10103: [Rust] Add contains kernel
add c68a76c ARROW-10057: [C++] Add hand-written Parquet nested tests
add 9fb1704 ARROW-10157: [Rust] Add an example to the take kernel
add 238a949 ARROW-10160: [Rust] Improve DictionaryType documentation (clarify which type is which)
add 51a3c88 ARROW-10127: Update specification for Decimal to allow for 256-bits
new e8ac2bf ARROW-8289: [Rust] Parquet Arrow writer with nested support
new 5394b3e ARROW-8423: [Rust] [Parquet] Serialize Arrow schema metadata
new 852b5ed ARROW-10095: [Rust] Update rust-parquet-arrow-writer branch's encode_arrow_schema with ipc changes
new 7f743c2 ARROW-8426: [Rust] [Parquet] Add support for writing dictionary types
This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version. This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:
 * -- * -- B -- O -- O -- O   (16f95e9)
            \
             N -- N -- N   refs/heads/rust-parquet-arrow-writer (7f743c2)
You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.
Any revisions marked "omit" are not gone; other references still
refer to them. Any revisions marked "discard" are gone forever.
The 4 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
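The four labels used above (omit, discard, add, new) can be read as simple set operations on the revisions reachable before and after the push. The following is only an illustrative sketch of that reading, not the code the notification system runs: the function name `classify_revisions` and the contents of the `elsewhere` set are hypothetical, and the revision sets are a small hand-picked subset of the hashes in this email (`B` is the placeholder for the common base from the diagram).

```python
def classify_revisions(old_tip, new_tip, elsewhere):
    """Label revisions touched by a force push.

    old_tip   -- revisions reachable from the branch before the push
    new_tip   -- revisions reachable from the branch after the push
    elsewhere -- revisions reachable from any other reference
                 (hypothetical input; assumed to be known to the caller)
    """
    removed = old_tip - new_tip   # the "O" revisions in the diagram
    added = new_tip - old_tip     # the "N" revisions in the diagram
    return {
        "omit": removed & elsewhere,     # dropped here, still referenced elsewhere
        "discard": removed - elsewhere,  # dropped here and referenced nowhere
        "add": added & elsewhere,        # already in the repository, new to this ref
        "new": added - elsewhere,        # entirely new to the repository
    }


# Illustrative subset of the revisions listed in this email.
old_tip = {"B", "16f95e9"}
new_tip = {"B", "a2beceb", "7f743c2"}
elsewhere = {"B", "16f95e9", "a2beceb"}  # assumption: held by other references

result = classify_revisions(old_tip, new_tip, elsewhere)
for label in ("omit", "discard", "add", "new"):
    print(label, sorted(result[label]))
```

Under these assumed inputs, 16f95e9 is classified as "omit" rather than "discard" because another reference still holds it, which matches how it is reported above.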
Summary of changes:
c_glib/test/dataset/test-scan-options.rb | 2 +-
cpp/cmake_modules/ThirdpartyToolchain.cmake | 7 +
cpp/src/arrow/CMakeLists.txt | 1 +
cpp/src/arrow/array/array_base.cc | 8 +-
cpp/src/arrow/array/array_binary_test.cc | 22 +
cpp/src/arrow/array/array_list_test.cc | 32 +
cpp/src/arrow/array/builder_base.cc | 6 +
cpp/src/arrow/array/builder_base.h | 9 +
cpp/src/arrow/array/builder_binary.h | 48 +-
cpp/src/arrow/array/builder_dict.h | 6 +
cpp/src/arrow/array/builder_nested.cc | 25 +
cpp/src/arrow/array/builder_nested.h | 37 +-
cpp/src/arrow/array/builder_primitive.h | 3 +
cpp/src/arrow/builder.cc | 1 +
cpp/src/arrow/builder.h | 4 +-
cpp/src/arrow/c/abi.h | 38 +
cpp/src/arrow/c/bridge.cc | 194 +++
cpp/src/arrow/c/bridge.h | 34 +
cpp/src/arrow/c/bridge_test.cc | 252 +++-
cpp/src/arrow/c/helpers.h | 30 +
cpp/src/arrow/c/util_internal.h | 7 +
cpp/src/arrow/compute/api_aggregate.cc | 10 +
cpp/src/arrow/compute/api_aggregate.h | 42 +-
cpp/src/arrow/compute/kernels/aggregate_basic.cc | 2 +
.../compute/kernels/aggregate_basic_internal.h | 2 +
cpp/src/arrow/compute/kernels/aggregate_mode.cc | 61 +-
cpp/src/arrow/compute/kernels/aggregate_test.cc | 206 +++
cpp/src/arrow/compute/kernels/aggregate_var_std.cc | 202 +++
cpp/src/arrow/dataset/file_base.cc | 4 +-
cpp/src/arrow/dataset/file_csv_test.cc | 8 +-
cpp/src/arrow/dataset/file_ipc_test.cc | 8 +-
cpp/src/arrow/dataset/file_parquet.cc | 10 +-
cpp/src/arrow/dataset/file_parquet.h | 6 +
cpp/src/arrow/dataset/file_parquet_test.cc | 12 +-
cpp/src/arrow/dataset/scanner.cc | 2 +-
cpp/src/arrow/dataset/scanner.h | 4 +-
cpp/src/arrow/dataset/test_util.h | 4 +-
cpp/src/arrow/ipc/json_simple.cc | 149 +-
cpp/src/arrow/ipc/json_simple_test.cc | 205 ++-
cpp/src/arrow/python/CMakeLists.txt | 1 +
cpp/src/arrow/python/common.h | 93 +-
cpp/src/arrow/python/datetime.cc | 9 +
cpp/src/arrow/python/extension_type.h | 1 +
cpp/src/arrow/python/inference.cc | 30 +-
cpp/src/arrow/python/inference.h | 10 +-
cpp/src/arrow/python/ipc.cc | 67 +
cpp/src/arrow/python/{benchmark.h => ipc.h} | 30 +-
cpp/src/arrow/python/numpy_to_arrow.cc | 10 +-
cpp/src/arrow/python/python_test.cc | 27 +-
cpp/src/arrow/python/python_to_arrow.cc | 1544 ++++++++------------
cpp/src/arrow/python/python_to_arrow.h | 28 +-
cpp/src/arrow/python/serialize.cc | 4 +-
cpp/src/arrow/result.h | 4 +-
cpp/src/arrow/scalar.h | 4 +-
cpp/src/arrow/scalar_test.cc | 8 +
cpp/src/arrow/testing/gtest_util.h | 4 +-
cpp/src/arrow/type_traits.h | 47 +-
cpp/src/arrow/util/basic_decimal.cc | 110 +-
cpp/src/arrow/util/converter.h | 324 ++++
cpp/src/arrow/util/decimal_test.cc | 8 +
cpp/src/arrow/util/hashing.h | 5 +
cpp/src/arrow/util/io_util.cc | 11 +-
cpp/src/arrow/util/iterator.h | 4 +-
cpp/src/arrow/util/iterator_test.cc | 2 +-
cpp/src/gandiva/function_registry_arithmetic.cc | 6 +
cpp/src/gandiva/function_registry_string.cc | 73 +
cpp/src/gandiva/precompiled/extended_math_ops.cc | 119 ++
.../gandiva/precompiled/extended_math_ops_test.cc | 32 +
cpp/src/gandiva/precompiled/string_ops.cc | 484 ++++++
cpp/src/gandiva/precompiled/string_ops_test.cc | 107 +-
cpp/src/gandiva/precompiled/types.h | 110 ++
cpp/src/parquet/arrow/arrow_reader_writer_test.cc | 140 ++
cpp/src/parquet/arrow/reader.cc | 49 +-
cpp/src/parquet/arrow/schema.cc | 24 +-
cpp/src/parquet/arrow/test_util.h | 6 +
cpp/src/parquet/column_reader.cc | 70 +-
dev/release/rat_exclude_files.txt | 2 +
.../linux_aarch64_python3.6.____cpython.yaml | 6 +-
.../linux_aarch64_python3.7.____cpython.yaml | 6 +-
.../linux_aarch64_python3.8.____cpython.yaml | 6 +-
...a_compiler_version9.2python3.6.____cpython.yaml | 6 +-
...a_compiler_version9.2python3.7.____cpython.yaml | 6 +-
...a_compiler_version9.2python3.8.____cpython.yaml | 6 +-
..._compiler_versionNonepython3.6.____cpython.yaml | 6 +-
..._compiler_versionNonepython3.7.____cpython.yaml | 6 +-
..._compiler_versionNonepython3.8.____cpython.yaml | 6 +-
.../.ci_support/osx_python3.6.____cpython.yaml | 6 +-
.../.ci_support/osx_python3.7.____cpython.yaml | 6 +-
.../.ci_support/osx_python3.8.____cpython.yaml | 6 +-
.../.ci_support/win_python3.6.____cpython.yaml | 4 +-
.../.ci_support/win_python3.7.____cpython.yaml | 4 +-
.../.ci_support/win_python3.8.____cpython.yaml | 4 +-
dev/tasks/conda-recipes/arrow-cpp/bld-pyarrow.bat | 4 +
dev/tasks/conda-recipes/arrow-cpp/build-pyarrow.sh | 4 +
dev/tasks/conda-recipes/arrow-cpp/meta.yaml | 50 +
dev/tasks/homebrew-formulae/travis.osx.r.yml | 6 +-
dev/tasks/python-wheels/osx-build.sh | 4 +-
dev/tasks/python-wheels/travis.osx.yml | 15 +-
docs/source/cpp/api.rst | 1 +
docs/source/cpp/api/{utilities.rst => c_abi.rst} | 44 +-
docs/source/cpp/compute.rst | 4 +
docs/source/format/CDataInterface.rst | 34 +-
docs/source/format/CStreamInterface.rst | 218 +++
docs/source/index.rst | 1 +
docs/source/python/filesystems.rst | 2 +-
format/Schema.fbs | 10 +-
java/performance/pom.xml | 36 +
.../apache/arrow/vector/BaseFixedWidthVector.java | 23 +-
.../java/org/apache/arrow/vector/BitVector.java | 10 +-
python/pyarrow/_csv.pyx | 4 +-
python/pyarrow/_dataset.pyx | 37 +-
python/pyarrow/_flight.pyx | 15 +-
python/pyarrow/array.pxi | 56 +-
python/pyarrow/cffi.py | 12 +
python/pyarrow/dataset.py | 2 +-
python/pyarrow/includes/libarrow.pxd | 43 +-
python/pyarrow/includes/libarrow_dataset.pxd | 1 +
python/pyarrow/ipc.pxi | 121 +-
python/pyarrow/ipc.py | 26 +-
python/pyarrow/lib.pxd | 2 +-
python/pyarrow/parquet.py | 73 +-
python/pyarrow/scalar.pxi | 53 +-
python/pyarrow/table.pxi | 3 +
python/pyarrow/tests/strategies.py | 93 +-
python/pyarrow/tests/test_array.py | 19 +-
python/pyarrow/tests/test_cffi.py | 110 +-
python/pyarrow/tests/test_convert_builtin.py | 376 ++++-
python/pyarrow/tests/test_dataset.py | 5 +
python/pyarrow/tests/test_fs.py | 8 -
python/pyarrow/tests/test_io.py | 16 +
python/pyarrow/tests/test_ipc.py | 35 +
python/pyarrow/tests/test_pandas.py | 32 +
python/pyarrow/tests/test_scalars.py | 17 +-
python/pyarrow/tests/test_table.py | 24 +
python/pyarrow/tests/test_types.py | 11 +
r/NAMESPACE | 3 +
r/NEWS.md | 2 +-
r/R/arrow-package.R | 2 +-
r/R/arrowExports.R | 8 +-
r/R/csv.R | 9 +-
r/R/feather.R | 18 +-
r/R/filesystem.R | 94 +-
r/R/io.R | 10 +-
r/R/ipc_stream.R | 10 +-
r/R/json.R | 13 +-
r/R/parquet.R | 13 +-
r/README.md | 58 +
r/man/CsvTableReader.Rd | 2 +-
r/man/FileSystem.Rd | 37 +-
r/man/copy_files.Rd | 33 +
r/man/read_delim_arrow.Rd | 4 +
r/man/read_feather.Rd | 11 +-
r/man/read_ipc_stream.Rd | 5 +-
r/man/read_json_arrow.Rd | 11 +-
r/man/read_parquet.Rd | 4 +
r/man/write_feather.Rd | 4 +
r/man/write_ipc_stream.Rd | 5 +-
r/man/write_parquet.Rd | 9 +-
r/src/array_to_vector.cpp | 8 +-
r/src/arrowExports.cpp | 46 +-
r/src/arrow_cpp11.h | 15 +-
r/src/filesystem.cpp | 43 +-
r/src/nameof.h | 86 ++
r/tests/testthat/helper-skip.R | 5 +
r/tests/testthat/test-arrow.R | 13 +
r/tests/testthat/test-chunked-array.R | 7 +
r/tests/testthat/test-filesystem.R | 7 +
r/tests/testthat/test-s3-minio.R | 165 +++
r/vignettes/fs.Rmd | 92 +-
rust/arrow-flight/src/utils.rs | 107 +-
rust/arrow/README.md | 67 +-
rust/arrow/src/array/array.rs | 85 +-
rust/arrow/src/compute/kernels/aggregate.rs | 73 +-
rust/arrow/src/compute/kernels/comparison.rs | 234 ++-
rust/arrow/src/compute/kernels/filter.rs | 43 +-
rust/arrow/src/compute/kernels/length.rs | 249 ++--
rust/arrow/src/compute/kernels/mod.rs | 1 +
rust/arrow/src/compute/kernels/substring.rs | 274 ++++
rust/arrow/src/compute/kernels/take.rs | 26 +-
rust/arrow/src/compute/util.rs | 78 +-
rust/arrow/src/csv/reader.rs | 81 +-
rust/arrow/src/datatypes.rs | 12 +-
rust/arrow/src/ipc/gen/Message.rs | 2 +-
rust/arrow/src/ipc/gen/Schema.rs | 9 +-
rust/arrow/src/ipc/gen/SparseTensor.rs | 10 +-
rust/arrow/src/ipc/reader.rs | 170 ++-
rust/arrow/src/ipc/writer.rs | 22 +-
rust/arrow/src/lib.rs | 107 +-
rust/arrow/src/memory.rs | 4 +-
rust/arrow/src/record_batch.rs | 10 +-
rust/arrow/src/util/integration_util.rs | 4 +-
rust/benchmarks/Cargo.toml | 2 +-
rust/benchmarks/src/bin/nyctaxi.rs | 13 +-
rust/benchmarks/src/bin/tpch.rs | 9 +-
rust/datafusion/Cargo.toml | 3 +-
rust/datafusion/README.md | 3 +-
rust/datafusion/benches/aggregate_query_sql.rs | 20 +-
rust/datafusion/benches/math_query_sql.rs | 27 +-
rust/datafusion/benches/sort_limit_query_sql.rs | 47 +-
rust/datafusion/examples/csv_sql.rs | 5 +-
rust/datafusion/examples/dataframe.rs | 5 +-
rust/datafusion/examples/flight_client.rs | 5 +-
rust/datafusion/examples/flight_server.rs | 5 +-
rust/datafusion/examples/memory_table_api.rs | 5 +-
rust/datafusion/examples/parquet_sql.rs | 5 +-
rust/datafusion/examples/simple_udaf.rs | 169 +++
rust/datafusion/examples/simple_udf.rs | 5 +-
rust/datafusion/src/bin/repl.rs | 9 +-
rust/datafusion/src/dataframe.rs | 15 +-
rust/datafusion/src/datasource/csv.rs | 61 +-
rust/datafusion/src/datasource/memory.rs | 26 +-
rust/datafusion/src/datasource/mod.rs | 2 +-
rust/datafusion/src/datasource/parquet.rs | 72 +-
rust/datafusion/src/execution/context.rs | 326 +++--
rust/datafusion/src/execution/dataframe_impl.rs | 7 +-
rust/datafusion/src/logical_plan/mod.rs | 48 +
rust/datafusion/src/optimizer/utils.rs | 6 +
rust/datafusion/src/physical_plan/aggregates.rs | 13 +-
rust/datafusion/src/physical_plan/common.rs | 37 +-
rust/datafusion/src/physical_plan/csv.rs | 40 +-
rust/datafusion/src/physical_plan/empty.rs | 25 +-
rust/datafusion/src/physical_plan/explain.rs | 14 +-
rust/datafusion/src/physical_plan/expressions.rs | 24 +-
rust/datafusion/src/physical_plan/filter.rs | 93 +-
rust/datafusion/src/physical_plan/functions.rs | 17 +-
.../datafusion/src/physical_plan/hash_aggregate.rs | 374 +++--
rust/datafusion/src/physical_plan/limit.rs | 38 +-
rust/datafusion/src/physical_plan/memory.rs | 36 +-
rust/datafusion/src/physical_plan/merge.rs | 70 +-
rust/datafusion/src/physical_plan/mod.rs | 10 +-
rust/datafusion/src/physical_plan/parquet.rs | 65 +-
rust/datafusion/src/physical_plan/planner.rs | 25 +-
rust/datafusion/src/physical_plan/projection.rs | 69 +-
rust/datafusion/src/physical_plan/sort.rs | 25 +-
rust/datafusion/src/physical_plan/udaf.rs | 156 ++
rust/datafusion/src/sql/planner.rs | 37 +-
rust/datafusion/src/test/mod.rs | 4 +-
rust/datafusion/tests/sql.rs | 297 ++--
rust/datafusion/tests/user_defined_plan.rs | 156 +-
.../src/bin/arrow-file-to-stream.rs | 12 +-
.../src/bin/arrow-json-integration-test.rs | 15 +-
.../src/bin/arrow-stream-to-file.rs | 9 +-
rust/parquet/src/arrow/arrow_reader.rs | 71 +-
rust/parquet/src/arrow/mod.rs | 4 +-
testing | 2 +-
245 files changed, 9067 insertions(+), 3145 deletions(-)
create mode 100644 cpp/src/arrow/compute/kernels/aggregate_var_std.cc
create mode 100644 cpp/src/arrow/python/ipc.cc
copy cpp/src/arrow/python/{benchmark.h => ipc.h} (56%)
create mode 100644 cpp/src/arrow/util/converter.h
copy docs/source/cpp/api/{utilities.rst => c_abi.rst} (66%)
create mode 100644 docs/source/format/CStreamInterface.rst
create mode 100644 r/man/copy_files.Rd
create mode 100644 r/src/nameof.h
create mode 100644 r/tests/testthat/test-s3-minio.R
create mode 100644 rust/arrow/src/compute/kernels/substring.rs
create mode 100644 rust/datafusion/examples/simple_udaf.rs
create mode 100644 rust/datafusion/src/physical_plan/udaf.rs