This is an automated email from the ASF dual-hosted git repository.
nevime pushed a change to branch rust-parquet-arrow-writer
in repository https://gitbox.apache.org/repos/asf/arrow.git.
omit 2bfa2d3 ARROW-8423: [Rust] [Parquet] Serialize Arrow schema metadata
omit f11b322 ARROW-8289: [Rust] Parquet Arrow writer with nested support
add 8150008 ARROW-9722: [Rust] Shorten key lifetime for dict lookup key
add 586c060 ARROW-9615: [Rust] Added kernel to compute length of a string.
add 525a5e9 ARROW-9693: [CI][Docs] Nightly docs build fails
add 570184b ARROW-9727: [C++] Fix crashes on invalid IPC input (OSS-Fuzz)
add 7efc4f3 ARROW-9714: [Rust] [DataFusion] Implement type coercion rule
for limit and sort
add b2788c5 ARROW-9725: [Rust] [DataFusion] SortExec and LimitExec re-use
MergeExec
add d23f0a6 ARROW-9706: [Java] Tests of TestLargeListVector correctly
read offset
add cf1c749 ARROW-9681: [Java] Fix test failures of Arrow Memory - Core
on big-endian platform
add 3368159 ARROW-9734: [Rust] [DataFusion] TableProvider.scan now
returns partitions instead of iterators
add ecba35c ARROW-9726: [Rust] [DataFusion] Do not create parquet reader
thread until execute is called
add 2f36cc4 ARROW-9716: [Rust] [DataFusion] Implement limit on concurrent
threads in MergeExec
add 4e06c1e ARROW-9711: [Rust] Add new benchmark derived from TPC-H
add e553b73 ARROW-9743: [R] Sanitize paths in open_dataset
add 2dcc9a1 ARROW-9654: [Rust][DataFusion] Add `EXPLAIN <SQL>` statement
add 5677f9e ARROW-8581: [C#] Accept and return DateTime from DateXXArray
add 3941b66 ARROW-9739: [CI][Ruby] Don't install gem documents
add 222859d ARROW-9358: [Integration] remove generated_large_batch.json
add 0d0a0cf ARROW-9377: [Java] Support unsigned dictionary indices
add 5d88f10 ARROW-8402: [Java] Support ValidateFull methods in Java
add afa3eed ARROW-9729: [Java] Disable Error Prone when project is
imported into …
add 597ad62 ARROW-9617: [Rust] [DataFusion] Add length of string array
add 613ab4a ARROW-9742: [Rust] [DataFusion] Improved DataFrame trait
(formerly known as the Table trait)
add 2c58141 ARROW-9758: [Rust] [DataFusion] Allow physical planner to be
replaced
add a94f2b3 ARROW-9673: [Rust] [DataFusion] Add a param "dialect" for
DFParser::parse_sql
add 58b38a6 ARROW-9618: [Rust] [DataFusion] Made it easier to write
optimizers
add 2e3d7ec ARROW-9528: [Python] Honor tzinfo when converting from
datetime
add 9bd3d50 ARROW-9759: [Rust] [DataFusion] Implement DataFrame.sort()
add 51e574f ARROW-9764: [CI][Java] Fix wrong image name for push
add 4d836ef ARROW-9757: [Rust] [DataFusion] Add prelude.rs
add 7593c9a ARROW-9556: [Python][C++] Segfaults in UnionArray with null
values
add 1018a4f ARROW-9517: [C++/Python] Add support for temporary
credentials to S3Options
add 18181fe ARROW-9768 [Rust] [DataFusion] Rename PhysicalPlannerImpl to
DefaultPhysicalPlanner
add c4f8436 ARROW-9495: [C++] Equality assertions don't handle Inf / -Inf
properly
add 2f98d1e ARROW-9710: [C++] Improve performance of Decimal128::ToString
by 10x, and make the implementation reusable for Decimal256.
add 8a0db9e ARROW-9783: [Rust] [DataFusion] Remove aggregate expression
data type
new ddaac0a ARROW-8289: [Rust] Parquet Arrow writer with nested support
new 7afa648 ARROW-8423: [Rust] [Parquet] Serialize Arrow schema metadata
This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version. This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:
 * -- * -- B -- O -- O -- O   (2bfa2d3)
            \
             N -- N -- N   refs/heads/rust-parquet-arrow-writer (7afa648)
You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.
Any revisions marked "omit" are not gone; other references still
refer to them. Any revisions marked "discard" are gone forever.
The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
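
The labels above can be illustrated with a small sketch. This is a simplified
model under stated assumptions, not the notification tool's actual code: the
commit graph, the commit names (B, O1, O2, N1, N2, N3) and the list of other
references are hypothetical, and a revision counts as "already present" only if
some other reference reaches it.

```python
# Toy model of the omit/discard/add/new labels described above.
# The graph and reference tips are hypothetical, not taken from the Arrow repository.

def ancestors(tip, parents):
    """Return every commit reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, ()))
    return seen


def classify(old_tip, new_tip, other_tips, parents):
    """Label each revision affected by a forced update of one reference."""
    old = ancestors(old_tip, parents)
    new = ancestors(new_tip, parents)
    elsewhere = set()
    for tip in other_tips:
        elsewhere |= ancestors(tip, parents)

    labels = {}
    for commit in old - new:  # the "O" revisions in the diagram
        labels[commit] = "omit" if commit in elsewhere else "discard"
    for commit in new - old:  # the "N" revisions in the diagram
        labels[commit] = "add" if commit in elsewhere else "new"
    return labels


# B is the common base; O1-O2 sat on the old branch, N1-N3 sit on the new one.
parents = {
    "B": [],
    "O1": ["B"], "O2": ["O1"],
    "N1": ["B"], "N2": ["N1"], "N3": ["N2"],
}
# Assumed other references: one already contains N1, another keeps O1 alive.
other_tips = ["N1", "O1"]

labels = classify("O2", "N3", other_tips, parents)
for commit in sorted(labels):
    print(commit, labels[commit])
```

In this toy graph, O1 stays reachable through another reference and is marked
"omit", while O2 has no remaining reference and is marked "discard". N1 was
already reachable elsewhere and is marked "add"; N2 and N3 are "new".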
Summary of changes:
.github/workflows/java.yml | 2 +-
ci/conda_env_sphinx.yml | 4 +-
ci/docker/linux-apt-c-glib.dockerfile | 2 +-
ci/docker/linux-apt-docs.dockerfile | 5 +-
ci/scripts/integration_spark.sh | 3 +
cpp/cmake_modules/ThirdpartyToolchain.cmake | 27 +-
cpp/src/arrow/array/array_test.cc | 74 ++-
cpp/src/arrow/compare.cc | 7 +-
cpp/src/arrow/filesystem/s3fs.cc | 48 +-
cpp/src/arrow/filesystem/s3fs.h | 37 +-
cpp/src/arrow/filesystem/s3fs_test.cc | 52 +-
cpp/src/arrow/flight/flight_test.cc | 28 +-
cpp/src/arrow/flight/test_util.cc | 29 +
cpp/src/arrow/flight/test_util.h | 6 +
cpp/src/arrow/ipc/metadata_internal.cc | 3 +-
cpp/src/arrow/python/arrow_to_pandas.cc | 53 +-
cpp/src/arrow/python/arrow_to_pandas.h | 5 +-
cpp/src/arrow/python/datetime.cc | 172 ++++-
cpp/src/arrow/python/datetime.h | 26 +
cpp/src/arrow/python/inference.cc | 22 +-
cpp/src/arrow/python/python_to_arrow.cc | 151 +++--
cpp/src/arrow/python/python_to_arrow.h | 8 +-
cpp/src/arrow/type.cc | 12 +
cpp/src/arrow/type.h | 3 +
cpp/src/arrow/util/decimal.cc | 217 ++++---
cpp/src/arrow/util/decimal_benchmark.cc | 46 +-
cpp/src/arrow/util/decimal_test.cc | 141 ++++-
csharp/src/Apache.Arrow/Arrays/Date32Array.cs | 66 +-
csharp/src/Apache.Arrow/Arrays/Date64Array.cs | 77 ++-
csharp/src/Apache.Arrow/Arrays/DateArrayBuilder.cs | 209 ++++++
.../Apache.Arrow/Arrays/DelegatingArrayBuilder.cs | 102 +++
csharp/test/Apache.Arrow.Tests/ArrowArrayTests.cs | 4 +-
csharp/test/Apache.Arrow.Tests/Date32ArrayTests.cs | 115 +++-
csharp/test/Apache.Arrow.Tests/Date64ArrayTests.cs | 133 ++++
.../test/Apache.Arrow.Tests/TestDateAndTimeData.cs | 83 +++
dev/archery/archery/integration/datagen.py | 7 +-
dev/archery/archery/integration/runner.py | 5 +-
dev/release/verify-release-candidate.sh | 2 +-
.../apache/arrow/flight/TestBasicOperation.java | 74 ++-
.../java/org/apache/arrow/memory/ArrowBuf.java | 4 +-
.../arrow/memory/util/ByteFunctionHelpers.java | 34 +-
java/pom.xml | 51 +-
.../main/codegen/templates/DenseUnionVector.java | 2 +-
.../src/main/codegen/templates/UnionVector.java | 2 +-
.../org/apache/arrow/vector/DurationVector.java | 8 +
.../java/org/apache/arrow/vector/UInt1Vector.java | 14 +-
.../java/org/apache/arrow/vector/UInt2Vector.java | 6 +
.../java/org/apache/arrow/vector/UInt4Vector.java | 15 +-
.../java/org/apache/arrow/vector/UInt8Vector.java | 6 +
.../arrow/vector/util/ValueVectorUtility.java | 84 ++-
.../apache/arrow/vector/validate/ValidateUtil.java | 61 ++
.../validate/ValidateVectorBufferVisitor.java | 239 +++++++
.../vector/validate/ValidateVectorDataVisitor.java | 173 +++++
.../vector/validate/ValidateVectorTypeVisitor.java | 355 +++++++++++
.../apache/arrow/vector/TestDictionaryVector.java | 104 +++
.../apache/arrow/vector/TestLargeListVector.java | 50 +-
.../org/apache/arrow/vector/TestValueVector.java | 43 ++
.../vector/ipc/TestUIntDictionaryRoundTrip.java | 246 ++++++++
.../vector/testing/ValueVectorDataPopulator.java | 32 +
...eVectorVisitor.java => TestValidateVector.java} | 73 ++-
.../vector/validate/TestValidateVectorFull.java | 234 +++++++
.../validate/TestValidateVectorSchemaRoot.java | 101 +++
.../validate/TestValidateVectorTypeVisitor.java | 301 +++++++++
python/manylinux1/scripts/build_aws_sdk.sh | 2 +-
python/manylinux201x/scripts/build_aws_sdk.sh | 2 +-
python/pyarrow/_s3fs.pyx | 84 ++-
python/pyarrow/array.pxi | 7 +-
python/pyarrow/includes/libarrow.pxd | 5 +
python/pyarrow/includes/libarrow_fs.pxd | 17 +-
python/pyarrow/scalar.pxi | 2 +-
python/pyarrow/tests/test_array.py | 44 +-
python/pyarrow/tests/test_convert_builtin.py | 234 +++++--
python/pyarrow/tests/test_fs.py | 17 +-
python/pyarrow/tests/test_pandas.py | 60 +-
python/pyarrow/tests/test_types.py | 117 ++++
python/pyarrow/types.pxi | 40 +-
r/NEWS.md | 1 +
r/R/dataset.R | 1 +
rust/arrow/Cargo.toml | 4 +
.../{array_from_vec.rs => length_kernel.rs} | 33 +-
rust/arrow/src/array/array.rs | 2 +-
rust/arrow/src/compute/kernels/length.rs | 186 ++++++
rust/arrow/src/compute/kernels/mod.rs | 1 +
rust/benchmarks/README.md | 22 +-
rust/benchmarks/src/{main.rs => bin/nyctaxi.rs} | 25 +-
rust/benchmarks/src/bin/tpch.rs | 167 +++++
rust/datafusion/Cargo.toml | 1 +
rust/datafusion/README.md | 3 +-
rust/datafusion/benches/aggregate_query_sql.rs | 2 +-
rust/datafusion/examples/csv_sql.rs | 17 +-
.../examples/{parquet_sql.rs => dataframe.rs} | 25 +-
rust/datafusion/examples/flight_server.rs | 4 +-
rust/datafusion/examples/memory_table_api.rs | 11 +-
rust/datafusion/examples/parquet_sql.rs | 17 +-
rust/datafusion/src/bin/repl.rs | 15 +-
rust/datafusion/src/dataframe.rs | 177 ++++++
rust/datafusion/src/datasource/csv.rs | 13 +-
rust/datafusion/src/datasource/datasource.rs | 10 +-
rust/datafusion/src/datasource/memory.rs | 31 +-
rust/datafusion/src/datasource/mod.rs | 2 +-
rust/datafusion/src/datasource/parquet.rs | 39 +-
rust/datafusion/src/execution/context.rs | 702 +++++++++------------
.../execution/{table_impl.rs => dataframe_impl.rs} | 133 ++--
rust/datafusion/src/execution/mod.rs | 2 +-
.../src/execution/physical_plan/datasource.rs | 41 +-
.../src/execution/physical_plan/explain.rs | 99 +++
.../src/execution/physical_plan/hash_aggregate.rs | 2 +-
.../src/execution/physical_plan/limit.rs | 109 +++-
.../execution/physical_plan/math_expressions.rs | 46 +-
.../src/execution/physical_plan/memory.rs | 2 +-
.../src/execution/physical_plan/merge.rs | 107 +++-
rust/datafusion/src/execution/physical_plan/mod.rs | 36 +-
.../src/execution/physical_plan/parquet.rs | 62 +-
.../src/execution/physical_plan/planner.rs | 468 ++++++++++++++
.../datafusion/src/execution/physical_plan/sort.rs | 47 +-
rust/datafusion/src/lib.rs | 40 +-
rust/datafusion/src/logicalplan.rs | 336 +++++++---
rust/datafusion/src/optimizer/optimizer.rs | 2 +
.../src/optimizer/projection_push_down.rs | 98 +--
rust/datafusion/src/optimizer/type_coercion.rs | 235 +++----
rust/datafusion/src/optimizer/utils.rs | 248 +++++++-
.../util.h => rust/datafusion/src/prelude.rs | 33 +-
rust/datafusion/src/sql/parser.rs | 88 ++-
rust/datafusion/src/sql/planner.rs | 36 +-
rust/datafusion/src/table.rs | 79 ---
rust/datafusion/src/test/mod.rs | 1 -
rust/datafusion/tests/example.csv | 2 +
rust/datafusion/tests/sql.rs | 75 ++-
testing | 2 +-
129 files changed, 7133 insertions(+), 1736 deletions(-)
create mode 100644 csharp/src/Apache.Arrow/Arrays/DateArrayBuilder.cs
create mode 100644 csharp/src/Apache.Arrow/Arrays/DelegatingArrayBuilder.cs
create mode 100644 csharp/test/Apache.Arrow.Tests/Date64ArrayTests.cs
create mode 100644 csharp/test/Apache.Arrow.Tests/TestDateAndTimeData.cs
create mode 100644 java/vector/src/main/java/org/apache/arrow/vector/validate/ValidateUtil.java
create mode 100644 java/vector/src/main/java/org/apache/arrow/vector/validate/ValidateVectorBufferVisitor.java
create mode 100644 java/vector/src/main/java/org/apache/arrow/vector/validate/ValidateVectorDataVisitor.java
create mode 100644 java/vector/src/main/java/org/apache/arrow/vector/validate/ValidateVectorTypeVisitor.java
create mode 100644 java/vector/src/test/java/org/apache/arrow/vector/ipc/TestUIntDictionaryRoundTrip.java
rename java/vector/src/test/java/org/apache/arrow/vector/validate/{TestValidateVectorVisitor.java => TestValidateVector.java} (71%)
create mode 100644 java/vector/src/test/java/org/apache/arrow/vector/validate/TestValidateVectorFull.java
create mode 100644 java/vector/src/test/java/org/apache/arrow/vector/validate/TestValidateVectorSchemaRoot.java
create mode 100644 java/vector/src/test/java/org/apache/arrow/vector/validate/TestValidateVectorTypeVisitor.java
copy rust/arrow/benches/{array_from_vec.rs => length_kernel.rs} (58%)
create mode 100644 rust/arrow/src/compute/kernels/length.rs
rename rust/benchmarks/src/{main.rs => bin/nyctaxi.rs} (87%)
create mode 100644 rust/benchmarks/src/bin/tpch.rs
copy rust/datafusion/examples/{parquet_sql.rs => dataframe.rs} (66%)
create mode 100644 rust/datafusion/src/dataframe.rs
rename rust/datafusion/src/execution/{table_impl.rs => dataframe_impl.rs} (63%)
create mode 100644 rust/datafusion/src/execution/physical_plan/explain.rs
create mode 100644 rust/datafusion/src/execution/physical_plan/planner.rs
copy cpp/src/arrow/dbi/hiveserver2/util.h => rust/datafusion/src/prelude.rs (58%)
delete mode 100644 rust/datafusion/src/table.rs
create mode 100644 rust/datafusion/tests/example.csv