[arrow] branch master updated (6ee1ca4 -> 1519ee1)
This is an automated email from the ASF dual-hosted git repository.

npr pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git.

    from 6ee1ca4  ARROW-12003: [R] Fix NOTE re undefined global function group_by_drop_default
     add 1519ee1  ARROW-12005: [R] Fix a bash typo in configure

No new revisions were added by this update.

Summary of changes:
 r/configure | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[arrow] branch master updated (7e711c9 -> 6ee1ca4)
This is an automated email from the ASF dual-hosted git repository.

npr pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git.

    from 7e711c9  ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4
     add 6ee1ca4  ARROW-12003: [R] Fix NOTE re undefined global function group_by_drop_default

No new revisions were added by this update.

Summary of changes:
 r/R/dplyr.R | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
[arrow] branch master updated (946bfd9 -> 7e711c9)
This is an automated email from the ASF dual-hosted git repository.

emkornfield pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git.

    from 946bfd9  ARROW-11066: [FlightRPC][Java] Make zero-copy writes a configurable option
     add 7e711c9  ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

No new revisions were added by this update.

Summary of changes:
 dev/archery/archery/integration/runner.py | 3 +-
 java/{algorithm => compression}/pom.xml | 22 +--
 .../compression/CommonsCompressionFactory.java} | 23 ++-
 .../arrow/compression/Lz4CompressionCodec.java | 157
 .../arrow/compression/TestCompressionCodec.java | 209 +
 .../org/apache/arrow/memory/util/MemoryUtil.java | 8 +-
 java/pom.xml | 1 +
 java/tools/pom.xml | 5 +
 .../java/org/apache/arrow/tools/Integration.java | 3 +-
 .../java/org/apache/arrow/tools/StreamToFile.java | 3 +-
 .../java/org/apache/arrow/vector/VectorLoader.java | 39 +++-
 .../arrow/vector/compression/CompressionCodec.java | 23 ++-
 .../arrow/vector/compression/CompressionUtil.java | 81 ++--
 .../vector/compression/NoCompressionCodec.java | 21 ++-
 .../apache/arrow/vector/ipc/ArrowFileReader.java | 16 +-
 .../org/apache/arrow/vector/ipc/ArrowReader.java | 11 +-
 .../apache/arrow/vector/ipc/ArrowStreamReader.java | 42 -
 .../arrow/vector/ipc/message/ArrowRecordBatch.java | 4 +-
 18 files changed, 605 insertions(+), 66 deletions(-)
 copy java/{algorithm => compression}/pom.xml (82%)
 copy java/{flight/flight-core/src/main/java/org/apache/arrow/flight/auth2/ClientBearerHeaderHandler.java => compression/src/main/java/org/apache/arrow/compression/CommonsCompressionFactory.java} (55%)
 create mode 100644 java/compression/src/main/java/org/apache/arrow/compression/Lz4CompressionCodec.java
 create mode 100644 java/compression/src/test/java/org/apache/arrow/compression/TestCompressionCodec.java
[arrow] branch master updated (3decc46 -> 946bfd9)
This is an automated email from the ASF dual-hosted git repository.

emkornfield pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git.

    from 3decc46  ARROW-11997: [Python] concat_tables crashes python interpreter
     add 946bfd9  ARROW-11066: [FlightRPC][Java] Make zero-copy writes a configurable option

No new revisions were added by this update.

Summary of changes:
 .../java/org/apache/arrow/flight/ArrowMessage.java | 114 +
 .../arrow/flight/OutboundStreamListener.java | 16 +++
 .../arrow/flight/OutboundStreamListenerImpl.java | 8 +-
 .../arrow/flight/grpc/AddWritableBuffer.java | 18 +++-
 .../apache/arrow/flight/TestBasicOperation.java | 9 +-
 .../org/apache/arrow/flight/TestDoExchange.java | 57 +++
 .../arrow/flight/perf/PerformanceTestServer.java | 3 +-
 7 files changed, 199 insertions(+), 26 deletions(-)
[arrow] branch master updated (fcaa422 -> 3decc46)
This is an automated email from the ASF dual-hosted git repository.

apitrou pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git.

    from fcaa422  ARROW-11996: [R] Make r/configure run successfully on Solaris
     add 3decc46  ARROW-11997: [Python] concat_tables crashes python interpreter

No new revisions were added by this update.

Summary of changes:
 python/pyarrow/lib.pyx | 5 +++--
 python/pyarrow/table.pxi | 12 ++--
 python/pyarrow/tests/parquet/test_dataset.py | 1 +
 python/pyarrow/tests/test_table.py | 9 -
 4 files changed, 18 insertions(+), 9 deletions(-)
[arrow] branch master updated (5b14d53 -> fcaa422)
This is an automated email from the ASF dual-hosted git repository.

npr pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git.

    from 5b14d53  ARROW-9318: [C++] Parquet encryption key management
     add fcaa422  ARROW-11996: [R] Make r/configure run successfully on Solaris

No new revisions were added by this update.

Summary of changes:
 r/configure | 6 +++---
 r/tools/nixlibs.R | 8 +++-
 2 files changed, 10 insertions(+), 4 deletions(-)
[arrow] branch master updated (b710f21 -> 5b14d53)
This is an automated email from the ASF dual-hosted git repository.

apitrou pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git.

    from b710f21  ARROW-11976: [C++] Fix sporadic TSAN error with GatingTask
     add 5b14d53  ARROW-9318: [C++] Parquet encryption key management

No new revisions were added by this update.

Summary of changes:
 cpp/cmake_modules/ThirdpartyToolchain.cmake | 4 +
 cpp/src/arrow/CMakeLists.txt | 2 +
 cpp/src/arrow/json/chunked_builder.h | 9 +-
 cpp/src/arrow/json/chunker.cc | 7 +-
 cpp/src/arrow/json/converter.cc | 5 +-
 cpp/src/arrow/json/object_parser.cc | 83
 cpp/src/arrow/json/object_parser.h | 49 +++
 cpp/src/arrow/json/object_writer.cc | 82
 cpp/src/arrow/json/object_writer.h | 48 ++
 cpp/src/arrow/json/parser.cc | 13 +-
 cpp/src/arrow/json/reader.cc | 5 +-
 cpp/src/arrow/util/concurrent_map.h | 68 +++
 cpp/src/arrow/util/string.cc | 14 +
 cpp/src/arrow/util/string.h | 4 +
 cpp/src/arrow/util/string_test.cc | 38 ++
 cpp/src/parquet/CMakeLists.txt | 39 +-
 cpp/src/parquet/column_reader.cc | 4 +-
 cpp/src/parquet/column_writer.cc | 4 +-
 cpp/src/parquet/encryption/CMakeLists.txt | 19 +
 cpp/src/parquet/encryption/crypto_factory.cc | 175
 cpp/src/parquet/encryption/crypto_factory.h | 135 ++
 cpp/src/parquet/{ => encryption}/encryption.cc | 8 +-
 cpp/src/parquet/{ => encryption}/encryption.h | 6 +-
 .../{ => encryption}/encryption_internal.cc | 2 +-
 .../parquet/{ => encryption}/encryption_internal.h | 10 +-
 .../{ => encryption}/encryption_internal_nossl.cc | 2 +-
 .../parquet/encryption/file_key_material_store.h | 31 ++
 cpp/src/parquet/encryption/file_key_unwrapper.cc | 114 +
 cpp/src/parquet/encryption/file_key_unwrapper.h | 66 +++
 cpp/src/parquet/encryption/file_key_wrapper.cc | 109 +
 cpp/src/parquet/encryption/file_key_wrapper.h | 82
 .../{ => encryption}/internal_file_decryptor.cc | 6 +-
 .../{ => encryption}/internal_file_decryptor.h | 0
 .../{ => encryption}/internal_file_encryptor.cc | 6 +-
 .../{ => encryption}/internal_file_encryptor.h | 2 +-
 cpp/src/parquet/encryption/key_encryption_key.h | 61 +++
 cpp/src/parquet/encryption/key_management_test.cc | 225 ++
 cpp/src/parquet/encryption/key_material.cc | 159 +++
 cpp/src/parquet/encryption/key_material.h | 131 ++
 cpp/src/parquet/encryption/key_metadata.cc | 89
 cpp/src/parquet/encryption/key_metadata.h | 94
 cpp/src/parquet/encryption/key_metadata_test.cc | 77
 cpp/src/parquet/encryption/key_toolkit.cc | 52 +++
 cpp/src/parquet/encryption/key_toolkit.h | 76
 cpp/src/parquet/encryption/key_toolkit_internal.cc | 80
 cpp/src/parquet/encryption/key_toolkit_internal.h | 58 +++
 cpp/src/parquet/encryption/key_wrapping_test.cc | 103 +
 cpp/src/parquet/encryption/kms_client.cc | 44 ++
 cpp/src/parquet/encryption/kms_client.h | 95
 cpp/src/parquet/encryption/kms_client_factory.h | 40 ++
 .../parquet/encryption/local_wrap_kms_client.cc | 116 +
 cpp/src/parquet/encryption/local_wrap_kms_client.h | 96
 .../properties_test.cc} | 7 +-
 .../read_configurations_test.cc} | 294 +
 cpp/src/parquet/encryption/test_encryption_util.cc | 482 +
 cpp/src/parquet/encryption/test_encryption_util.h | 113 +
 cpp/src/parquet/encryption/test_in_memory_kms.cc | 81
 cpp/src/parquet/encryption/test_in_memory_kms.h | 89
 .../encryption/two_level_cache_with_expiration.h | 159 +++
 .../two_level_cache_with_expiration_test.cc | 177
 .../write_configurations_test.cc} | 175 +---
 cpp/src/parquet/file_reader.cc | 4 +-
 cpp/src/parquet/file_writer.cc | 4 +-
 cpp/src/parquet/metadata.cc | 4 +-
 cpp/src/parquet/properties.h | 2 +-
 cpp/src/parquet/test_encryption_util.h | 82
 cpp/src/parquet/thrift_internal.h | 4 +-
 67 files changed, 3933 insertions(+), 591 deletions(-)
 create mode 100644 cpp/src/arrow/json/object_parser.cc
 create mode 100644 cpp/src/arrow/json/object_parser.h
 create mode 100644 cpp/src/arrow/json/object_writer.cc
 create mode 100644 cpp/src/arrow/json/object_writer.h
 create mode 100644 cpp/src/arrow/util/concurrent_map.h
 create mode 100644 cpp/src/parquet/encryption/CMakeLists.txt
 create mode
[arrow] branch master updated (06cb1a6 -> b710f21)
This is an automated email from the ASF dual-hosted git repository.

apitrou pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git.

    from 06cb1a6  ARROW-10372: [Dataset][C++][Python][R] Support reading compressed CSV
     add b710f21  ARROW-11976: [C++] Fix sporadic TSAN error with GatingTask

No new revisions were added by this update.

Summary of changes:
 cpp/src/arrow/testing/gtest_util.cc | 61 +++--
 cpp/src/arrow/testing/gtest_util.h | 2 +-
 2 files changed, 33 insertions(+), 30 deletions(-)
[arrow] branch master updated (651aafc -> 06cb1a6)
This is an automated email from the ASF dual-hosted git repository.

bkietz pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git.

    from 651aafc  ARROW-11971: [Packaging] Vcpkg patch doesn't apply on windows due to line endings
     add 06cb1a6  ARROW-10372: [Dataset][C++][Python][R] Support reading compressed CSV

No new revisions were added by this update.

Summary of changes:
 cpp/src/arrow/dataset/file_base.cc | 28 +
 cpp/src/arrow/dataset/file_base.h | 6 +++
 cpp/src/arrow/dataset/file_csv.cc | 23 ++-
 cpp/src/arrow/dataset/file_csv.h | 1 +
 cpp/src/arrow/dataset/file_csv_test.cc | 74 +-
 python/pyarrow/tests/test_dataset.py | 27 +
 r/tests/testthat/test-dataset.R | 23 +++
 7 files changed, 163 insertions(+), 19 deletions(-)
[arrow-testing] branch master updated: ARROW-11838: files for testing IPC reads with shared dictionaries. (#59)
This is an automated email from the ASF dual-hosted git repository.

apitrou pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow-testing.git

The following commit(s) were added to refs/heads/master by this push:
     new 815fe4f  ARROW-11838: files for testing IPC reads with shared dictionaries. (#59)
815fe4f is described below

commit 815fe4fb8f17011aa52282a08d3f858b87f3b2ea
Author: jmgpeeters
AuthorDate: Wed Mar 17 12:30:40 2021 +

    ARROW-11838: files for testing IPC reads with shared dictionaries. (#59)

    * golden files for testing IPC reads with shared dictionaries.
    * use integration format instead.

    Co-authored-by: jpeeter
---
 .../4.0.0-shareddict/generated_shared_dict.arrow_file | Bin 0 -> 1050 bytes
 .../4.0.0-shareddict/generated_shared_dict.json.gz | Bin 0 -> 433 bytes
 .../4.0.0-shareddict/generated_shared_dict.stream | Bin 0 -> 712 bytes
 3 files changed, 0 insertions(+), 0 deletions(-)

diff --git a/data/arrow-ipc-stream/integration/4.0.0-shareddict/generated_shared_dict.arrow_file b/data/arrow-ipc-stream/integration/4.0.0-shareddict/generated_shared_dict.arrow_file
new file mode 100644
index 000..cc18358
Binary files /dev/null and b/data/arrow-ipc-stream/integration/4.0.0-shareddict/generated_shared_dict.arrow_file differ
diff --git a/data/arrow-ipc-stream/integration/4.0.0-shareddict/generated_shared_dict.json.gz b/data/arrow-ipc-stream/integration/4.0.0-shareddict/generated_shared_dict.json.gz
new file mode 100644
index 000..0a9e2c3
Binary files /dev/null and b/data/arrow-ipc-stream/integration/4.0.0-shareddict/generated_shared_dict.json.gz differ
diff --git a/data/arrow-ipc-stream/integration/4.0.0-shareddict/generated_shared_dict.stream b/data/arrow-ipc-stream/integration/4.0.0-shareddict/generated_shared_dict.stream
new file mode 100644
index 000..837f890
Binary files /dev/null and b/data/arrow-ipc-stream/integration/4.0.0-shareddict/generated_shared_dict.stream differ
[arrow] branch master updated: ARROW-11971: [Packaging] Vcpkg patch doesn't apply on windows due to line endings
This is an automated email from the ASF dual-hosted git repository. kszucs pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/arrow.git The following commit(s) were added to refs/heads/master by this push: new 651aafc ARROW-11971: [Packaging] Vcpkg patch doesn't apply on windows due to line endings 651aafc is described below commit 651aafc029262da6a3ae5b1640fd8e0fe301916e Author: Krisztián Szűcs AuthorDate: Wed Mar 17 12:14:54 2021 +0100 ARROW-11971: [Packaging] Vcpkg patch doesn't apply on windows due to line endings Closes #9713 from kszucs/vcpkg-patch Authored-by: Krisztián Szűcs Signed-off-by: Krisztián Szűcs --- .env | 2 +- ci/docker/python-wheel-manylinux-201x.dockerfile | 2 +- ci/docker/python-wheel-windows-vs2017.dockerfile | 2 +- ci/vcpkg/ports.patch | 12 ++-- 4 files changed, 9 insertions(+), 9 deletions(-) diff --git a/.env b/.env index 535a765..3c14623 100644 --- a/.env +++ b/.env @@ -70,4 +70,4 @@ R_TAG=latest DEVTOOLSET_VERSION=-1 # Used for the manylinux and windows wheels -VCPKG=c7e96f2a5b73b3278b004aa88abec2f8ebfb43b5 +VCPKG=fced4bef1606260f110d74de1ae1975c2b9ac549 diff --git a/ci/docker/python-wheel-manylinux-201x.dockerfile b/ci/docker/python-wheel-manylinux-201x.dockerfile index d9b3826..19246a4 100644 --- a/ci/docker/python-wheel-manylinux-201x.dockerfile +++ b/ci/docker/python-wheel-manylinux-201x.dockerfile @@ -59,7 +59,7 @@ RUN git clone https://github.com/microsoft/vcpkg /opt/vcpkg && \ # Patch ports files as needed COPY ci/vcpkg arrow/ci/vcpkg -RUN cd /opt/vcpkg && patch -p1 -i /arrow/ci/vcpkg/ports.patch +RUN cd /opt/vcpkg && git apply --ignore-whitespace /arrow/ci/vcpkg/ports.patch ARG build_type=release ENV CMAKE_BUILD_TYPE=${build_type} \ diff --git a/ci/docker/python-wheel-windows-vs2017.dockerfile b/ci/docker/python-wheel-windows-vs2017.dockerfile index c0b85d4..0f66a20 100644 --- a/ci/docker/python-wheel-windows-vs2017.dockerfile +++ b/ci/docker/python-wheel-windows-vs2017.dockerfile @@ -35,7 +35,7 @@ 
RUN git clone https://github.com/Microsoft/vcpkg && \ # Patch ports files as needed COPY ci/vcpkg arrow/ci/vcpkg -RUN cd vcpkg && patch -p1 -i C:/arrow/ci/vcpkg/ports.patch +RUN cd vcpkg && git apply --ignore-whitespace C:/arrow/ci/vcpkg/ports.patch # Configure vcpkg and install dependencies # NOTE: use windows batch environment notation for build arguments in RUN diff --git a/ci/vcpkg/ports.patch b/ci/vcpkg/ports.patch index b1f4f09..14b9678 100644 --- a/ci/vcpkg/ports.patch +++ b/ci/vcpkg/ports.patch @@ -3,15 +3,15 @@ index f3704ef05..3af543058 100644 --- a/ports/aws-c-common/portfile.cmake +++ b/ports/aws-c-common/portfile.cmake @@ -1,8 +1,8 @@ - vcpkg_from_github( - OUT_SOURCE_PATH SOURCE_PATH - REPO awslabs/aws-c-common + vcpkg_from_github( + OUT_SOURCE_PATH SOURCE_PATH + REPO awslabs/aws-c-common -REF 4a21a1c0757083a16497fea27886f5f20ccdf334 # v0.4.56 --SHA512 68898a8ac15d5490f45676eabfbe0df9e45370a74c543a28909fd0d85fed48dfcf4bcd6ea2d01d1a036dd352e2e4e0b08c48c63ab2a2b477fe150b46a827136e +-SHA512 68898a8ac15d5490f45676eabfbe0df9e45370a74c543a28909fd0d85fed48dfcf4bcd6ea2d01d1a036dd352e2e4e0b08c48c63ab2a2b477fe150b46a827136e +REF 13adef72b7813ec878817c6d50a7a3f241015d8a # v0.4.57 +SHA512 28256522ac6af544d7464e3e7dcd4dc802ae2b09728bf8f167f86a6487bb756d0cad5eb4a2480610b2967b9c24c4a7f70621894517aa2828ffdeb0479453803b - HEAD_REF master - PATCHES + HEAD_REF master + PATCHES disable-error-4068.patch # This patch fixes dependency port compilation failure diff --git a/ports/curl/portfile.cmake b/ports/curl/portfile.cmake index 6e18aecd0..2ccecf33c 100644
[arrow] branch master updated: ARROW-11907: [C++] Use our own executor in S3FileSystem
This is an automated email from the ASF dual-hosted git repository. apitrou pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/arrow.git The following commit(s) were added to refs/heads/master by this push: new 27d21d4 ARROW-11907: [C++] Use our own executor in S3FileSystem 27d21d4 is described below commit 27d21d4d3b18280805ed7118c089d930fd018f11 Author: Antoine Pitrou AuthorDate: Wed Mar 17 11:26:24 2021 +0100 ARROW-11907: [C++] Use our own executor in S3FileSystem The async APIs in the AWS SDK merely spawn a new thread each time they are called. By using our own executor, we schedule requests on our IO thread pool, and we allow for potential cancellation. Closes #9678 from pitrou/ARROW-11907-s3fs-executor Authored-by: Antoine Pitrou Signed-off-by: Antoine Pitrou --- cpp/src/arrow/filesystem/s3fs.cc | 211 +++--- cpp/src/arrow/filesystem/s3fs_test.cc | 5 +- cpp/src/arrow/util/future.cc | 4 + cpp/src/arrow/util/future.h | 4 + cpp/src/arrow/util/future_test.cc | 16 ++- 5 files changed, 145 insertions(+), 95 deletions(-) diff --git a/cpp/src/arrow/filesystem/s3fs.cc b/cpp/src/arrow/filesystem/s3fs.cc index 6b2b708..1940f4d 100644 --- a/cpp/src/arrow/filesystem/s3fs.cc +++ b/cpp/src/arrow/filesystem/s3fs.cc @@ -79,6 +79,7 @@ #include "arrow/util/future.h" #include "arrow/util/logging.h" #include "arrow/util/optional.h" +#include "arrow/util/thread_pool.h" #include "arrow/util/windows_fixup.h" namespace arrow { @@ -488,7 +489,7 @@ class ClientBuilder { Aws::Client::ClientConfiguration* mutable_config() { return _config_; } - Result> BuildClient() { + Result> BuildClient() { credentials_provider_ = options_.credentials_provider; if (!options_.region.empty()) { client_config_.region = ToAwsString(options_.region); @@ -510,10 +511,10 @@ class ClientBuilder { } const bool use_virtual_addressing = options_.endpoint_override.empty(); -return std::unique_ptr( -new S3Client(credentials_provider_, client_config_, - 
Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy::Never, - use_virtual_addressing)); +return std::make_shared( +credentials_provider_, client_config_, +Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy::Never, +use_virtual_addressing); } const S3Options& options() const { return options_; } @@ -585,7 +586,7 @@ class RegionResolver { } ClientBuilder builder_; - std::unique_ptr client_; + std::shared_ptr client_; std::mutex cache_mutex_; // XXX Should cache size be bounded? It must be quite unusual to query millions @@ -630,11 +631,10 @@ Result GetObjectRange(Aws::S3::S3Client* client, // A RandomAccessFile that reads from a S3 object class ObjectInputFile final : public io::RandomAccessFile { public: - ObjectInputFile(std::shared_ptr fs, Aws::S3::S3Client* client, + ObjectInputFile(std::shared_ptr client, const io::IOContext& io_context, const S3Path& path, int64_t size = kNoSize) - : fs_(std::move(fs)), -client_(client), + : client_(std::move(client)), io_context_(io_context), path_(path), content_length_(size) {} @@ -687,7 +687,6 @@ class ObjectInputFile final : public io::RandomAccessFile { // RandomAccessFile APIs Status Close() override { -fs_.reset(); client_ = nullptr; closed_ = true; return Status::OK(); @@ -724,7 +723,7 @@ class ObjectInputFile final : public io::RandomAccessFile { // Read the desired range of bytes ARROW_ASSIGN_OR_RAISE(S3Model::GetObjectResult result, - GetObjectRange(client_, path_, position, nbytes, out)); + GetObjectRange(client_.get(), path_, position, nbytes, out)); auto& stream = result.GetBody(); stream.ignore(nbytes); @@ -763,8 +762,7 @@ class ObjectInputFile final : public io::RandomAccessFile { } protected: - std::shared_ptr fs_; // Owner of S3Client - Aws::S3::S3Client* client_; + std::shared_ptr client_; const io::IOContext io_context_; S3Path path_; @@ -785,11 +783,10 @@ class ObjectOutputStream final : public io::OutputStream { struct UploadState; public: - ObjectOutputStream(std::shared_ptr fs, Aws::S3::S3Client* client, 
+ ObjectOutputStream(std::shared_ptr client, const io::IOContext& io_context, const S3Path& path, const S3Options& options) - : fs_(std::move(fs)), -client_(client), + : client_(std::move(client)), io_context_(io_context), path_(path), options_(options) {} @@ -837,7 +834,6 @@ class ObjectOutputStream final : public io::OutputStream { outcome.GetError()); } current_part_.reset(); -fs_.reset();
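The commit message for ARROW-11907 above contrasts two ways of running asynchronous requests: spawning a new thread per call, versus scheduling the work on a shared pool of IO threads. The following is only a minimal sketch of the second approach using the C++ standard library. `SimpleExecutor`, `Submit`, and `SumOfSquaresOnPool` are hypothetical names made up for this illustration; they are not Arrow's thread pool API and not the code from the commit, and cancellation is not shown.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a shared IO thread pool (an assumption for this
// sketch, NOT Arrow's ThreadPool). A fixed set of worker threads drains a
// queue of jobs, so submitting a request never spawns a new thread.
class SimpleExecutor {
 public:
  explicit SimpleExecutor(std::size_t num_threads) {
    for (std::size_t i = 0; i < num_threads; ++i) {
      workers_.emplace_back([this] { WorkerLoop(); });
    }
  }

  ~SimpleExecutor() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    cv_.notify_all();
    for (auto& worker : workers_) worker.join();
  }

  // Queue a callable and return a std::future for its result.
  template <typename Fn>
  auto Submit(Fn fn) -> std::future<decltype(fn())> {
    using R = decltype(fn());
    // packaged_task is move-only, so hold it in a shared_ptr to make the
    // queued std::function copyable.
    auto task = std::make_shared<std::packaged_task<R()>>(std::move(fn));
    std::future<R> result = task->get_future();
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push([task] { (*task)(); });
    }
    cv_.notify_one();
    return result;
  }

 private:
  void WorkerLoop() {
    for (;;) {
      std::function<void()> job;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
        if (queue_.empty()) return;  // stopping and nothing left to run
        job = std::move(queue_.front());
        queue_.pop();
      }
      job();  // run outside the lock so other workers can dequeue
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> queue_;
  bool stopping_ = false;
  std::vector<std::thread> workers_;
};

// Illustrative helper: runs n small "requests" on a two-thread pool and
// combines their results.
int SumOfSquaresOnPool(int n) {
  SimpleExecutor pool(2);
  std::vector<std::future<int>> pending;
  for (int i = 1; i <= n; ++i) {
    pending.push_back(pool.Submit([i] { return i * i; }));
  }
  int total = 0;
  for (auto& f : pending) total += f.get();
  return total;
}
```

With this shape, the number of threads stays bounded by the pool size regardless of how many requests are in flight, which is the property the commit message attributes to using a dedicated executor.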
[arrow] branch master updated (c171b27 -> 4a1985d)
This is an automated email from the ASF dual-hosted git repository.

alamb pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git.

    from c171b27  ARROW-11659: [R] Preserve group_by .drop argument
     add 4a1985d  ARROW-11955: [Rust][DataFusion] Support Union

No new revisions were added by this update.

Summary of changes:
 rust/datafusion/src/dataframe.rs | 14 ++
 rust/datafusion/src/execution/dataframe_impl.rs | 7 +
 rust/datafusion/src/logical_plan/builder.rs | 63 -
 rust/datafusion/src/logical_plan/plan.rs | 26
 rust/datafusion/src/optimizer/constant_folding.rs | 1 +
 .../src/optimizer/hash_build_probe_order.rs | 5 +
 rust/datafusion/src/optimizer/limit_push_down.rs | 104 +++
 .../src/optimizer/projection_push_down.rs | 1 +
 rust/datafusion/src/optimizer/utils.rs | 5 +
 rust/datafusion/src/physical_plan/mod.rs | 1 +
 rust/datafusion/src/physical_plan/planner.rs | 14 +-
 rust/datafusion/src/physical_plan/union.rs | 143 +
 rust/datafusion/src/sql/planner.rs | 130 +--
 rust/datafusion/tests/sql.rs | 21 +++
 14 files changed, 491 insertions(+), 44 deletions(-)
 create mode 100644 rust/datafusion/src/physical_plan/union.rs