[GitHub] [arrow] kiszk commented on a change in pull request #7295: ARROW-8968: [C++][Gandiva] set data layout for pre-compiled IR to llvm::module
kiszk commented on a change in pull request #7295: URL: https://github.com/apache/arrow/pull/7295#discussion_r431566905 ## File path: cpp/src/gandiva/engine.cc ## @@ -150,6 +152,27 @@ Status Engine::Make(const std::shared_ptr& conf, return Status::OK(); } +static void setDataLayout(llvm::Module* module) { Review comment: Thank you, addressed all of them. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] maxburke commented on a change in pull request #7291: ARROW-8455: [Rust] Parquet Arrow column read on partially compatible files FIX
maxburke commented on a change in pull request #7291: URL: https://github.com/apache/arrow/pull/7291#discussion_r431553463 ## File path: rust/parquet/src/arrow/array_reader.rs ## @@ -609,23 +609,34 @@ where { let mut leaves = HashMap::<*const Type, usize>::new(); -let mut filtered_fields: Vec> = Vec::new(); +let mut filtered_root_names = HashSetnew(); for c in column_indices { let column = parquet_schema.column(c).self_type() as *const Type; leaves.insert(column, c); let root = parquet_schema.get_column_root_ptr(c); -filtered_fields.push(root); +filtered_root_names.insert(root.name().to_string()); } if leaves.is_empty() { return Err(general_err!("Can't build array reader without columns!")); } +// Only pass root fields that take part in the projection +// to avoid traversal of columns that are not read. +// TODO: also prune unread parts of the tree in child structures +let filtered_root_fields = parquet_schema +.root_schema() +.get_fields() +.into_iter() +.filter(|field| filtered_root_names.contains(field.name())) +.map(|field| field.clone()) Review comment: I think this could probably use `.cloned()` instead of `.map(|field| field.clone())` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] sunchao commented on pull request #7291: ARROW-8455: [Rust] Parquet Arrow column read on partially compatible files FIX
sunchao commented on pull request #7291: URL: https://github.com/apache/arrow/pull/7291#issuecomment-635063404 Merged. Thanks @rdettai for the fix and @maxburke for reporting this! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] sunchao closed pull request #7291: ARROW-8455: [Rust] Parquet Arrow column read on partially compatible files FIX
sunchao closed pull request #7291: URL: https://github.com/apache/arrow/pull/7291 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] maxburke commented on pull request #7291: ARROW-8455: [Rust] Parquet Arrow column read on partially compatible files FIX
maxburke commented on pull request #7291: URL: https://github.com/apache/arrow/pull/7291#issuecomment-635062239 @rdettai Your change works for me! Thank you so much for looking into this! :smile: This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] liyafan82 commented on pull request #6425: ARROW-6111: [Java] Support LargeVarChar and LargeBinary types
liyafan82 commented on pull request #6425: URL: https://github.com/apache/arrow/pull/6425#issuecomment-635050822 @BryanCutler @siddharthteotia @emkornfield I am aware that this PR is large, and consumes lots of effort. So thanks a lot for your effort and good comments! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] liyafan82 commented on a change in pull request #6425: ARROW-6111: [Java] Support LargeVarChar and LargeBinary types
liyafan82 commented on a change in pull request #6425: URL: https://github.com/apache/arrow/pull/6425#discussion_r431542563 ## File path: java/vector/src/main/java/org/apache/arrow/vector/validate/ValidateVectorVisitor.java ## @@ -90,6 +91,11 @@ public Void visit(BaseVariableWidthVector vector, Void value) { return null; } + @Override + public Void visit(BaseLargeVariableWidthVector left, Void value) { +return null; Review comment: Sure. I will fix this problem in ARROW-8402, and will open other JIRAs to track other TODOs. Thanks for your reminder. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] liyafan82 commented on a change in pull request #7287: ARROW-8771: [C++] Add boost/process library to build support
liyafan82 commented on a change in pull request #7287: URL: https://github.com/apache/arrow/pull/7287#discussion_r431540394 ## File path: cpp/build-support/trim-boost.sh ## @@ -41,6 +41,8 @@ BOOST_LIBS="$BOOST_LIBS regex.hpp" BOOST_LIBS="$BOOST_LIBS multiprecision/cpp_int.hpp" # These are for Thrift when Thrift_SOURCE=BUNDLED BOOST_LIBS="$BOOST_LIBS algorithm/string.hpp locale.hpp noncopyable.hpp numeric/conversion/cast.hpp scope_exit.hpp scoped_array.hpp shared_array.hpp tokenizer.hpp version.hpp" +#These are for flight +BOOST_LIBS="$BOOST_LIBS process.hpp process asio fusion" Review comment: Thanks for your comments. Some other header files are referenced (directly or indirectly) from process.hpp. I tried to refine sub-directories in process, asio and fusion, but gave up because too many individual individual header files were required. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] kiszk edited a comment on pull request #7295: ARROW-8968: [C++][Gandiva] set data layout for pre-compiled IR to llvm::module
kiszk edited a comment on pull request #7295: URL: https://github.com/apache/arrow/pull/7295#issuecomment-635036852 Regarding the licence, I will add a link to the file at https://github.com/llvm/llvm-project/blob/master/mlir/LICENSE.TXT. Could you please let me know if we need to create a separate file as https://github.com/apache/arrow/blob/master/cpp/src/arrow/util/bpacking_default.h This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] kiszk commented on pull request #7295: ARROW-8968: [C++][Gandiva] set data layout for pre-compiled IR to llvm::module
kiszk commented on pull request #7295: URL: https://github.com/apache/arrow/pull/7295#issuecomment-635036852 Regarding the licence, I will add a link to the file at https://github.com/llvm/llvm-project/blob/master/mlir/LICENSE.TXT. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #7280: ARROW-8158: [C++] Upgrade to LLVM 9
github-actions[bot] commented on pull request #7280: URL: https://github.com/apache/arrow/pull/7280#issuecomment-635028483 Revision: a47ab63fab16246cac07c2ed7bed3050744d2144 Submitted crossbow builds: [ursa-labs/crossbow @ actions-269](https://github.com/ursa-labs/crossbow/branches/all?query=actions-269) |Task|Status| ||--| |centos-6-amd64|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-269-github-centos-6-amd64)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-269-github-centos-6-amd64)| |centos-7-aarch64|[![TravisCI](https://img.shields.io/travis/ursa-labs/crossbow/actions-269-travis-centos-7-aarch64.svg)](https://travis-ci.org/ursa-labs/crossbow/branches)| |centos-7-amd64|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-269-github-centos-7-amd64)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-269-github-centos-7-amd64)| |centos-8-aarch64|[![TravisCI](https://img.shields.io/travis/ursa-labs/crossbow/actions-269-travis-centos-8-aarch64.svg)](https://travis-ci.org/ursa-labs/crossbow/branches)| |centos-8-amd64|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-269-github-centos-8-amd64)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-269-github-centos-8-amd64)| |conda-linux-gcc-py36|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-269-azure-conda-linux-gcc-py36)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-269-azure-conda-linux-gcc-py36)| |conda-linux-gcc-py37|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-269-azure-conda-linux-gcc-py37)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-269-azure-conda-linux-gcc-py37)| |conda-linux-gcc-py38|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-269-azure-conda-linux-gcc-py38)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-269-azure-conda-linux-gcc-py38)| |conda-osx-clang-py36|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-269-azure-conda-osx-clang-py36)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-269-azure-conda-osx-clang-py36)| |conda-osx-clang-py37|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-269-azure-conda-osx-clang-py37)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-269-azure-conda-osx-clang-py37)| |conda-osx-clang-py38|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-269-azure-conda-osx-clang-py38)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-269-azure-conda-osx-clang-py38)| |conda-win-vs2015-py36|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-269-azure-conda-win-vs2015-py36)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-269-azure-conda-win-vs2015-py36)| |conda-win-vs2015-py37|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-269-azure-conda-win-vs2015-py37)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-269-azure-conda-win-vs2015-py37)| |conda-win-vs2015-py38|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-269-azure-conda-win-vs2015-py38)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-269-azure-conda-win-vs2015-py38)| |debian-buster-amd64|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-269-github-debian-buster-amd64)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-269-github-debian-buster-amd64)| |debian-buster-arm64|[![TravisCI](https://img.shields.io/travis/ursa-labs/crossbow/actions-269-travis-debian-buster-arm64.svg)](https://travis-ci.org/ursa-labs/crossbow/branches)| |debian-stretch-amd64|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-269-github-debian-stretch-amd64)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-269-github-debian-stretch-amd64)| |debian-stretch-arm64|[![TravisCI](https://img.shields.io/travis/ursa-labs/crossbow/actions-269-travis-debian-stretch-arm64.svg)](https://travis-ci.org/ursa-labs/crossbow/branches)|
[GitHub] [arrow] kou commented on pull request #7280: ARROW-8158: [C++] Upgrade to LLVM 9
kou commented on pull request #7280: URL: https://github.com/apache/arrow/pull/7280#issuecomment-635025560 @github-actions crossbow submit -g nightly This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] kou commented on pull request #7294: ARROW-8971: [Python] Upgrade pip
kou commented on pull request #7294: URL: https://github.com/apache/arrow/pull/7294#issuecomment-635004181 @github-actions crossbow submit wheel-manylinux1-* This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] kou commented on pull request #7294: ARROW-8971: [Python] Upgrade pip
kou commented on pull request #7294: URL: https://github.com/apache/arrow/pull/7294#issuecomment-635002228 @github-actions crossbow submit -g wheel This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] kou commented on pull request #7294: ARROW-8971: [Python] Upgrade pip
kou commented on pull request #7294: URL: https://github.com/apache/arrow/pull/7294#issuecomment-634970128 @github-actions crossbow submit wheel-manylinux1-* This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #7294: ARROW-8971: [Python] Upgrade pip
github-actions[bot] commented on pull request #7294: URL: https://github.com/apache/arrow/pull/7294#issuecomment-634967465 https://issues.apache.org/jira/browse/ARROW-8971 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] BryanCutler commented on pull request #6425: ARROW-6111: [Java] Support LargeVarChar and LargeBinary types
BryanCutler commented on pull request #6425: URL: https://github.com/apache/arrow/pull/6425#issuecomment-634963132 merged to master, thanks @liyafan82 ! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] BryanCutler closed pull request #6425: ARROW-6111: [Java] Support LargeVarChar and LargeBinary types
BryanCutler closed pull request #6425: URL: https://github.com/apache/arrow/pull/6425 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] BryanCutler commented on a change in pull request #6425: ARROW-6111: [Java] Support LargeVarChar and LargeBinary types
BryanCutler commented on a change in pull request #6425: URL: https://github.com/apache/arrow/pull/6425#discussion_r431461407 ## File path: java/vector/src/main/java/org/apache/arrow/vector/validate/ValidateVectorVisitor.java ## @@ -90,6 +91,11 @@ public Void visit(BaseVariableWidthVector vector, Void value) { return null; } + @Override + public Void visit(BaseLargeVariableWidthVector left, Void value) { +return null; Review comment: is this another TODO? Would you mind making JIRAs for this and the other TODOs here so they can be tracked? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] sunchao commented on pull request #6935: ARROW-8455: [Rust] Parquet Arrow column read on partially compatible files
sunchao commented on pull request #6935: URL: https://github.com/apache/arrow/pull/6935#issuecomment-634961391 @rdettai looks like the fix is ready? I think we can just get the fix in (probably today) instead of reverting the other one. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] BinduAggarwal commented on a change in pull request #7294: ARROW-${TODO}: [Python] Upgrade pip
BinduAggarwal commented on a change in pull request #7294: URL: https://github.com/apache/arrow/pull/7294#discussion_r431459925 ## File path: python/manylinux1/scripts/requirements.txt ## @@ -23,12 +23,14 @@ # pip requirements for all cpythons # NOTE: pip has GPG signatures; could download and verify independently. -pip==19.0.3 \ - --hash=sha256:6e6f197a1abfb45118dbb878b5c859a0edbdd33fd250100bc015b67fded4b9f2 \ - --hash=sha256:bd812612bbd8ba84159d9ddc0266b7fbce712fc9bc98c82dee5750546ec8ec64 -wheel==0.31.1 \ - --hash=sha256:80044e51ec5bbf6c894ba0bc48d26a8c20a9ba629f4ca19ea26ecfcf87685f5f \ - --hash=sha256:0a2e54558a0628f2145d2fc822137e322412115173e8a2ddbe1c9024338ae83c -setuptools==40.7.3 \ - --hash=sha256:4f4acaf06d617dccfd3fbbc9fbd83cf4749759a1fa2bdf589206a3278e0d537a \ - --hash=sha256:702fdd31cb10a65a94beba1a7d89219a58d2587a349e0a1b7827b133e99ca430 +# pip requirements for all cpythons +# NOTE: pip has GPG signatures; could download and verify independently. Review comment: done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] kou commented on pull request #7294: ARROW-7293: [Python] Upgrade pip
kou commented on pull request #7294: URL: https://github.com/apache/arrow/pull/7294#issuecomment-634958008 Could you open a new issue on https://issues.apache.org/jira/projects/ARROW/issues/ and use the JIRA issue ID (not pull request ID) in pull request title? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] kou closed issue #7293: update Pip version
kou closed issue #7293: URL: https://github.com/apache/arrow/issues/7293 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] kou commented on a change in pull request #7294: ARROW-7293: [Python] upgrading pip
kou commented on a change in pull request #7294: URL: https://github.com/apache/arrow/pull/7294#discussion_r431457963 ## File path: python/manylinux1/scripts/requirements.txt ## @@ -23,12 +23,14 @@ # pip requirements for all cpythons # NOTE: pip has GPG signatures; could download and verify independently. -pip==19.0.3 \ - --hash=sha256:6e6f197a1abfb45118dbb878b5c859a0edbdd33fd250100bc015b67fded4b9f2 \ - --hash=sha256:bd812612bbd8ba84159d9ddc0266b7fbce712fc9bc98c82dee5750546ec8ec64 -wheel==0.31.1 \ - --hash=sha256:80044e51ec5bbf6c894ba0bc48d26a8c20a9ba629f4ca19ea26ecfcf87685f5f \ - --hash=sha256:0a2e54558a0628f2145d2fc822137e322412115173e8a2ddbe1c9024338ae83c -setuptools==40.7.3 \ - --hash=sha256:4f4acaf06d617dccfd3fbbc9fbd83cf4749759a1fa2bdf589206a3278e0d537a \ - --hash=sha256:702fdd31cb10a65a94beba1a7d89219a58d2587a349e0a1b7827b133e99ca430 +# pip requirements for all cpythons +# NOTE: pip has GPG signatures; could download and verify independently. Review comment: Could you remove this duplicated comment? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] wesm commented on a change in pull request #7295: ARROW-8968: [C++][Gandiva] set data layout for pre-compiled IR to llvm::module
wesm commented on a change in pull request #7295: URL: https://github.com/apache/arrow/pull/7295#discussion_r431454890 ## File path: cpp/src/gandiva/engine.cc ## @@ -150,6 +152,27 @@ Status Engine::Make(const std::shared_ptr& conf, return Status::OK(); } +static void setDataLayout(llvm::Module* module) { Review comment: nit: SetDataLayout ## File path: cpp/src/gandiva/engine.cc ## @@ -150,6 +152,27 @@ Status Engine::Make(const std::shared_ptr& conf, return Status::OK(); } +static void setDataLayout(llvm::Module* module) { + auto targetTriple = llvm::sys::getDefaultTargetTriple(); Review comment: nit: use lower_with_underscores for local variables ## File path: cpp/src/gandiva/engine.cc ## @@ -150,6 +152,27 @@ Status Engine::Make(const std::shared_ptr& conf, return Status::OK(); } +static void setDataLayout(llvm::Module* module) { + auto targetTriple = llvm::sys::getDefaultTargetTriple(); + std::string errorMessage; + auto target = llvm::TargetRegistry::lookupTarget(targetTriple, errorMessage); + if (!target) { +return; + } + + std::string cpu(llvm::sys::getHostCPUName()); + llvm::SubtargetFeatures features; + llvm::StringMap hostFeatures; + + if (llvm::sys::getHostCPUFeatures(hostFeatures)) +for (auto& f : hostFeatures) features.AddFeature(f.first(), f.second); Review comment: nit: please use braces This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #7296: ARROW-8793: [C++] Do not inline BitUtil::SetBitsTo
github-actions[bot] commented on pull request #7296: URL: https://github.com/apache/arrow/pull/7296#issuecomment-634952786 https://issues.apache.org/jira/browse/ARROW-8793 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] wesm opened a new pull request #7296: ARROW-8793: [C++] Do not inline BitUtil::SetBitsTo
wesm opened a new pull request #7296: URL: https://github.com/apache/arrow/pull/7296 Local benchmarks suggest that there is 3-4 ns of overhead for manipulating small bitmaps. inline (argument is byte size of bitmap): ``` Benchmark Time CPU Iterations SetBitsTo/2 3 ns 3 ns 271439203 733.294MB/s SetBitsTo/16 2 ns 2 ns 308813758 6.49485GB/s SetBitsTo/10249 ns 9 ns 79821710 109.078GB/s SetBitsTo/131072 2029 ns 2029 ns 325563 60.1566GB/s ``` non-inline: ``` Benchmark Time CPU Iterations SetBitsTo/2 6 ns 6 ns 129335891334.62MB/s SetBitsTo/16 6 ns 6 ns 122741527 2.53134GB/s SetBitsTo/1024 11 ns 11 ns 64547137 87.1395GB/s SetBitsTo/131072 2010 ns 2010 ns 332558 60.7215GB/s ``` If it can be demonstrated that inlining can meaningfully improve the macroperformance of some function, then we could lift the implementation into a `SetBitsToInline`, but until then I think it makes sense to keep it in bit_util.cc This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] kou commented on a change in pull request #7284: ARROW-7409: [C++][Python] Windows link error LNK1104: cannot open file 'python37_d.lib'
kou commented on a change in pull request #7284: URL: https://github.com/apache/arrow/pull/7284#discussion_r431438314 ## File path: cpp/cmake_modules/DefineOptions.cmake ## @@ -120,6 +120,8 @@ if("${CMAKE_SOURCE_DIR}" STREQUAL "${CMAKE_CURRENT_SOURCE_DIR}") define_option(ARROW_GGDB_DEBUG "Pass -ggdb flag to debug builds" ON) + define_option(ARROW_USE_DEPRECATED_FIND_PYTHONLIBS "Force using deprecated FindPythonLibsNew script" OFF) Review comment: Could you remove this? This isn't used for forcing using deprecated `FindPythonLibsNew`. ## File path: cpp/src/arrow/python/CMakeLists.txt ## @@ -60,7 +60,11 @@ endif() set(ARROW_PYTHON_SHARED_LINK_LIBS arrow_shared) if(WIN32) - list(APPEND ARROW_PYTHON_SHARED_LINK_LIBS ${PYTHON_LIBRARIES} ${PYTHON_OTHER_LIBS}) + if(ARROW_USE_DEPRECATED_FIND_PYTHONLIBS) +list(APPEND ARROW_PYTHON_SHARED_LINK_LIBS $<$:${PYTHON_DEBUG_LIBRARIES}> $<$:${PYTHON_LIBRARIES}>) + else() +list(APPEND ARROW_PYTHON_SHARED_LINK_LIBS ${PYTHON_OTHER_LIBS}) Review comment: `${PYTHON_OTHER_LIBS}` should be always used. ## File path: cpp/cmake_modules/FindPythonLibsNew.cmake ## @@ -246,7 +248,7 @@ FUNCTION(PYTHON_ADD_MODULE _NAME ) SET_TARGET_PROPERTIES(${_NAME} PROPERTIES LINK_FLAGS "-undefined dynamic_lookup") ELSEIF(MSVC) - target_link_libraries(${_NAME} ${PYTHON_LIBRARIES}) + target_link_libraries(${_NAME} ${PYTHON_LIBRARIES} $<$:${PYTHON_DEBUG_LIBRARIES}> $<$:${PYTHON_LIBRARIES}>) Review comment: `${PYTHON_LIBRARIES}` is duplicated in the `Release` configuration. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] wesm closed pull request #7277: ARROW-8957: [FlightRPC][C++] directly use IpcWriteOptions
wesm closed pull request #7277: URL: https://github.com/apache/arrow/pull/7277 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #7280: ARROW-8158: [C++] Upgrade to LLVM 9
github-actions[bot] commented on pull request #7280: URL: https://github.com/apache/arrow/pull/7280#issuecomment-634929177 Revision: dc97db6abe1291c8b3808b62c29808f04ca33c43 Submitted crossbow builds: [ursa-labs/crossbow @ actions-268](https://github.com/ursa-labs/crossbow/branches/all?query=actions-268) |Task|Status| ||--| |centos-6-amd64|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-268-github-centos-6-amd64)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-268-github-centos-6-amd64)| |centos-7-aarch64|[![TravisCI](https://img.shields.io/travis/ursa-labs/crossbow/actions-268-travis-centos-7-aarch64.svg)](https://travis-ci.org/ursa-labs/crossbow/branches)| |centos-7-amd64|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-268-github-centos-7-amd64)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-268-github-centos-7-amd64)| |centos-8-aarch64|[![TravisCI](https://img.shields.io/travis/ursa-labs/crossbow/actions-268-travis-centos-8-aarch64.svg)](https://travis-ci.org/ursa-labs/crossbow/branches)| |centos-8-amd64|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-268-github-centos-8-amd64)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-268-github-centos-8-amd64)| |conda-linux-gcc-py36|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-268-azure-conda-linux-gcc-py36)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-268-azure-conda-linux-gcc-py36)| |conda-linux-gcc-py37|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-268-azure-conda-linux-gcc-py37)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-268-azure-conda-linux-gcc-py37)| |conda-linux-gcc-py38|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-268-azure-conda-linux-gcc-py38)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-268-azure-conda-linux-gcc-py38)| |conda-osx-clang-py36|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-268-azure-conda-osx-clang-py36)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-268-azure-conda-osx-clang-py36)| |conda-osx-clang-py37|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-268-azure-conda-osx-clang-py37)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-268-azure-conda-osx-clang-py37)| |conda-osx-clang-py38|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-268-azure-conda-osx-clang-py38)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-268-azure-conda-osx-clang-py38)| |conda-win-vs2015-py36|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-268-azure-conda-win-vs2015-py36)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-268-azure-conda-win-vs2015-py36)| |conda-win-vs2015-py37|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-268-azure-conda-win-vs2015-py37)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-268-azure-conda-win-vs2015-py37)| |conda-win-vs2015-py38|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-268-azure-conda-win-vs2015-py38)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-268-azure-conda-win-vs2015-py38)| |debian-buster-amd64|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-268-github-debian-buster-amd64)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-268-github-debian-buster-amd64)| |debian-buster-arm64|[![TravisCI](https://img.shields.io/travis/ursa-labs/crossbow/actions-268-travis-debian-buster-arm64.svg)](https://travis-ci.org/ursa-labs/crossbow/branches)| |debian-stretch-amd64|[![Github Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-268-github-debian-stretch-amd64)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-268-github-debian-stretch-amd64)| |debian-stretch-arm64|[![TravisCI](https://img.shields.io/travis/ursa-labs/crossbow/actions-268-travis-debian-stretch-arm64.svg)](https://travis-ci.org/ursa-labs/crossbow/branches)|
[GitHub] [arrow] kou commented on pull request #7280: ARROW-8158: [C++] Upgrade to LLVM 9
kou commented on pull request #7280: URL: https://github.com/apache/arrow/pull/7280#issuecomment-634928303 @github-actions crossbow submit -g nightly This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] nealrichardson commented on a change in pull request #7279: ARROW-8938: [R] Provide binding for arrow::compute::CallFunction
nealrichardson commented on a change in pull request #7279: URL: https://github.com/apache/arrow/pull/7279#discussion_r431423981 ## File path: r/src/expression.cpp ## @@ -104,57 +104,9 @@ std::shared_ptr dataset___expr__is_valid( // [[arrow::export]] std::shared_ptr dataset___expr__scalar(SEXP x) { - switch (TYPEOF(x)) { -case NILSXP: - return ds::scalar(std::make_shared()); -case LGLSXP: - return ds::scalar(Rf_asLogical(x)); -case REALSXP: - if (Rf_inherits(x, "Date")) { -return ds::scalar(std::make_shared(REAL(x)[0])); - } else if (Rf_inherits(x, "POSIXct")) { -return ds::scalar(std::make_shared( -REAL(x)[0], arrow::timestamp(arrow::TimeUnit::SECOND))); - } else if (Rf_inherits(x, "integer64")) { -int64_t value = *reinterpret_cast(REAL(x)); -return ds::scalar(value); - } else if (Rf_inherits(x, "difftime")) { -int multiplier = 0; -// TODO: shared with TimeConverter<> in array_from_vector.cpp -std::string unit(CHAR(STRING_ELT(Rf_getAttrib(x, arrow::r::symbols::units), 0))); -if (unit == "secs") { - multiplier = 1; -} else if (unit == "mins") { - multiplier = 60; -} else if (unit == "hours") { - multiplier = 3600; -} else if (unit == "days") { - multiplier = 86400; -} else if (unit == "weeks") { - multiplier = 604800; -} else { - Rcpp::stop("unknown difftime unit"); -} -return ds::scalar(std::make_shared( -static_cast(REAL(x)[0] * multiplier), -arrow::time32(arrow::TimeUnit::SECOND))); - } - return ds::scalar(Rf_asReal(x)); -case INTSXP: - if (Rf_inherits(x, "factor")) { -// TODO: This does not use the actual value, just the levels -auto type = arrow::r::InferArrowTypeFromFactor(x); -return ds::scalar(std::make_shared(type)); - } - return ds::scalar(Rf_asInteger(x)); -case STRSXP: - return ds::scalar(CHAR(STRING_ELT(x, 0))); -default: - Rcpp::stop( - tfm::format("R object of type %s not supported", Rf_type2char(TYPEOF(x; - } - - return nullptr; + // defined in scalar.cpp + std::shared_ptr Scalar__create(SEXP x); Review comment: 路 if it works, I'm fine with it, I don't have opinions about proper C++ style--I just do what the linter tells me to do. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] bkietz commented on a change in pull request #7279: ARROW-8938: [R] Provide binding for arrow::compute::CallFunction
bkietz commented on a change in pull request #7279: URL: https://github.com/apache/arrow/pull/7279#discussion_r431417510 ## File path: r/src/compute.cpp ## @@ -206,60 +206,91 @@ std::shared_ptr Table__FilterChunked( return tab; } +template +std::shared_ptr MaybeUnbox(const char* class_name, SEXP x) { + if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, class_name)) { +Rcpp::ConstReferenceSmartPtrInputParameter> obj(x); +return static_cast>(obj); + } + return nullptr; +} + arrow::Datum to_datum(SEXP x) { - // TODO: this is repetitive, can we DRY it out? - if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "Array")) { -Rcpp::ConstReferenceSmartPtrInputParameter> obj(x); -return static_cast>(obj); - } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "ChunkedArray")) { - Rcpp::ConstReferenceSmartPtrInputParameter> obj(x); -return static_cast>(obj); - } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "RecordBatch")) { - Rcpp::ConstReferenceSmartPtrInputParameter> obj(x); -return static_cast>(obj); - } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "Table")) { -Rcpp::ConstReferenceSmartPtrInputParameter> obj(x); -return static_cast>(obj); - } else { -// TODO: scalar? -// This assumes that R objects have already been converted to Arrow objects; -// that seems right but should we do the wrapping here too/instead? -Rcpp::stop("to_datum: Not implemented"); + if (auto array = MaybeUnbox("Array", x)) { +return array; + } + + if (auto chunked_array = MaybeUnbox("ChunkedArray", x)) { +return chunked_array; + } + + if (auto batch = MaybeUnbox("RecordBatch", x)) { +return batch; + } + + if (auto table = MaybeUnbox("Table", x)) { +return table; + } + + if (auto scalar = MaybeUnbox("Scalar", x)) { +return scalar; } + + // This assumes that R objects have already been converted to Arrow objects; + // that seems right but should we do the wrapping here too/instead? + Rcpp::stop("to_datum: Not implemented for type %s", Rf_type2char(TYPEOF(x))); Review comment: Alright This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] nealrichardson commented on a change in pull request #7279: ARROW-8938: [R] Provide binding for arrow::compute::CallFunction
nealrichardson commented on a change in pull request #7279: URL: https://github.com/apache/arrow/pull/7279#discussion_r431404447 ## File path: r/R/scalar.R ## @@ -0,0 +1,53 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +#' @include arrowExports.R + +#' Arrow scalars +#' +#' @description +#' `Scalar`s are used to store a singular value of an arrow `DataType` +#' +#' @name Scalar +#' @rdname Scalar +#' @export +Scalar <- R6Class("Scalar", inherit = ArrowObject, + public = list( +ToString = function() Scalar__ToString(self), +as_vector = function() Scalar__as_vector(self) + ), + active = list( +is_valid = function() Scalar__is_valid(self), +type = function() DataType$create(Scalar__type(self)) + ) +) +Scalar$create <- function(x) { + # TODO: it would probably be best if an explicit type could be provided + if (!inherits(x, "externalptr")) { +x <- Scalar__create(x) + } + shared_ptr(Scalar, x) +} + +#' @export +length.Scalar <- function(x) 1 Review comment: ```suggestion length.Scalar <- function(x) 1L ``` `length` should be an explicit integer, for consistency ## File path: r/src/expression.cpp ## @@ -104,57 +104,9 @@ std::shared_ptr dataset___expr__is_valid( // [[arrow::export]] std::shared_ptr dataset___expr__scalar(SEXP x) { - switch (TYPEOF(x)) { -case NILSXP: - return ds::scalar(std::make_shared()); -case LGLSXP: - return ds::scalar(Rf_asLogical(x)); -case REALSXP: - if (Rf_inherits(x, "Date")) { -return ds::scalar(std::make_shared(REAL(x)[0])); - } else if (Rf_inherits(x, "POSIXct")) { -return ds::scalar(std::make_shared( -REAL(x)[0], arrow::timestamp(arrow::TimeUnit::SECOND))); - } else if (Rf_inherits(x, "integer64")) { -int64_t value = *reinterpret_cast(REAL(x)); -return ds::scalar(value); - } else if (Rf_inherits(x, "difftime")) { -int multiplier = 0; -// TODO: shared with TimeConverter<> in array_from_vector.cpp -std::string unit(CHAR(STRING_ELT(Rf_getAttrib(x, arrow::r::symbols::units), 0))); -if (unit == "secs") { - multiplier = 1; -} else if (unit == "mins") { - multiplier = 60; -} else if (unit == "hours") { - multiplier = 3600; -} else if (unit == "days") { - multiplier = 86400; -} else if (unit == "weeks") { - multiplier = 604800; -} else { - Rcpp::stop("unknown difftime unit"); -} -return ds::scalar(std::make_shared( -static_cast(REAL(x)[0] * multiplier), -arrow::time32(arrow::TimeUnit::SECOND))); - } - return ds::scalar(Rf_asReal(x)); -case INTSXP: - if (Rf_inherits(x, "factor")) { -// TODO: This does not use the actual value, just the levels -auto type = arrow::r::InferArrowTypeFromFactor(x); -return ds::scalar(std::make_shared(type)); - } - return ds::scalar(Rf_asInteger(x)); -case STRSXP: - return ds::scalar(CHAR(STRING_ELT(x, 0))); -default: - Rcpp::stop( - tfm::format("R object of type %s not supported", Rf_type2char(TYPEOF(x; - } - - return nullptr; + // defined in scalar.cpp + std::shared_ptr Scalar__create(SEXP x); Review comment: Does this need to be added to the header file that is included here? I seem to recall having to do that before. ## File path: r/R/scalar.R ## @@ -0,0 +1,53 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License
[GitHub] [arrow] github-actions[bot] commented on pull request #7295: ARROW-8968: [C++][gandiva] set data layout for pre-compiled IR to llvm::module
github-actions[bot] commented on pull request #7295: URL: https://github.com/apache/arrow/pull/7295#issuecomment-634893457 https://issues.apache.org/jira/browse/ARROW-8968 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] kiszk commented on a change in pull request #7274: ARROW-8960: [MINOR] [FORMAT] fix typo
kiszk commented on a change in pull request #7274: URL: https://github.com/apache/arrow/pull/7274#discussion_r431387864 ## File path: CHANGELOG.md ## @@ -866,7 +866,7 @@ * ARROW-5809 - [Rust] Dockerize (add to docker-compose) Rust Travis CI build * ARROW-5831 - [Release] Migrate and improve binary release verification script * ARROW-5855 - [Python] Add support for Duration type -* ARROW-5859 - [Python] Support ExtentionType on conversion to numpy/pandas +* ARROW-5859 - [Python] Support ExtensionType on conversion to numpy/pandas Review comment: According to https://github.com/apache/arrow/pull/7274#discussion_r430541033, it looks necessary to revert all of changes in `CHANGELOG.md`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] kiszk opened a new pull request #7295: ARROW-8968: [C++][gandiva] set data layout for pre-compiled IR to llvm::module
kiszk opened a new pull request #7295: URL: https://github.com/apache/arrow/pull/7295 This PR explicitly sets data layout for pre-compiled IR to `llvm:module`. Without the PR, the following warning message is shown at link time of gandiva. Most of this code comes from https://reviews.llvm.org/D80130 by @imaihal in llvm. ``` ~/arrow/cpp/src/gandiva$ ../../build/debug/gandiva-binary-test -V Running main() from /home/ishizaki/arrow/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest_main.cc [==] Running 1 test from 1 test case. [--] Global test environment set-up. [--] 1 test from TestBinary [ RUN ] TestBinary.TestSimple warning: Linking two modules of different data layouts: 'precompiled' is 'E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-a:8:16-n32:64' whereas 'codegen' is 'E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-v128:64-a:8:16-n32:64' [ OK ] TestBinary.TestSimple (41 ms) [--] 1 test from TestBinary (41 ms total) [--] Global test environment tear-down [==] 1 test from 1 test case ran. (41 ms total) [ PASSED ] 1 test. ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] itamarst commented on a change in pull request #7169: ARROW-5359: [Python] Support loading non-nanosecond out-of-range timestamps
itamarst commented on a change in pull request #7169: URL: https://github.com/apache/arrow/pull/7169#discussion_r431376285 ## File path: python/pyarrow/tests/test_pandas.py ## @@ -3965,8 +3964,32 @@ def test_timestamp_as_object(): np.datetime64('2050-05-03 15:42', 'ns'), ], }) +# Not part of what we're testing, just ensuring that the inputs are what we +# expect. +assert (df.dateTimeMs.dtype, df.dateTimeNs.dtype) == ( +# O == object,
[GitHub] [arrow] jorisvandenbossche commented on pull request #7169: ARROW-5359: [Python] Support loading non-nanosecond out-of-range timestamps
jorisvandenbossche commented on pull request #7169: URL: https://github.com/apache/arrow/pull/7169#issuecomment-634876862 @itamarst sorry for the slow follow-up. I think I am OK with the current semantics of the option This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] itamarst commented on pull request #7169: ARROW-5359: [Python] Support loading non-nanosecond out-of-range timestamps
itamarst commented on pull request #7169: URL: https://github.com/apache/arrow/pull/7169#issuecomment-634853818 @jorisvandenbossche are you happy with this as is, or do you want semantic changes to the flag? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #7294: upgrading pip/wheel/setuptools
github-actions[bot] commented on pull request #7294: URL: https://github.com/apache/arrow/pull/7294#issuecomment-634842663 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then could you also rename pull request title in the following format? ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY} See also: * [Other pull requests](https://github.com/apache/arrow/pulls/) * [Contribution Guidelines - How to contribute patches](https://arrow.apache.org/docs/developers/contributing.html#how-to-contribute-patches) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] BinduAggarwal opened a new pull request #7294: upgrading pip/wheel/setuptools
BinduAggarwal opened a new pull request #7294: URL: https://github.com/apache/arrow/pull/7294 This is for the fix of CVE issue, upgraded the pip version CVE-2018-20225 An issue was discovered in pip (all versions) because it installs the version with the highest version number, even if the user had intended to obtain a private package from a private index. This only affects use of the --extra-index-url option, and exploitation requires that the package does not already exist in the public index (and thus the attacker can put the package there with an arbitrary version number). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] BinduAggarwal commented on issue #7293: update Pip version
BinduAggarwal commented on issue #7293: URL: https://github.com/apache/arrow/issues/7293#issuecomment-634825586 sure I will do it This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] fsaintjacques closed pull request #7286: ARROW-8962: [C++] Add explicit implementation for junk values
fsaintjacques closed pull request #7286: URL: https://github.com/apache/arrow/pull/7286 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] nealrichardson commented on issue #7293: update Pip version
nealrichardson commented on issue #7293: URL: https://github.com/apache/arrow/issues/7293#issuecomment-634814027 @BinduAggarwal would you be interested in opening a JIRA issue and a pull request? Looks like the source of this requirements.txt file has been updated as you describe: https://github.com/pypa/manylinux/blob/manylinux1/docker/build_scripts/requirements.txt This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] BinduAggarwal opened a new issue #7293: update Pip version
BinduAggarwal opened a new issue #7293: URL: https://github.com/apache/arrow/issues/7293 Could you please update the pip latest version 20.1 https://github.com/apache/arrow/blob/2688a62f8179f20c20c06a10fcd22fe8a714ae48/python/manylinux1/scripts/requirements.txt CVE-2018-20225 An issue was discovered in pip (all versions) because it installs the version with the highest version number, even if the user had intended to obtain a private package from a private index. This only affects use of the --extra-index-url option, and exploitation requires that the package does not already exist in the public index (and thus the attacker can put the package there with an arbitrary version number). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] nealrichardson commented on a change in pull request #7287: ARROW-8771: [C++] Add boost/process library to build support
nealrichardson commented on a change in pull request #7287: URL: https://github.com/apache/arrow/pull/7287#discussion_r431282877 ## File path: cpp/build-support/trim-boost.sh ## @@ -41,6 +41,8 @@ BOOST_LIBS="$BOOST_LIBS regex.hpp" BOOST_LIBS="$BOOST_LIBS multiprecision/cpp_int.hpp" # These are for Thrift when Thrift_SOURCE=BUNDLED BOOST_LIBS="$BOOST_LIBS algorithm/string.hpp locale.hpp noncopyable.hpp numeric/conversion/cast.hpp scope_exit.hpp scoped_array.hpp shared_array.hpp tokenizer.hpp version.hpp" +#These are for flight +BOOST_LIBS="$BOOST_LIBS process.hpp process asio fusion" Review comment: I thought you only needed process.hpp? Why also `process asio fusion`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] nealrichardson commented on pull request #7287: ARROW-8771: [C++] Add boost/process library to build support
nealrichardson commented on pull request #7287: URL: https://github.com/apache/arrow/pull/7287#issuecomment-634786539 Yes, we will. Good test of the process we (hopefully) set up before. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] lidavidm commented on pull request #7277: ARROW-8957: [FlightRPC][C++] directly use IpcWriteOptions
lidavidm commented on pull request #7277: URL: https://github.com/apache/arrow/pull/7277#issuecomment-634783693 I've removed `std::move`. Sorry for the trouble. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] wesm commented on a change in pull request #7279: ARROW-8938: [R] Provide binding for arrow::compute::CallFunction
wesm commented on a change in pull request #7279: URL: https://github.com/apache/arrow/pull/7279#discussion_r431269117 ## File path: r/src/compute.cpp ## @@ -223,4 +205,66 @@ std::shared_ptr Table__FilterChunked( } return tab; } + +arrow::Datum to_datum(SEXP x) { Review comment: I also don't think it needs to be exposed ## File path: r/src/compute.cpp ## @@ -223,4 +205,66 @@ std::shared_ptr Table__FilterChunked( } return tab; } + +arrow::Datum to_datum(SEXP x) { + // TODO: this is repetitive, can we DRY it out? + if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "Array")) { +Rcpp::ConstReferenceSmartPtrInputParameter> obj(x); +return static_cast>(obj); + } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "ChunkedArray")) { + Rcpp::ConstReferenceSmartPtrInputParameter> obj(x); +return static_cast>(obj); + } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "RecordBatch")) { + Rcpp::ConstReferenceSmartPtrInputParameter> obj(x); +return static_cast>(obj); + } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "Table")) { +Rcpp::ConstReferenceSmartPtrInputParameter> obj(x); +return static_cast>(obj); + } else { +// TODO: scalar? Review comment: Can we create an R6 class wrapper for `arrow::Scalar`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] kiszk edited a comment on pull request #7277: ARROW-8957: [FlightRPC][C++] directly use IpcWriteOptions
kiszk edited a comment on pull request #7277: URL: https://github.com/apache/arrow/pull/7277#issuecomment-634737470 Thanks, we still have build failure in another place at https://travis-ci.org/github/apache/arrow/jobs/691713153#L1015 @lidavidm can we remove `std::move()` as a compiler suggested? In https://github.com/apache/arrow/blob/master/cpp/src/arrow/flight/server.cc#L118, `std::move()` is not called. ``` /arrow/cpp/src/arrow/flight/client.cc:391:50: required from here /arrow/cpp/src/arrow/flight/client.cc:411:28: error: moving a local object in a return statement prevents copy elision [-Werror=pessimizing-move] 411 | return std::move(result); |^ /arrow/cpp/src/arrow/flight/client.cc:411:28: note: remove 'std::move' call /arrow/cpp/src/arrow/flight/client.cc: In instantiation of 'arrow::Result > arrow::flight::GrpcIpcMessageReader::ReadNextMessage() [with Reader = grpc_impl::ClientReaderWriter]': /arrow/cpp/src/arrow/flight/client.cc:391:50: required from here /arrow/cpp/src/arrow/flight/client.cc:411:28: error: moving a local object in a return statement prevents copy elision [-Werror=pessimizing-move] /arrow/cpp/src/arrow/flight/client.cc:411:28: note: remove 'std::move' call cc1plus: all warnings being treated as errors ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] bkietz commented on a change in pull request #7279: ARROW-8938: [R] Provide binding for arrow::compute::CallFunction
bkietz commented on a change in pull request #7279: URL: https://github.com/apache/arrow/pull/7279#discussion_r431273570 ## File path: r/src/compute.cpp ## @@ -223,4 +205,66 @@ std::shared_ptr Table__FilterChunked( } return tab; } + +arrow::Datum to_datum(SEXP x) { Review comment: How's this: ```c++ template std::shared_ptr MaybeUnbox(const char* class_name, SEXP x) { if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, class_name)) { Rcpp::ConstReferenceSmartPtrInputParameter> obj(x); return static_cast>(obj); } return nullptr; } arrow::Datum to_datum(SEXP x) { if (auto array = MaybeUnbox("Array", x)) { return array; } if (auto chunked_array = MaybeUnbox("ChunkedArray", x)) { return chunked_array; } // ... ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] fsaintjacques commented on pull request #7287: ARROW-8771: [C++] Add boost/process library to build support
fsaintjacques commented on pull request #7287: URL: https://github.com/apache/arrow/pull/7287#issuecomment-634777086 @nealrichardson we might need a re-upload. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] bkietz commented on a change in pull request #7279: ARROW-8938: [R] Provide binding for arrow::compute::CallFunction
bkietz commented on a change in pull request #7279: URL: https://github.com/apache/arrow/pull/7279#discussion_r431266235 ## File path: r/src/compute.cpp ## @@ -223,4 +205,66 @@ std::shared_ptr Table__FilterChunked( } return tab; } + +arrow::Datum to_datum(SEXP x) { + // TODO: this is repetitive, can we DRY it out? + if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "Array")) { +Rcpp::ConstReferenceSmartPtrInputParameter> obj(x); +return static_cast>(obj); + } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "ChunkedArray")) { + Rcpp::ConstReferenceSmartPtrInputParameter> obj(x); +return static_cast>(obj); + } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "RecordBatch")) { + Rcpp::ConstReferenceSmartPtrInputParameter> obj(x); +return static_cast>(obj); + } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "Table")) { +Rcpp::ConstReferenceSmartPtrInputParameter> obj(x); +return static_cast>(obj); + } else { +// TODO: scalar? +// This assumes that R objects have already been converted to Arrow objects; +// that seems right but should we do the wrapping here too/instead? +Rcpp::stop("to_datum: Not implemented"); + } +} + +SEXP from_datum(arrow::Datum datum) { + if (datum.is_array()) { +return Rcpp::wrap(datum.make_array()); + } else if (datum.is_arraylike()) { +return Rcpp::wrap(datum.chunked_array()); + } else { +// TODO: the other datum types +Rcpp::stop("from_datum: Not implemented"); + } +} + +std::shared_ptr make_compute_options(std::string func_name, +List_ options) { + if (func_name == "filter") { +auto out = std::make_shared(arrow::compute::FilterOptions::Defaults()); +if (!Rf_isNull(options["keep_na"]) && options["keep_na"]) { + out->null_selection_behavior = arrow::compute::FilterOptions::EMIT_NULL; +} +return out; + } else if (func_name == "take") { +auto out = std::make_shared(arrow::compute::TakeOptions::Defaults()); +return out; + } else { +return nullptr; + } + // TODO: make sure the correct destructor is called? Review comment: ```suggestion ``` confirmed: `make_shared` sets the destructor to `deleter` (so we aren't relying on `~FunctionOptions` being virtual) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] bkietz commented on a change in pull request #7279: ARROW-8938: [R] Provide binding for arrow::compute::CallFunction
bkietz commented on a change in pull request #7279: URL: https://github.com/apache/arrow/pull/7279#discussion_r431263416 ## File path: r/src/compute.cpp ## @@ -223,4 +205,66 @@ std::shared_ptr Table__FilterChunked( } return tab; } + +arrow::Datum to_datum(SEXP x) { + // TODO: this is repetitive, can we DRY it out? + if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "Array")) { +Rcpp::ConstReferenceSmartPtrInputParameter> obj(x); +return static_cast>(obj); + } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "ChunkedArray")) { + Rcpp::ConstReferenceSmartPtrInputParameter> obj(x); +return static_cast>(obj); + } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "RecordBatch")) { + Rcpp::ConstReferenceSmartPtrInputParameter> obj(x); +return static_cast>(obj); + } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "Table")) { +Rcpp::ConstReferenceSmartPtrInputParameter> obj(x); +return static_cast>(obj); + } else { +// TODO: scalar? Review comment: I would recommend refactoring that function into `std::shared_ptr to_scalar(SEXP)`. That function need not be exported (same logic as `Datum`; `SEXP` already holds whatever value). The climax of `to_scalar` should be a call to `arrow::MakeScalar`. After this: ```c++ std::shared_ptr dataset___expr__scalar(SEXP x) { return ds::scalar(to_scalar(x)); } ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] nealrichardson commented on pull request #7279: ARROW-8938: [R] Provide binding for arrow::compute::CallFunction
nealrichardson commented on pull request #7279: URL: https://github.com/apache/arrow/pull/7279#issuecomment-634755810 @github-actions autotune This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #7292: ARROW-8471: [C++][Integration] Represent 64 bit integers as strings
github-actions[bot] commented on pull request #7292: URL: https://github.com/apache/arrow/pull/7292#issuecomment-634752661 https://issues.apache.org/jira/browse/ARROW-8471 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] bkietz opened a new pull request #7292: ARROW-8471: [C++][Integration] Represent 64 bit integers as strings
bkietz opened a new pull request #7292: URL: https://github.com/apache/arrow/pull/7292 Some 64 bit integer values are not representable as JSON numbers, so store these as strings. This is issue is a regression; original fix was #5267 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] rdettai commented on pull request #7291: ARROW-8455: [Rust] Parquet Arrow column read on partially compatible files FIX
rdettai commented on pull request #7291: URL: https://github.com/apache/arrow/pull/7291#issuecomment-634741622 @maxburke this should definitively fix your issue ! Sorry for the inconvenience. There are really very few safety nets on this codebase :-) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] kiszk commented on pull request #7277: ARROW-8957: [FlightRPC][C++] directly use IpcWriteOptions
kiszk commented on pull request #7277: URL: https://github.com/apache/arrow/pull/7277#issuecomment-634737470 Thanks, we still have build failure in another place at https://travis-ci.org/github/apache/arrow/jobs/691713153#L1015 ``` /arrow/cpp/src/arrow/flight/client.cc:391:50: required from here /arrow/cpp/src/arrow/flight/client.cc:411:28: error: moving a local object in a return statement prevents copy elision [-Werror=pessimizing-move] 411 | return std::move(result); |^ /arrow/cpp/src/arrow/flight/client.cc:411:28: note: remove 'std::move' call /arrow/cpp/src/arrow/flight/client.cc: In instantiation of 'arrow::Result > arrow::flight::GrpcIpcMessageReader::ReadNextMessage() [with Reader = grpc_impl::ClientReaderWriter]': /arrow/cpp/src/arrow/flight/client.cc:391:50: required from here /arrow/cpp/src/arrow/flight/client.cc:411:28: error: moving a local object in a return statement prevents copy elision [-Werror=pessimizing-move] /arrow/cpp/src/arrow/flight/client.cc:411:28: note: remove 'std::move' call cc1plus: all warnings being treated as errors ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #7291: ARROW-8455: [Rust] Parquet Arrow column read on partially compatible files FIX
github-actions[bot] commented on pull request #7291: URL: https://github.com/apache/arrow/pull/7291#issuecomment-634732266 https://issues.apache.org/jira/browse/ARROW-8455 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] rdettai opened a new pull request #7291: ARROW-8455: [Rust] Parquet Arrow column read on partially compatible files FIX
rdettai opened a new pull request #7291: URL: https://github.com/apache/arrow/pull/7291 This is a fix for the faulty https://github.com/apache/arrow/pull/6935 It addresses https://issues.apache.org/jira/browse/ARROW-8455 >Seen behavior: When reading a Parquet file into Arrow with get_record_reader_by_columns, it will fail if one of the column of the file is a list (or any other unsupported type). >Expected behavior: it should only fail if you are actually reading the column with unsuported type. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] wesm commented on pull request #7258: ARROW-8918: [C++][Python] Implement cast metafunction to allow use of "cast" with CallFunction, use in Python
wesm commented on pull request #7258: URL: https://github.com/apache/arrow/pull/7258#issuecomment-634725333 Could someone review this? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working
github-actions[bot] commented on pull request #7290: URL: https://github.com/apache/arrow/pull/7290#issuecomment-634652668 https://issues.apache.org/jira/browse/ARROW-1692 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] lidavidm commented on pull request #6656: ARROW-8297: [FlightRPC][C++] Implement Flight DoExchange for C++
lidavidm commented on pull request #6656: URL: https://github.com/apache/arrow/pull/6656#issuecomment-634650667 Sorry about that, for posterity this is being worked on in https://github.com/apache/arrow/pull/7277. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] rymurr opened a new pull request #7290: ARROW-1692: [Java] UnionArray round trip not working
rymurr opened a new pull request #7290: URL: https://github.com/apache/arrow/pull/7290 This fixes the integration tests for Union Arrays. Both Dense and Sparse unions now work: * validity buffer added to Sparse Union * both Union types now track logical types rather than MinorType enum ordinal Dependent on #7289 for duplicate field names. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] hn5092 commented on pull request #7288: ARROW-8963: [C++][Parquet] optimize LeafReader::NextBatch to save memory
hn5092 commented on pull request #7288: URL: https://github.com/apache/arrow/pull/7288#issuecomment-634647561 retest this please! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] lidavidm commented on a change in pull request #7277: ARROW-8957: [FlightRPC][C++] directly use IpcWriteOptions
lidavidm commented on a change in pull request #7277: URL: https://github.com/apache/arrow/pull/7277#discussion_r431066490 ## File path: cpp/src/arrow/flight/server.cc ## @@ -352,7 +352,7 @@ class DoExchangeMessageWriter : public FlightMessageWriter { } grpc::ServerReaderWriter* stream_; - ipc::IpcOptions ipc_options_; + ::arrow::ipc::IpcOptions ipc_options_; Review comment: Yes, my mistake, thank you both for pointing it out. I've updated this PR. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #7289: ARROW-8948: [Java][Integration] enable duplicate field names integration tests
github-actions[bot] commented on pull request #7289: URL: https://github.com/apache/arrow/pull/7289#issuecomment-634593683 https://issues.apache.org/jira/browse/ARROW-8948 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] rymurr opened a new pull request #7289: ARROW-8948: [Java][Integration] enable duplicate field names integration tests
rymurr opened a new pull request #7289: URL: https://github.com/apache/arrow/pull/7289 This addresses the integration test error where Java cannot process duplicate field names. This extends StructVector and VectorSchemaRoot to support duplicate field names. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] cyb70289 commented on pull request #7285: ARROW-8843: [C++] Compare bitmaps in words
cyb70289 commented on pull request #7285: URL: https://github.com/apache/arrow/pull/7285#issuecomment-634555761 Before (apply only benchmark code) ``` BitmapEqualsWithoutOffset/8192195 ns 195 ns 3618366 bytes_per_second=39.1055G/s BitmapEqualsWithOffset/8192 91272 ns91235 ns 7673 bytes_per_second=85.6307M/s ``` After ``` BitmapEqualsWithoutOffset/8192193 ns 193 ns 3635971 bytes_per_second=39.6261G/s BitmapEqualsWithOffset/8192 1332 ns 1331 ns 510343 bytes_per_second=5.73058G/s This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #7288: ARROW-8963: [C++][Parquet] optimize LeafReader::NextBatch to save memory
github-actions[bot] commented on pull request #7288: URL: https://github.com/apache/arrow/pull/7288#issuecomment-634551846 https://issues.apache.org/jira/browse/ARROW-8963 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #7287: ARROW-8771: [C++] Add boost/process library to build support
github-actions[bot] commented on pull request #7287: URL: https://github.com/apache/arrow/pull/7287#issuecomment-634544583 https://issues.apache.org/jira/browse/ARROW-8771 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] hn5092 opened a new pull request #7288: Optimize parquet cpp reader
hn5092 opened a new pull request #7288: URL: https://github.com/apache/arrow/pull/7288 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] liyafan82 opened a new pull request #7287: ARROW-8771: [C++] Add boost/process library to build support
liyafan82 opened a new pull request #7287: URL: https://github.com/apache/arrow/pull/7287 Some of our test source code requires the process.hpp file (and its dependent libraries). Our current build support does not include these files, causing build failures like: `fatal error: boost/process.hpp: No such file or directory` In this PR, we add required header files which are sufficient to make flight build successful. The added header files are verified in my local machine. However, it seems the script is supposed to run from the server side, so we cannot test it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] rdettai commented on pull request #6935: ARROW-8455: [Rust] Parquet Arrow column read on partially compatible files
rdettai commented on pull request #6935: URL: https://github.com/apache/arrow/pull/6935#issuecomment-634495368 @maxburke there is indeed an issue with struct fields. @sunchao I don't know how you usually operate: should we reverse the merge then re-apply it with the fix or create a new PR with the fix directly ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] kiszk commented on a change in pull request #7277: ARROW-8957: [FlightRPC][C++] directly use IpcWriteOptions
kiszk commented on a change in pull request #7277: URL: https://github.com/apache/arrow/pull/7277#discussion_r430857565 ## File path: cpp/src/arrow/flight/server.cc ## @@ -352,7 +352,7 @@ class DoExchangeMessageWriter : public FlightMessageWriter { } grpc::ServerReaderWriter* stream_; - ipc::IpcOptions ipc_options_; + ::arrow::ipc::IpcOptions ipc_options_; Review comment: @lidavidm How about `::arrow::ipc::IpcWriteOptions ipc_options_;`? It can succeed to compile arrow-flight with and without `-DARROW_NO_DEPRECATED_API=ON`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #7286: ARROW-8962: [C++] Add explicit implementation for junk values
github-actions[bot] commented on pull request #7286: URL: https://github.com/apache/arrow/pull/7286#issuecomment-634463719 https://issues.apache.org/jira/browse/ARROW-8962 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] xhochy opened a new pull request #7286: ARROW-8962: [C++] Add explicit implementation for junk values
xhochy opened a new pull request #7286: URL: https://github.com/apache/arrow/pull/7286 When linking the tests with `clang-4.0`, I get ``` Undefined symbols for architecture x86_64: "arrow::internal::(anonymous namespace)::StringToFloatConverterImpl::main_junk_value_", referenced from: arrow::internal::StringToFloat(char const*, unsigned long, float*) in libarrow.a(value_parsing.cc.o) arrow::internal::StringToFloat(char const*, unsigned long, double*) in libarrow.a(value_parsing.cc.o) "arrow::internal::(anonymous namespace)::StringToFloatConverterImpl::fallback_junk_value_", referenced from: arrow::internal::StringToFloat(char const*, unsigned long, float*) in libarrow.a(value_parsing.cc.o) arrow::internal::StringToFloat(char const*, unsigned long, double*) in libarrow.a(value_parsing.cc.o) ld: symbol(s) not found for architecture x86_64 ``` These older clang versions need an explicit implementation definition. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] liyafan82 commented on pull request #6425: ARROW-6111: [Java] Support LargeVarChar and LargeBinary types
liyafan82 commented on pull request #6425: URL: https://github.com/apache/arrow/pull/6425#issuecomment-634447500 > Great job getting integration tests passing @liyafan82 ! I took a quick look and LGTM, will do a more thorough pass hopefully tomorrow before merging. @BryanCutler Please take your time. Thanks a lot for your effort. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org