[GitHub] [arrow] kiszk commented on a change in pull request #7295: ARROW-8968: [C++][Gandiva] set data layout for pre-compiled IR to llvm::module

2020-05-27 Thread GitBox


kiszk commented on a change in pull request #7295:
URL: https://github.com/apache/arrow/pull/7295#discussion_r431566905



##
File path: cpp/src/gandiva/engine.cc
##
@@ -150,6 +152,27 @@ Status Engine::Make(const std::shared_ptr& 
conf,
   return Status::OK();
 }
 
+static void setDataLayout(llvm::Module* module) {

Review comment:
   Thank you, addressed all of them.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] maxburke commented on a change in pull request #7291: ARROW-8455: [Rust] Parquet Arrow column read on partially compatible files FIX

2020-05-27 Thread GitBox


maxburke commented on a change in pull request #7291:
URL: https://github.com/apache/arrow/pull/7291#discussion_r431553463



##
File path: rust/parquet/src/arrow/array_reader.rs
##
@@ -609,23 +609,34 @@ where
 {
 let mut leaves = HashMap::<*const Type, usize>::new();
 
-let mut filtered_fields: Vec> = Vec::new();
+let mut filtered_root_names = HashSetnew();
 
 for c in column_indices {
 let column = parquet_schema.column(c).self_type() as *const Type;
 leaves.insert(column, c);
 
 let root = parquet_schema.get_column_root_ptr(c);
-filtered_fields.push(root);
+filtered_root_names.insert(root.name().to_string());
 }
 
 if leaves.is_empty() {
 return Err(general_err!("Can't build array reader without columns!"));
 }
 
+// Only pass root fields that take part in the projection
+// to avoid traversal of columns that are not read.
+// TODO: also prune unread parts of the tree in child structures
+let filtered_root_fields = parquet_schema
+.root_schema()
+.get_fields()
+.into_iter()
+.filter(|field| filtered_root_names.contains(field.name()))
+.map(|field| field.clone())

Review comment:
   I think this could probably use `.cloned()` instead of `.map(|field| 
field.clone())`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] sunchao commented on pull request #7291: ARROW-8455: [Rust] Parquet Arrow column read on partially compatible files FIX

2020-05-27 Thread GitBox


sunchao commented on pull request #7291:
URL: https://github.com/apache/arrow/pull/7291#issuecomment-635063404


   Merged. Thanks @rdettai for the fix and @maxburke for reporting this!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] sunchao closed pull request #7291: ARROW-8455: [Rust] Parquet Arrow column read on partially compatible files FIX

2020-05-27 Thread GitBox


sunchao closed pull request #7291:
URL: https://github.com/apache/arrow/pull/7291


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] maxburke commented on pull request #7291: ARROW-8455: [Rust] Parquet Arrow column read on partially compatible files FIX

2020-05-27 Thread GitBox


maxburke commented on pull request #7291:
URL: https://github.com/apache/arrow/pull/7291#issuecomment-635062239


   @rdettai Your change works for me! Thank you so much for looking into this! 
:smile: 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] liyafan82 commented on pull request #6425: ARROW-6111: [Java] Support LargeVarChar and LargeBinary types

2020-05-27 Thread GitBox


liyafan82 commented on pull request #6425:
URL: https://github.com/apache/arrow/pull/6425#issuecomment-635050822


   @BryanCutler @siddharthteotia @emkornfield I am aware that this PR is large, 
and consumes lots of effort. So thanks a lot for your effort and good comments!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] liyafan82 commented on a change in pull request #6425: ARROW-6111: [Java] Support LargeVarChar and LargeBinary types

2020-05-27 Thread GitBox


liyafan82 commented on a change in pull request #6425:
URL: https://github.com/apache/arrow/pull/6425#discussion_r431542563



##
File path: 
java/vector/src/main/java/org/apache/arrow/vector/validate/ValidateVectorVisitor.java
##
@@ -90,6 +91,11 @@ public Void visit(BaseVariableWidthVector vector, Void 
value) {
 return null;
   }
 
+  @Override
+  public Void visit(BaseLargeVariableWidthVector left, Void value) {
+return null;

Review comment:
   Sure. I will fix this problem in ARROW-8402, and will open other JIRAs 
to track other TODOs. Thanks for your reminder. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] liyafan82 commented on a change in pull request #7287: ARROW-8771: [C++] Add boost/process library to build support

2020-05-27 Thread GitBox


liyafan82 commented on a change in pull request #7287:
URL: https://github.com/apache/arrow/pull/7287#discussion_r431540394



##
File path: cpp/build-support/trim-boost.sh
##
@@ -41,6 +41,8 @@ BOOST_LIBS="$BOOST_LIBS regex.hpp"
 BOOST_LIBS="$BOOST_LIBS multiprecision/cpp_int.hpp"
 # These are for Thrift when Thrift_SOURCE=BUNDLED
 BOOST_LIBS="$BOOST_LIBS algorithm/string.hpp locale.hpp noncopyable.hpp 
numeric/conversion/cast.hpp scope_exit.hpp scoped_array.hpp shared_array.hpp 
tokenizer.hpp version.hpp"
+#These are for flight
+BOOST_LIBS="$BOOST_LIBS process.hpp process asio fusion"

Review comment:
   Thanks for your comments. 
   
   Some other header files are referenced (directly or indirectly) from 
process.hpp. 
   I tried to refine sub-directories in process, asio and fusion, but gave up 
because too many individual individual header files were required. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] kiszk edited a comment on pull request #7295: ARROW-8968: [C++][Gandiva] set data layout for pre-compiled IR to llvm::module

2020-05-27 Thread GitBox


kiszk edited a comment on pull request #7295:
URL: https://github.com/apache/arrow/pull/7295#issuecomment-635036852


   Regarding the licence, I will add a link to the file at 
https://github.com/llvm/llvm-project/blob/master/mlir/LICENSE.TXT.
   Could you please let me know if we need to create a separate file as 
https://github.com/apache/arrow/blob/master/cpp/src/arrow/util/bpacking_default.h



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] kiszk commented on pull request #7295: ARROW-8968: [C++][Gandiva] set data layout for pre-compiled IR to llvm::module

2020-05-27 Thread GitBox


kiszk commented on pull request #7295:
URL: https://github.com/apache/arrow/pull/7295#issuecomment-635036852


   Regarding the licence, I will add a link to the file at 
https://github.com/llvm/llvm-project/blob/master/mlir/LICENSE.TXT.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] github-actions[bot] commented on pull request #7280: ARROW-8158: [C++] Upgrade to LLVM 9

2020-05-27 Thread GitBox


github-actions[bot] commented on pull request #7280:
URL: https://github.com/apache/arrow/pull/7280#issuecomment-635028483


   Revision: a47ab63fab16246cac07c2ed7bed3050744d2144
   
   Submitted crossbow builds: [ursa-labs/crossbow @ 
actions-269](https://github.com/ursa-labs/crossbow/branches/all?query=actions-269)
   
   |Task|Status|
   ||--|
   |centos-6-amd64|[![Github 
Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-269-github-centos-6-amd64)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-269-github-centos-6-amd64)|
   
|centos-7-aarch64|[![TravisCI](https://img.shields.io/travis/ursa-labs/crossbow/actions-269-travis-centos-7-aarch64.svg)](https://travis-ci.org/ursa-labs/crossbow/branches)|
   |centos-7-amd64|[![Github 
Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-269-github-centos-7-amd64)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-269-github-centos-7-amd64)|
   
|centos-8-aarch64|[![TravisCI](https://img.shields.io/travis/ursa-labs/crossbow/actions-269-travis-centos-8-aarch64.svg)](https://travis-ci.org/ursa-labs/crossbow/branches)|
   |centos-8-amd64|[![Github 
Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-269-github-centos-8-amd64)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-269-github-centos-8-amd64)|
   
|conda-linux-gcc-py36|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-269-azure-conda-linux-gcc-py36)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-269-azure-conda-linux-gcc-py36)|
   
|conda-linux-gcc-py37|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-269-azure-conda-linux-gcc-py37)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-269-azure-conda-linux-gcc-py37)|
   
|conda-linux-gcc-py38|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-269-azure-conda-linux-gcc-py38)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-269-azure-conda-linux-gcc-py38)|
   
|conda-osx-clang-py36|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-269-azure-conda-osx-clang-py36)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-269-azure-conda-osx-clang-py36)|
   
|conda-osx-clang-py37|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-269-azure-conda-osx-clang-py37)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-269-azure-conda-osx-clang-py37)|
   
|conda-osx-clang-py38|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-269-azure-conda-osx-clang-py38)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-269-azure-conda-osx-clang-py38)|
   
|conda-win-vs2015-py36|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-269-azure-conda-win-vs2015-py36)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-269-azure-conda-win-vs2015-py36)|
   
|conda-win-vs2015-py37|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-269-azure-conda-win-vs2015-py37)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-269-azure-conda-win-vs2015-py37)|
   
|conda-win-vs2015-py38|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-269-azure-conda-win-vs2015-py38)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-269-azure-conda-win-vs2015-py38)|
   |debian-buster-amd64|[![Github 
Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-269-github-debian-buster-amd64)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-269-github-debian-buster-amd64)|
   
|debian-buster-arm64|[![TravisCI](https://img.shields.io/travis/ursa-labs/crossbow/actions-269-travis-debian-buster-arm64.svg)](https://travis-ci.org/ursa-labs/crossbow/branches)|
   |debian-stretch-amd64|[![Github 
Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-269-github-debian-stretch-amd64)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-269-github-debian-stretch-amd64)|
   
|debian-stretch-arm64|[![TravisCI](https://img.shields.io/travis/ursa-labs/crossbow/actions-269-travis-debian-stretch-arm64.svg)](https://travis-ci.org/ursa-labs/crossbow/branches)|
   

[GitHub] [arrow] kou commented on pull request #7280: ARROW-8158: [C++] Upgrade to LLVM 9

2020-05-27 Thread GitBox


kou commented on pull request #7280:
URL: https://github.com/apache/arrow/pull/7280#issuecomment-635025560


   @github-actions crossbow submit -g nightly



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] kou commented on pull request #7294: ARROW-8971: [Python] Upgrade pip

2020-05-27 Thread GitBox


kou commented on pull request #7294:
URL: https://github.com/apache/arrow/pull/7294#issuecomment-635004181


   @github-actions crossbow submit wheel-manylinux1-*



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] kou commented on pull request #7294: ARROW-8971: [Python] Upgrade pip

2020-05-27 Thread GitBox


kou commented on pull request #7294:
URL: https://github.com/apache/arrow/pull/7294#issuecomment-635002228


   @github-actions crossbow submit -g wheel



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] kou commented on pull request #7294: ARROW-8971: [Python] Upgrade pip

2020-05-27 Thread GitBox


kou commented on pull request #7294:
URL: https://github.com/apache/arrow/pull/7294#issuecomment-634970128


   @github-actions crossbow submit wheel-manylinux1-*



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] github-actions[bot] commented on pull request #7294: ARROW-8971: [Python] Upgrade pip

2020-05-27 Thread GitBox


github-actions[bot] commented on pull request #7294:
URL: https://github.com/apache/arrow/pull/7294#issuecomment-634967465


   https://issues.apache.org/jira/browse/ARROW-8971



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] BryanCutler commented on pull request #6425: ARROW-6111: [Java] Support LargeVarChar and LargeBinary types

2020-05-27 Thread GitBox


BryanCutler commented on pull request #6425:
URL: https://github.com/apache/arrow/pull/6425#issuecomment-634963132


   merged to master, thanks @liyafan82 !



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] BryanCutler closed pull request #6425: ARROW-6111: [Java] Support LargeVarChar and LargeBinary types

2020-05-27 Thread GitBox


BryanCutler closed pull request #6425:
URL: https://github.com/apache/arrow/pull/6425


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] BryanCutler commented on a change in pull request #6425: ARROW-6111: [Java] Support LargeVarChar and LargeBinary types

2020-05-27 Thread GitBox


BryanCutler commented on a change in pull request #6425:
URL: https://github.com/apache/arrow/pull/6425#discussion_r431461407



##
File path: 
java/vector/src/main/java/org/apache/arrow/vector/validate/ValidateVectorVisitor.java
##
@@ -90,6 +91,11 @@ public Void visit(BaseVariableWidthVector vector, Void 
value) {
 return null;
   }
 
+  @Override
+  public Void visit(BaseLargeVariableWidthVector left, Void value) {
+return null;

Review comment:
   is this another TODO? Would you mind making JIRAs for this and the other 
TODOs here so they can be tracked?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] sunchao commented on pull request #6935: ARROW-8455: [Rust] Parquet Arrow column read on partially compatible files

2020-05-27 Thread GitBox


sunchao commented on pull request #6935:
URL: https://github.com/apache/arrow/pull/6935#issuecomment-634961391


   @rdettai looks like the fix is ready? I think we can just get the fix in 
(probably today) instead of reverting the other one.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] BinduAggarwal commented on a change in pull request #7294: ARROW-${TODO}: [Python] Upgrade pip

2020-05-27 Thread GitBox


BinduAggarwal commented on a change in pull request #7294:
URL: https://github.com/apache/arrow/pull/7294#discussion_r431459925



##
File path: python/manylinux1/scripts/requirements.txt
##
@@ -23,12 +23,14 @@
 
 # pip requirements for all cpythons
 # NOTE: pip has GPG signatures; could download and verify independently.
-pip==19.0.3 \
-
--hash=sha256:6e6f197a1abfb45118dbb878b5c859a0edbdd33fd250100bc015b67fded4b9f2 \
-
--hash=sha256:bd812612bbd8ba84159d9ddc0266b7fbce712fc9bc98c82dee5750546ec8ec64
-wheel==0.31.1 \
-
--hash=sha256:80044e51ec5bbf6c894ba0bc48d26a8c20a9ba629f4ca19ea26ecfcf87685f5f \
-
--hash=sha256:0a2e54558a0628f2145d2fc822137e322412115173e8a2ddbe1c9024338ae83c
-setuptools==40.7.3 \
-
--hash=sha256:4f4acaf06d617dccfd3fbbc9fbd83cf4749759a1fa2bdf589206a3278e0d537a \
-
--hash=sha256:702fdd31cb10a65a94beba1a7d89219a58d2587a349e0a1b7827b133e99ca430
+# pip requirements for all cpythons
+# NOTE: pip has GPG signatures; could download and verify independently.

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] kou commented on pull request #7294: ARROW-7293: [Python] Upgrade pip

2020-05-27 Thread GitBox


kou commented on pull request #7294:
URL: https://github.com/apache/arrow/pull/7294#issuecomment-634958008


   Could you open a new issue on 
https://issues.apache.org/jira/projects/ARROW/issues/ and use the JIRA issue ID 
(not pull request ID) in pull request title?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] kou closed issue #7293: update Pip version

2020-05-27 Thread GitBox


kou closed issue #7293:
URL: https://github.com/apache/arrow/issues/7293


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] kou commented on a change in pull request #7294: ARROW-7293: [Python] upgrading pip

2020-05-27 Thread GitBox


kou commented on a change in pull request #7294:
URL: https://github.com/apache/arrow/pull/7294#discussion_r431457963



##
File path: python/manylinux1/scripts/requirements.txt
##
@@ -23,12 +23,14 @@
 
 # pip requirements for all cpythons
 # NOTE: pip has GPG signatures; could download and verify independently.
-pip==19.0.3 \
-
--hash=sha256:6e6f197a1abfb45118dbb878b5c859a0edbdd33fd250100bc015b67fded4b9f2 \
-
--hash=sha256:bd812612bbd8ba84159d9ddc0266b7fbce712fc9bc98c82dee5750546ec8ec64
-wheel==0.31.1 \
-
--hash=sha256:80044e51ec5bbf6c894ba0bc48d26a8c20a9ba629f4ca19ea26ecfcf87685f5f \
-
--hash=sha256:0a2e54558a0628f2145d2fc822137e322412115173e8a2ddbe1c9024338ae83c
-setuptools==40.7.3 \
-
--hash=sha256:4f4acaf06d617dccfd3fbbc9fbd83cf4749759a1fa2bdf589206a3278e0d537a \
-
--hash=sha256:702fdd31cb10a65a94beba1a7d89219a58d2587a349e0a1b7827b133e99ca430
+# pip requirements for all cpythons
+# NOTE: pip has GPG signatures; could download and verify independently.

Review comment:
   Could you remove this duplicated comment?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] wesm commented on a change in pull request #7295: ARROW-8968: [C++][Gandiva] set data layout for pre-compiled IR to llvm::module

2020-05-27 Thread GitBox


wesm commented on a change in pull request #7295:
URL: https://github.com/apache/arrow/pull/7295#discussion_r431454890



##
File path: cpp/src/gandiva/engine.cc
##
@@ -150,6 +152,27 @@ Status Engine::Make(const std::shared_ptr& 
conf,
   return Status::OK();
 }
 
+static void setDataLayout(llvm::Module* module) {

Review comment:
   nit: SetDataLayout

##
File path: cpp/src/gandiva/engine.cc
##
@@ -150,6 +152,27 @@ Status Engine::Make(const std::shared_ptr& 
conf,
   return Status::OK();
 }
 
+static void setDataLayout(llvm::Module* module) {
+  auto targetTriple = llvm::sys::getDefaultTargetTriple();

Review comment:
   nit: use lower_with_underscores for local variables

##
File path: cpp/src/gandiva/engine.cc
##
@@ -150,6 +152,27 @@ Status Engine::Make(const std::shared_ptr& 
conf,
   return Status::OK();
 }
 
+static void setDataLayout(llvm::Module* module) {
+  auto targetTriple = llvm::sys::getDefaultTargetTriple();
+  std::string errorMessage;
+  auto target = llvm::TargetRegistry::lookupTarget(targetTriple, errorMessage);
+  if (!target) {
+return;
+  }
+
+  std::string cpu(llvm::sys::getHostCPUName());
+  llvm::SubtargetFeatures features;
+  llvm::StringMap hostFeatures;
+
+  if (llvm::sys::getHostCPUFeatures(hostFeatures))
+for (auto& f : hostFeatures) features.AddFeature(f.first(), f.second);

Review comment:
   nit: please use braces





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] github-actions[bot] commented on pull request #7296: ARROW-8793: [C++] Do not inline BitUtil::SetBitsTo

2020-05-27 Thread GitBox


github-actions[bot] commented on pull request #7296:
URL: https://github.com/apache/arrow/pull/7296#issuecomment-634952786


   https://issues.apache.org/jira/browse/ARROW-8793



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] wesm opened a new pull request #7296: ARROW-8793: [C++] Do not inline BitUtil::SetBitsTo

2020-05-27 Thread GitBox


wesm opened a new pull request #7296:
URL: https://github.com/apache/arrow/pull/7296


   Local benchmarks suggest that there is 3-4 ns of overhead for manipulating 
small bitmaps. 
   
   inline (argument is byte size of bitmap):
   
   ```
   
   Benchmark Time   CPU Iterations
   
   SetBitsTo/2   3 ns  3 ns  271439203   733.294MB/s
   SetBitsTo/16  2 ns  2 ns  308813758   6.49485GB/s
   SetBitsTo/10249 ns  9 ns   79821710   109.078GB/s
   SetBitsTo/131072   2029 ns   2029 ns 325563   60.1566GB/s
   ```
   
   non-inline:
   
   ```
   
   Benchmark Time   CPU Iterations
   
   SetBitsTo/2   6 ns  6 ns  129335891334.62MB/s
   SetBitsTo/16  6 ns  6 ns  122741527   2.53134GB/s
   SetBitsTo/1024   11 ns 11 ns   64547137   87.1395GB/s
   SetBitsTo/131072   2010 ns   2010 ns 332558   60.7215GB/s
   ```
   
   If it can be demonstrated that inlining can meaningfully improve the 
macroperformance of some function, then we could lift the implementation into a 
`SetBitsToInline`, but until then I think it makes sense to keep it in 
bit_util.cc



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] kou commented on a change in pull request #7284: ARROW-7409: [C++][Python] Windows link error LNK1104: cannot open file 'python37_d.lib'

2020-05-27 Thread GitBox


kou commented on a change in pull request #7284:
URL: https://github.com/apache/arrow/pull/7284#discussion_r431438314



##
File path: cpp/cmake_modules/DefineOptions.cmake
##
@@ -120,6 +120,8 @@ if("${CMAKE_SOURCE_DIR}" STREQUAL 
"${CMAKE_CURRENT_SOURCE_DIR}")
 
   define_option(ARROW_GGDB_DEBUG "Pass -ggdb flag to debug builds" ON)
 
+  define_option(ARROW_USE_DEPRECATED_FIND_PYTHONLIBS "Force using deprecated 
FindPythonLibsNew script" OFF)

Review comment:
   Could you remove this?
   This isn't used for forcing using deprecated `FindPythonLibsNew`.

##
File path: cpp/src/arrow/python/CMakeLists.txt
##
@@ -60,7 +60,11 @@ endif()
 
 set(ARROW_PYTHON_SHARED_LINK_LIBS arrow_shared)
 if(WIN32)
-  list(APPEND ARROW_PYTHON_SHARED_LINK_LIBS ${PYTHON_LIBRARIES} 
${PYTHON_OTHER_LIBS})
+  if(ARROW_USE_DEPRECATED_FIND_PYTHONLIBS)
+list(APPEND ARROW_PYTHON_SHARED_LINK_LIBS 
$<$:${PYTHON_DEBUG_LIBRARIES}> 
$<$:${PYTHON_LIBRARIES}>)
+  else()
+list(APPEND ARROW_PYTHON_SHARED_LINK_LIBS ${PYTHON_OTHER_LIBS})

Review comment:
   `${PYTHON_OTHER_LIBS}` should be always used.

##
File path: cpp/cmake_modules/FindPythonLibsNew.cmake
##
@@ -246,7 +248,7 @@ FUNCTION(PYTHON_ADD_MODULE _NAME )
   SET_TARGET_PROPERTIES(${_NAME} PROPERTIES LINK_FLAGS
   "-undefined dynamic_lookup")
 ELSEIF(MSVC)
-  target_link_libraries(${_NAME} ${PYTHON_LIBRARIES})
+  target_link_libraries(${_NAME} ${PYTHON_LIBRARIES} 
$<$:${PYTHON_DEBUG_LIBRARIES}> 
$<$:${PYTHON_LIBRARIES}>)

Review comment:
   `${PYTHON_LIBRARIES}` is duplicated in the `Release` configuration.
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] wesm closed pull request #7277: ARROW-8957: [FlightRPC][C++] directly use IpcWriteOptions

2020-05-27 Thread GitBox


wesm closed pull request #7277:
URL: https://github.com/apache/arrow/pull/7277


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] github-actions[bot] commented on pull request #7280: ARROW-8158: [C++] Upgrade to LLVM 9

2020-05-27 Thread GitBox


github-actions[bot] commented on pull request #7280:
URL: https://github.com/apache/arrow/pull/7280#issuecomment-634929177


   Revision: dc97db6abe1291c8b3808b62c29808f04ca33c43
   
   Submitted crossbow builds: [ursa-labs/crossbow @ 
actions-268](https://github.com/ursa-labs/crossbow/branches/all?query=actions-268)
   
   |Task|Status|
   ||--|
   |centos-6-amd64|[![Github 
Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-268-github-centos-6-amd64)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-268-github-centos-6-amd64)|
   
|centos-7-aarch64|[![TravisCI](https://img.shields.io/travis/ursa-labs/crossbow/actions-268-travis-centos-7-aarch64.svg)](https://travis-ci.org/ursa-labs/crossbow/branches)|
   |centos-7-amd64|[![Github 
Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-268-github-centos-7-amd64)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-268-github-centos-7-amd64)|
   
|centos-8-aarch64|[![TravisCI](https://img.shields.io/travis/ursa-labs/crossbow/actions-268-travis-centos-8-aarch64.svg)](https://travis-ci.org/ursa-labs/crossbow/branches)|
   |centos-8-amd64|[![Github 
Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-268-github-centos-8-amd64)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-268-github-centos-8-amd64)|
   
|conda-linux-gcc-py36|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-268-azure-conda-linux-gcc-py36)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-268-azure-conda-linux-gcc-py36)|
   
|conda-linux-gcc-py37|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-268-azure-conda-linux-gcc-py37)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-268-azure-conda-linux-gcc-py37)|
   
|conda-linux-gcc-py38|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-268-azure-conda-linux-gcc-py38)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-268-azure-conda-linux-gcc-py38)|
   
|conda-osx-clang-py36|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-268-azure-conda-osx-clang-py36)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-268-azure-conda-osx-clang-py36)|
   
|conda-osx-clang-py37|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-268-azure-conda-osx-clang-py37)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-268-azure-conda-osx-clang-py37)|
   
|conda-osx-clang-py38|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-268-azure-conda-osx-clang-py38)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-268-azure-conda-osx-clang-py38)|
   
|conda-win-vs2015-py36|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-268-azure-conda-win-vs2015-py36)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-268-azure-conda-win-vs2015-py36)|
   
|conda-win-vs2015-py37|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-268-azure-conda-win-vs2015-py37)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-268-azure-conda-win-vs2015-py37)|
   
|conda-win-vs2015-py38|[![Azure](https://dev.azure.com/ursa-labs/crossbow/_apis/build/status/ursa-labs.crossbow?branchName=actions-268-azure-conda-win-vs2015-py38)](https://dev.azure.com/ursa-labs/crossbow/_build/latest?definitionId=1=actions-268-azure-conda-win-vs2015-py38)|
   |debian-buster-amd64|[![Github 
Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-268-github-debian-buster-amd64)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-268-github-debian-buster-amd64)|
   
|debian-buster-arm64|[![TravisCI](https://img.shields.io/travis/ursa-labs/crossbow/actions-268-travis-debian-buster-arm64.svg)](https://travis-ci.org/ursa-labs/crossbow/branches)|
   |debian-stretch-amd64|[![Github 
Actions](https://github.com/ursa-labs/crossbow/workflows/Crossbow/badge.svg?branch=actions-268-github-debian-stretch-amd64)](https://github.com/ursa-labs/crossbow/actions?query=branch:actions-268-github-debian-stretch-amd64)|
   
|debian-stretch-arm64|[![TravisCI](https://img.shields.io/travis/ursa-labs/crossbow/actions-268-travis-debian-stretch-arm64.svg)](https://travis-ci.org/ursa-labs/crossbow/branches)|
   

[GitHub] [arrow] kou commented on pull request #7280: ARROW-8158: [C++] Upgrade to LLVM 9

2020-05-27 Thread GitBox


kou commented on pull request #7280:
URL: https://github.com/apache/arrow/pull/7280#issuecomment-634928303


   @github-actions crossbow submit -g nightly



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] nealrichardson commented on a change in pull request #7279: ARROW-8938: [R] Provide binding for arrow::compute::CallFunction

2020-05-27 Thread GitBox


nealrichardson commented on a change in pull request #7279:
URL: https://github.com/apache/arrow/pull/7279#discussion_r431423981



##
File path: r/src/expression.cpp
##
@@ -104,57 +104,9 @@ std::shared_ptr dataset___expr__is_valid(
 
 // [[arrow::export]]
 std::shared_ptr dataset___expr__scalar(SEXP x) {
-  switch (TYPEOF(x)) {
-case NILSXP:
-  return ds::scalar(std::make_shared());
-case LGLSXP:
-  return ds::scalar(Rf_asLogical(x));
-case REALSXP:
-  if (Rf_inherits(x, "Date")) {
-return ds::scalar(std::make_shared(REAL(x)[0]));
-  } else if (Rf_inherits(x, "POSIXct")) {
-return ds::scalar(std::make_shared(
-REAL(x)[0], arrow::timestamp(arrow::TimeUnit::SECOND)));
-  } else if (Rf_inherits(x, "integer64")) {
-int64_t value = *reinterpret_cast(REAL(x));
-return ds::scalar(value);
-  } else if (Rf_inherits(x, "difftime")) {
-int multiplier = 0;
-// TODO: shared with TimeConverter<> in array_from_vector.cpp
-std::string unit(CHAR(STRING_ELT(Rf_getAttrib(x, 
arrow::r::symbols::units), 0)));
-if (unit == "secs") {
-  multiplier = 1;
-} else if (unit == "mins") {
-  multiplier = 60;
-} else if (unit == "hours") {
-  multiplier = 3600;
-} else if (unit == "days") {
-  multiplier = 86400;
-} else if (unit == "weeks") {
-  multiplier = 604800;
-} else {
-  Rcpp::stop("unknown difftime unit");
-}
-return ds::scalar(std::make_shared(
-static_cast(REAL(x)[0] * multiplier),
-arrow::time32(arrow::TimeUnit::SECOND)));
-  }
-  return ds::scalar(Rf_asReal(x));
-case INTSXP:
-  if (Rf_inherits(x, "factor")) {
-// TODO: This does not use the actual value, just the levels
-auto type = arrow::r::InferArrowTypeFromFactor(x);
-return ds::scalar(std::make_shared(type));
-  }
-  return ds::scalar(Rf_asInteger(x));
-case STRSXP:
-  return ds::scalar(CHAR(STRING_ELT(x, 0)));
-default:
-  Rcpp::stop(
-  tfm::format("R object of type %s not supported", 
Rf_type2char(TYPEOF(x;
-  }
-
-  return nullptr;
+  // defined in scalar.cpp
+  std::shared_ptr Scalar__create(SEXP x);

Review comment:
   路 if it works, I'm fine with it, I don't have opinions about proper C++ 
style--I just do what the linter tells me to do.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7279: ARROW-8938: [R] Provide binding for arrow::compute::CallFunction

2020-05-27 Thread GitBox


bkietz commented on a change in pull request #7279:
URL: https://github.com/apache/arrow/pull/7279#discussion_r431417510



##
File path: r/src/compute.cpp
##
@@ -206,60 +206,91 @@ std::shared_ptr Table__FilterChunked(
   return tab;
 }
 
+template 
+std::shared_ptr MaybeUnbox(const char* class_name, SEXP x) {
+  if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, class_name)) {
+Rcpp::ConstReferenceSmartPtrInputParameter> obj(x);
+return static_cast>(obj);
+  }
+  return nullptr;
+}
+
 arrow::Datum to_datum(SEXP x) {
-  // TODO: this is repetitive, can we DRY it out?
-  if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "Array")) {
-Rcpp::ConstReferenceSmartPtrInputParameter> 
obj(x);
-return static_cast>(obj);
-  } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "ChunkedArray")) {
-
Rcpp::ConstReferenceSmartPtrInputParameter>
 obj(x);
-return static_cast>(obj);
-  } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "RecordBatch")) {
-
Rcpp::ConstReferenceSmartPtrInputParameter> 
obj(x);
-return static_cast>(obj);
-  } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "Table")) {
-Rcpp::ConstReferenceSmartPtrInputParameter> 
obj(x);
-return static_cast>(obj);
-  } else {
-// TODO: scalar?
-// This assumes that R objects have already been converted to Arrow 
objects;
-// that seems right but should we do the wrapping here too/instead?
-Rcpp::stop("to_datum: Not implemented");
+  if (auto array = MaybeUnbox("Array", x)) {
+return array;
+  }
+
+  if (auto chunked_array = MaybeUnbox("ChunkedArray", x)) 
{
+return chunked_array;
+  }
+
+  if (auto batch = MaybeUnbox("RecordBatch", x)) {
+return batch;
+  }
+
+  if (auto table = MaybeUnbox("Table", x)) {
+return table;
+  }
+
+  if (auto scalar = MaybeUnbox("Scalar", x)) {
+return scalar;
   }
+
+  // This assumes that R objects have already been converted to Arrow objects;
+  // that seems right but should we do the wrapping here too/instead?
+  Rcpp::stop("to_datum: Not implemented for type %s", Rf_type2char(TYPEOF(x)));

Review comment:
   Alright





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] nealrichardson commented on a change in pull request #7279: ARROW-8938: [R] Provide binding for arrow::compute::CallFunction

2020-05-27 Thread GitBox


nealrichardson commented on a change in pull request #7279:
URL: https://github.com/apache/arrow/pull/7279#discussion_r431404447



##
File path: r/R/scalar.R
##
@@ -0,0 +1,53 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+#' @include arrowExports.R
+
+#' Arrow scalars
+#'
+#' @description
+#' `Scalar`s are used to store a singular value of an arrow `DataType`
+#'
+#' @name Scalar
+#' @rdname Scalar
+#' @export
+Scalar <- R6Class("Scalar", inherit = ArrowObject,
+  public = list(
+ToString = function() Scalar__ToString(self),
+as_vector = function() Scalar__as_vector(self)
+  ),
+  active = list(
+is_valid = function() Scalar__is_valid(self),
+type = function() DataType$create(Scalar__type(self))
+  )
+)
+Scalar$create <- function(x) {
+  # TODO: it would probably be best if an explicit type could be provided
+  if (!inherits(x, "externalptr")) {
+x <- Scalar__create(x)
+  }
+  shared_ptr(Scalar, x)
+}
+
+#' @export
+length.Scalar <- function(x) 1

Review comment:
   ```suggestion
   length.Scalar <- function(x) 1L
   ```
   
   `length` should be an explicit integer, for consistency

##
File path: r/src/expression.cpp
##
@@ -104,57 +104,9 @@ std::shared_ptr dataset___expr__is_valid(
 
 // [[arrow::export]]
 std::shared_ptr dataset___expr__scalar(SEXP x) {
-  switch (TYPEOF(x)) {
-case NILSXP:
-  return ds::scalar(std::make_shared());
-case LGLSXP:
-  return ds::scalar(Rf_asLogical(x));
-case REALSXP:
-  if (Rf_inherits(x, "Date")) {
-return ds::scalar(std::make_shared(REAL(x)[0]));
-  } else if (Rf_inherits(x, "POSIXct")) {
-return ds::scalar(std::make_shared(
-REAL(x)[0], arrow::timestamp(arrow::TimeUnit::SECOND)));
-  } else if (Rf_inherits(x, "integer64")) {
-int64_t value = *reinterpret_cast(REAL(x));
-return ds::scalar(value);
-  } else if (Rf_inherits(x, "difftime")) {
-int multiplier = 0;
-// TODO: shared with TimeConverter<> in array_from_vector.cpp
-std::string unit(CHAR(STRING_ELT(Rf_getAttrib(x, 
arrow::r::symbols::units), 0)));
-if (unit == "secs") {
-  multiplier = 1;
-} else if (unit == "mins") {
-  multiplier = 60;
-} else if (unit == "hours") {
-  multiplier = 3600;
-} else if (unit == "days") {
-  multiplier = 86400;
-} else if (unit == "weeks") {
-  multiplier = 604800;
-} else {
-  Rcpp::stop("unknown difftime unit");
-}
-return ds::scalar(std::make_shared(
-static_cast(REAL(x)[0] * multiplier),
-arrow::time32(arrow::TimeUnit::SECOND)));
-  }
-  return ds::scalar(Rf_asReal(x));
-case INTSXP:
-  if (Rf_inherits(x, "factor")) {
-// TODO: This does not use the actual value, just the levels
-auto type = arrow::r::InferArrowTypeFromFactor(x);
-return ds::scalar(std::make_shared(type));
-  }
-  return ds::scalar(Rf_asInteger(x));
-case STRSXP:
-  return ds::scalar(CHAR(STRING_ELT(x, 0)));
-default:
-  Rcpp::stop(
-  tfm::format("R object of type %s not supported", 
Rf_type2char(TYPEOF(x;
-  }
-
-  return nullptr;
+  // defined in scalar.cpp
+  std::shared_ptr Scalar__create(SEXP x);

Review comment:
   Does this need to be added to the header file that is included here? I 
seem to recall having to do that before.

##
File path: r/R/scalar.R
##
@@ -0,0 +1,53 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License 

[GitHub] [arrow] github-actions[bot] commented on pull request #7295: ARROW-8968: [C++][gandiva] set data layout for pre-compiled IR to llvm::module

2020-05-27 Thread GitBox


github-actions[bot] commented on pull request #7295:
URL: https://github.com/apache/arrow/pull/7295#issuecomment-634893457


   https://issues.apache.org/jira/browse/ARROW-8968



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] kiszk commented on a change in pull request #7274: ARROW-8960: [MINOR] [FORMAT] fix typo

2020-05-27 Thread GitBox


kiszk commented on a change in pull request #7274:
URL: https://github.com/apache/arrow/pull/7274#discussion_r431387864



##
File path: CHANGELOG.md
##
@@ -866,7 +866,7 @@
 * ARROW-5809 - [Rust] Dockerize (add to docker-compose) Rust Travis CI build
 * ARROW-5831 - [Release] Migrate and improve binary release verification script
 * ARROW-5855 - [Python] Add support for Duration type
-* ARROW-5859 - [Python] Support ExtentionType on conversion to numpy/pandas
+* ARROW-5859 - [Python] Support ExtensionType on conversion to numpy/pandas

Review comment:
   According to 
https://github.com/apache/arrow/pull/7274#discussion_r430541033, it looks 
necessary to revert all of changes in `CHANGELOG.md`.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] kiszk opened a new pull request #7295: ARROW-8968: [C++][gandiva] set data layout for pre-compiled IR to llvm::module

2020-05-27 Thread GitBox


kiszk opened a new pull request #7295:
URL: https://github.com/apache/arrow/pull/7295


   This PR explicitly sets data layout for pre-compiled IR to `llvm:module`. 
Without the PR, the following warning message is shown at link time of gandiva.
   
   Most of this code comes from https://reviews.llvm.org/D80130 by @imaihal in 
llvm.
   
   ```
   ~/arrow/cpp/src/gandiva$ ../../build/debug/gandiva-binary-test -V
   Running main() from 
/home/ishizaki/arrow/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest_main.cc
   [==] Running 1 test from 1 test case.
   [--] Global test environment set-up.
   [--] 1 test from TestBinary
   [ RUN  ] TestBinary.TestSimple
   warning: Linking two modules of different data layouts: 'precompiled' is 
'E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-a:8:16-n32:64' whereas 'codegen' is 
'E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-v128:64-a:8:16-n32:64'
   
   [   OK ] TestBinary.TestSimple (41 ms)
   [--] 1 test from TestBinary (41 ms total)
   
   [--] Global test environment tear-down
   [==] 1 test from 1 test case ran. (41 ms total)
   [  PASSED  ] 1 test.
   ```



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] itamarst commented on a change in pull request #7169: ARROW-5359: [Python] Support loading non-nanosecond out-of-range timestamps

2020-05-27 Thread GitBox


itamarst commented on a change in pull request #7169:
URL: https://github.com/apache/arrow/pull/7169#discussion_r431376285



##
File path: python/pyarrow/tests/test_pandas.py
##
@@ -3965,8 +3964,32 @@ def test_timestamp_as_object():
 np.datetime64('2050-05-03 15:42', 'ns'),
 ],
 })
+# Not part of what we're testing, just ensuring that the inputs are what we
+# expect.
+assert (df.dateTimeMs.dtype, df.dateTimeNs.dtype) == (
+# O == object, 

[GitHub] [arrow] jorisvandenbossche commented on pull request #7169: ARROW-5359: [Python] Support loading non-nanosecond out-of-range timestamps

2020-05-27 Thread GitBox


jorisvandenbossche commented on pull request #7169:
URL: https://github.com/apache/arrow/pull/7169#issuecomment-634876862


   @itamarst sorry for the slow follow-up. I think I am OK with the current 
semantics of the option



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] itamarst commented on pull request #7169: ARROW-5359: [Python] Support loading non-nanosecond out-of-range timestamps

2020-05-27 Thread GitBox


itamarst commented on pull request #7169:
URL: https://github.com/apache/arrow/pull/7169#issuecomment-634853818


   @jorisvandenbossche are you happy with this as is, or do you want semantic 
changes to the flag?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] github-actions[bot] commented on pull request #7294: upgrading pip/wheel/setuptools

2020-05-27 Thread GitBox


github-actions[bot] commented on pull request #7294:
URL: https://github.com/apache/arrow/pull/7294#issuecomment-634842663


   
   
   Thanks for opening a pull request!
   
   Could you open an issue for this pull request on JIRA?
   https://issues.apache.org/jira/browse/ARROW
   
   Then could you also rename pull request title in the following format?
   
   ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}
   
   See also:
   
 * [Other pull requests](https://github.com/apache/arrow/pulls/)
 * [Contribution Guidelines - How to contribute 
patches](https://arrow.apache.org/docs/developers/contributing.html#how-to-contribute-patches)
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] BinduAggarwal opened a new pull request #7294: upgrading pip/wheel/setuptools

2020-05-27 Thread GitBox


BinduAggarwal opened a new pull request #7294:
URL: https://github.com/apache/arrow/pull/7294


   This is for the fix of CVE issue, upgraded the pip version
   CVE-2018-20225
   An issue was discovered in pip (all versions) because it installs the 
version with the highest version number, even if the user had intended to 
obtain a private package from a private index. This only affects use of the 
--extra-index-url option, and exploitation requires that the package does not 
already exist in the public index (and thus the attacker can put the package 
there with an arbitrary version number).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] BinduAggarwal commented on issue #7293: update Pip version

2020-05-27 Thread GitBox


BinduAggarwal commented on issue #7293:
URL: https://github.com/apache/arrow/issues/7293#issuecomment-634825586


   sure I will do it



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] fsaintjacques closed pull request #7286: ARROW-8962: [C++] Add explicit implementation for junk values

2020-05-27 Thread GitBox


fsaintjacques closed pull request #7286:
URL: https://github.com/apache/arrow/pull/7286


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] nealrichardson commented on issue #7293: update Pip version

2020-05-27 Thread GitBox


nealrichardson commented on issue #7293:
URL: https://github.com/apache/arrow/issues/7293#issuecomment-634814027


   @BinduAggarwal would you be interested in opening a JIRA issue and a pull 
request? Looks like the source of this requirements.txt file has been updated 
as you describe: 
https://github.com/pypa/manylinux/blob/manylinux1/docker/build_scripts/requirements.txt



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] BinduAggarwal opened a new issue #7293: update Pip version

2020-05-27 Thread GitBox


BinduAggarwal opened a new issue #7293:
URL: https://github.com/apache/arrow/issues/7293


   Could you please update the pip latest version 20.1
   
   
https://github.com/apache/arrow/blob/2688a62f8179f20c20c06a10fcd22fe8a714ae48/python/manylinux1/scripts/requirements.txt
   
   CVE-2018-20225
   An issue was discovered in pip (all versions) because it installs the 
version with the highest version number, even if the user had intended to 
obtain a private package from a private index. This only affects use of the 
--extra-index-url option, and exploitation requires that the package does not 
already exist in the public index (and thus the attacker can put the package 
there with an arbitrary version number).
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] nealrichardson commented on a change in pull request #7287: ARROW-8771: [C++] Add boost/process library to build support

2020-05-27 Thread GitBox


nealrichardson commented on a change in pull request #7287:
URL: https://github.com/apache/arrow/pull/7287#discussion_r431282877



##
File path: cpp/build-support/trim-boost.sh
##
@@ -41,6 +41,8 @@ BOOST_LIBS="$BOOST_LIBS regex.hpp"
 BOOST_LIBS="$BOOST_LIBS multiprecision/cpp_int.hpp"
 # These are for Thrift when Thrift_SOURCE=BUNDLED
 BOOST_LIBS="$BOOST_LIBS algorithm/string.hpp locale.hpp noncopyable.hpp 
numeric/conversion/cast.hpp scope_exit.hpp scoped_array.hpp shared_array.hpp 
tokenizer.hpp version.hpp"
+#These are for flight
+BOOST_LIBS="$BOOST_LIBS process.hpp process asio fusion"

Review comment:
   I thought you only needed process.hpp? Why also `process asio fusion`?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] nealrichardson commented on pull request #7287: ARROW-8771: [C++] Add boost/process library to build support

2020-05-27 Thread GitBox


nealrichardson commented on pull request #7287:
URL: https://github.com/apache/arrow/pull/7287#issuecomment-634786539


   Yes, we will. Good test of the process we (hopefully) set up before.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] lidavidm commented on pull request #7277: ARROW-8957: [FlightRPC][C++] directly use IpcWriteOptions

2020-05-27 Thread GitBox


lidavidm commented on pull request #7277:
URL: https://github.com/apache/arrow/pull/7277#issuecomment-634783693


   I've removed `std::move`. Sorry for the trouble.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] wesm commented on a change in pull request #7279: ARROW-8938: [R] Provide binding for arrow::compute::CallFunction

2020-05-27 Thread GitBox


wesm commented on a change in pull request #7279:
URL: https://github.com/apache/arrow/pull/7279#discussion_r431269117



##
File path: r/src/compute.cpp
##
@@ -223,4 +205,66 @@ std::shared_ptr Table__FilterChunked(
   }
   return tab;
 }
+
+arrow::Datum to_datum(SEXP x) {

Review comment:
   I also don't think it needs to be exposed

##
File path: r/src/compute.cpp
##
@@ -223,4 +205,66 @@ std::shared_ptr Table__FilterChunked(
   }
   return tab;
 }
+
+arrow::Datum to_datum(SEXP x) {
+  // TODO: this is repetitive, can we DRY it out?
+  if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "Array")) {
+Rcpp::ConstReferenceSmartPtrInputParameter> 
obj(x);
+return static_cast>(obj);
+  } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "ChunkedArray")) {
+
Rcpp::ConstReferenceSmartPtrInputParameter>
 obj(x);
+return static_cast>(obj);
+  } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "RecordBatch")) {
+
Rcpp::ConstReferenceSmartPtrInputParameter> 
obj(x);
+return static_cast>(obj);
+  } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "Table")) {
+Rcpp::ConstReferenceSmartPtrInputParameter> 
obj(x);
+return static_cast>(obj);
+  } else {
+// TODO: scalar?

Review comment:
   Can we create an R6 class wrapper for `arrow::Scalar`? 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] kiszk edited a comment on pull request #7277: ARROW-8957: [FlightRPC][C++] directly use IpcWriteOptions

2020-05-27 Thread GitBox


kiszk edited a comment on pull request #7277:
URL: https://github.com/apache/arrow/pull/7277#issuecomment-634737470


   Thanks, we still have build failure in another place at 
https://travis-ci.org/github/apache/arrow/jobs/691713153#L1015
   
   @lidavidm can we remove `std::move()` as a compiler suggested? In 
https://github.com/apache/arrow/blob/master/cpp/src/arrow/flight/server.cc#L118,
 `std::move()` is not called.
   
   ```
   /arrow/cpp/src/arrow/flight/client.cc:391:50:   required from here
   /arrow/cpp/src/arrow/flight/client.cc:411:28: error: moving a local object 
in a return statement prevents copy elision [-Werror=pessimizing-move]
 411 | return std::move(result);
 |^
   /arrow/cpp/src/arrow/flight/client.cc:411:28: note: remove 'std::move' call
   /arrow/cpp/src/arrow/flight/client.cc: In instantiation of 
'arrow::Result > 
arrow::flight::GrpcIpcMessageReader::ReadNextMessage() [with Reader = 
grpc_impl::ClientReaderWriter]':
   /arrow/cpp/src/arrow/flight/client.cc:391:50:   required from here
   /arrow/cpp/src/arrow/flight/client.cc:411:28: error: moving a local object 
in a return statement prevents copy elision [-Werror=pessimizing-move]
   /arrow/cpp/src/arrow/flight/client.cc:411:28: note: remove 'std::move' call
   cc1plus: all warnings being treated as errors
   ```
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7279: ARROW-8938: [R] Provide binding for arrow::compute::CallFunction

2020-05-27 Thread GitBox


bkietz commented on a change in pull request #7279:
URL: https://github.com/apache/arrow/pull/7279#discussion_r431273570



##
File path: r/src/compute.cpp
##
@@ -223,4 +205,66 @@ std::shared_ptr Table__FilterChunked(
   }
   return tab;
 }
+
+arrow::Datum to_datum(SEXP x) {

Review comment:
   How's this:
   ```c++
   template 
   std::shared_ptr MaybeUnbox(const char* class_name, SEXP x) {
 if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, class_name)) {
   Rcpp::ConstReferenceSmartPtrInputParameter> obj(x);
   return static_cast>(obj);
 }
 return nullptr;
   }
   
   arrow::Datum to_datum(SEXP x) {
 if (auto array = MaybeUnbox("Array", x)) {
   return array;
 }
   
 if (auto chunked_array = MaybeUnbox("ChunkedArray", 
x)) {
   return chunked_array;
 }
   
 // ...
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] fsaintjacques commented on pull request #7287: ARROW-8771: [C++] Add boost/process library to build support

2020-05-27 Thread GitBox


fsaintjacques commented on pull request #7287:
URL: https://github.com/apache/arrow/pull/7287#issuecomment-634777086


   @nealrichardson we might need a re-upload.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7279: ARROW-8938: [R] Provide binding for arrow::compute::CallFunction

2020-05-27 Thread GitBox


bkietz commented on a change in pull request #7279:
URL: https://github.com/apache/arrow/pull/7279#discussion_r431266235



##
File path: r/src/compute.cpp
##
@@ -223,4 +205,66 @@ std::shared_ptr Table__FilterChunked(
   }
   return tab;
 }
+
+arrow::Datum to_datum(SEXP x) {
+  // TODO: this is repetitive, can we DRY it out?
+  if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "Array")) {
+Rcpp::ConstReferenceSmartPtrInputParameter> 
obj(x);
+return static_cast>(obj);
+  } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "ChunkedArray")) {
+
Rcpp::ConstReferenceSmartPtrInputParameter>
 obj(x);
+return static_cast>(obj);
+  } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "RecordBatch")) {
+
Rcpp::ConstReferenceSmartPtrInputParameter> 
obj(x);
+return static_cast>(obj);
+  } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "Table")) {
+Rcpp::ConstReferenceSmartPtrInputParameter> 
obj(x);
+return static_cast>(obj);
+  } else {
+// TODO: scalar?
+// This assumes that R objects have already been converted to Arrow 
objects;
+// that seems right but should we do the wrapping here too/instead?
+Rcpp::stop("to_datum: Not implemented");
+  }
+}
+
+SEXP from_datum(arrow::Datum datum) {
+  if (datum.is_array()) {
+return Rcpp::wrap(datum.make_array());
+  } else if (datum.is_arraylike()) {
+return Rcpp::wrap(datum.chunked_array());
+  } else {
+// TODO: the other datum types
+Rcpp::stop("from_datum: Not implemented");
+  }
+}
+
+std::shared_ptr 
make_compute_options(std::string func_name,
+List_ options) {
+  if (func_name == "filter") {
+auto out = 
std::make_shared(arrow::compute::FilterOptions::Defaults());
+if (!Rf_isNull(options["keep_na"]) && options["keep_na"]) {
+  out->null_selection_behavior = arrow::compute::FilterOptions::EMIT_NULL;
+}
+return out;
+  } else if (func_name == "take") {
+auto out = 
std::make_shared(arrow::compute::TakeOptions::Defaults());
+return out;
+  } else {
+return nullptr;
+  }
+  // TODO: make sure the correct destructor is called?

Review comment:
   ```suggestion
   ```
   confirmed: `make_shared` sets the destructor to `deleter` (so we 
aren't relying on `~FunctionOptions` being virtual)





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7279: ARROW-8938: [R] Provide binding for arrow::compute::CallFunction

2020-05-27 Thread GitBox


bkietz commented on a change in pull request #7279:
URL: https://github.com/apache/arrow/pull/7279#discussion_r431263416



##
File path: r/src/compute.cpp
##
@@ -223,4 +205,66 @@ std::shared_ptr Table__FilterChunked(
   }
   return tab;
 }
+
+arrow::Datum to_datum(SEXP x) {
+  // TODO: this is repetitive, can we DRY it out?
+  if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "Array")) {
+Rcpp::ConstReferenceSmartPtrInputParameter> 
obj(x);
+return static_cast>(obj);
+  } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "ChunkedArray")) {
+
Rcpp::ConstReferenceSmartPtrInputParameter>
 obj(x);
+return static_cast>(obj);
+  } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "RecordBatch")) {
+
Rcpp::ConstReferenceSmartPtrInputParameter> 
obj(x);
+return static_cast>(obj);
+  } else if (Rf_inherits(x, "ArrowObject") && Rf_inherits(x, "Table")) {
+Rcpp::ConstReferenceSmartPtrInputParameter> 
obj(x);
+return static_cast>(obj);
+  } else {
+// TODO: scalar?

Review comment:
   I would recommend refactoring that function into 
`std::shared_ptr to_scalar(SEXP)`. That function need not be exported 
(same logic as `Datum`; `SEXP` already holds whatever value). The climax of 
`to_scalar` should be a call to `arrow::MakeScalar`.
   
   After this:
   ```c++
   std::shared_ptr dataset___expr__scalar(SEXP x) {
 return ds::scalar(to_scalar(x));
   }
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] nealrichardson commented on pull request #7279: ARROW-8938: [R] Provide binding for arrow::compute::CallFunction

2020-05-27 Thread GitBox


nealrichardson commented on pull request #7279:
URL: https://github.com/apache/arrow/pull/7279#issuecomment-634755810


   @github-actions autotune



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] github-actions[bot] commented on pull request #7292: ARROW-8471: [C++][Integration] Represent 64 bit integers as strings

2020-05-27 Thread GitBox


github-actions[bot] commented on pull request #7292:
URL: https://github.com/apache/arrow/pull/7292#issuecomment-634752661


   https://issues.apache.org/jira/browse/ARROW-8471



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz opened a new pull request #7292: ARROW-8471: [C++][Integration] Represent 64 bit integers as strings

2020-05-27 Thread GitBox


bkietz opened a new pull request #7292:
URL: https://github.com/apache/arrow/pull/7292


   Some 64 bit integer values are not representable as JSON numbers, so store 
these as strings.
   
   This is issue is a regression; original fix was #5267



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] rdettai commented on pull request #7291: ARROW-8455: [Rust] Parquet Arrow column read on partially compatible files FIX

2020-05-27 Thread GitBox


rdettai commented on pull request #7291:
URL: https://github.com/apache/arrow/pull/7291#issuecomment-634741622


   @maxburke this should definitively fix your issue ! Sorry for the 
inconvenience. There are really very few safety nets on this codebase :-)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] kiszk commented on pull request #7277: ARROW-8957: [FlightRPC][C++] directly use IpcWriteOptions

2020-05-27 Thread GitBox


kiszk commented on pull request #7277:
URL: https://github.com/apache/arrow/pull/7277#issuecomment-634737470


   Thanks, we still have build failure in another place at 
https://travis-ci.org/github/apache/arrow/jobs/691713153#L1015
   
   ```
   /arrow/cpp/src/arrow/flight/client.cc:391:50:   required from here
   /arrow/cpp/src/arrow/flight/client.cc:411:28: error: moving a local object 
in a return statement prevents copy elision [-Werror=pessimizing-move]
 411 | return std::move(result);
 |^
   /arrow/cpp/src/arrow/flight/client.cc:411:28: note: remove 'std::move' call
   /arrow/cpp/src/arrow/flight/client.cc: In instantiation of 
'arrow::Result > 
arrow::flight::GrpcIpcMessageReader::ReadNextMessage() [with Reader = 
grpc_impl::ClientReaderWriter]':
   /arrow/cpp/src/arrow/flight/client.cc:391:50:   required from here
   /arrow/cpp/src/arrow/flight/client.cc:411:28: error: moving a local object 
in a return statement prevents copy elision [-Werror=pessimizing-move]
   /arrow/cpp/src/arrow/flight/client.cc:411:28: note: remove 'std::move' call
   cc1plus: all warnings being treated as errors
   ```
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] github-actions[bot] commented on pull request #7291: ARROW-8455: [Rust] Parquet Arrow column read on partially compatible files FIX

2020-05-27 Thread GitBox


github-actions[bot] commented on pull request #7291:
URL: https://github.com/apache/arrow/pull/7291#issuecomment-634732266


   https://issues.apache.org/jira/browse/ARROW-8455



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] rdettai opened a new pull request #7291: ARROW-8455: [Rust] Parquet Arrow column read on partially compatible files FIX

2020-05-27 Thread GitBox


rdettai opened a new pull request #7291:
URL: https://github.com/apache/arrow/pull/7291


   This is a fix for the faulty https://github.com/apache/arrow/pull/6935
   
   It addresses https://issues.apache.org/jira/browse/ARROW-8455
   >Seen behavior: When reading a Parquet file into Arrow with 
get_record_reader_by_columns, it will fail if one of the column of the file is 
a list (or any other unsupported type).
   
   >Expected behavior: it should only fail if you are actually reading the 
column with unsuported type.
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] wesm commented on pull request #7258: ARROW-8918: [C++][Python] Implement cast metafunction to allow use of "cast" with CallFunction, use in Python

2020-05-27 Thread GitBox


wesm commented on pull request #7258:
URL: https://github.com/apache/arrow/pull/7258#issuecomment-634725333


   Could someone review this? 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] github-actions[bot] commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

2020-05-27 Thread GitBox


github-actions[bot] commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-634652668


   https://issues.apache.org/jira/browse/ARROW-1692



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] lidavidm commented on pull request #6656: ARROW-8297: [FlightRPC][C++] Implement Flight DoExchange for C++

2020-05-27 Thread GitBox


lidavidm commented on pull request #6656:
URL: https://github.com/apache/arrow/pull/6656#issuecomment-634650667


   Sorry about that, for posterity this is being worked on in 
https://github.com/apache/arrow/pull/7277.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] rymurr opened a new pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

2020-05-27 Thread GitBox


rymurr opened a new pull request #7290:
URL: https://github.com/apache/arrow/pull/7290


   This fixes the integration tests for Union Arrays.
   
   Both Dense and Sparse unions now work:
   
   * validity buffer added to Sparse Union
   * both Union types now track logical types rather than MinorType enum ordinal
   
   Dependent on #7289 for duplicate field names.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] hn5092 commented on pull request #7288: ARROW-8963: [C++][Parquet] optimize LeafReader::NextBatch to save memory

2020-05-27 Thread GitBox


hn5092 commented on pull request #7288:
URL: https://github.com/apache/arrow/pull/7288#issuecomment-634647561


   retest this please!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] lidavidm commented on a change in pull request #7277: ARROW-8957: [FlightRPC][C++] directly use IpcWriteOptions

2020-05-27 Thread GitBox


lidavidm commented on a change in pull request #7277:
URL: https://github.com/apache/arrow/pull/7277#discussion_r431066490



##
File path: cpp/src/arrow/flight/server.cc
##
@@ -352,7 +352,7 @@ class DoExchangeMessageWriter : public FlightMessageWriter {
   }
 
   grpc::ServerReaderWriter* stream_;
-  ipc::IpcOptions ipc_options_;
+  ::arrow::ipc::IpcOptions ipc_options_;

Review comment:
   Yes, my mistake, thank you both for pointing it out. I've updated this 
PR.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] github-actions[bot] commented on pull request #7289: ARROW-8948: [Java][Integration] enable duplicate field names integration tests

2020-05-27 Thread GitBox


github-actions[bot] commented on pull request #7289:
URL: https://github.com/apache/arrow/pull/7289#issuecomment-634593683


   https://issues.apache.org/jira/browse/ARROW-8948



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] rymurr opened a new pull request #7289: ARROW-8948: [Java][Integration] enable duplicate field names integration tests

2020-05-27 Thread GitBox


rymurr opened a new pull request #7289:
URL: https://github.com/apache/arrow/pull/7289


   This addresses the integration test error where Java cannot process 
duplicate field names.
   
   This extends StructVector and VectorSchemaRoot to support duplicate field 
names.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] cyb70289 commented on pull request #7285: ARROW-8843: [C++] Compare bitmaps in words

2020-05-27 Thread GitBox


cyb70289 commented on pull request #7285:
URL: https://github.com/apache/arrow/pull/7285#issuecomment-634555761


   Before (apply only benchmark code)
   ```
   BitmapEqualsWithoutOffset/8192195 ns  195 ns  
3618366 bytes_per_second=39.1055G/s
   BitmapEqualsWithOffset/8192 91272 ns91235 ns 
7673 bytes_per_second=85.6307M/s
   ```
   
   After
   ```
   BitmapEqualsWithoutOffset/8192193 ns  193 ns  
3635971 bytes_per_second=39.6261G/s
   BitmapEqualsWithOffset/8192  1332 ns 1331 ns   
510343 bytes_per_second=5.73058G/s



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] github-actions[bot] commented on pull request #7288: ARROW-8963: [C++][Parquet] optimize LeafReader::NextBatch to save memory

2020-05-27 Thread GitBox


github-actions[bot] commented on pull request #7288:
URL: https://github.com/apache/arrow/pull/7288#issuecomment-634551846


   https://issues.apache.org/jira/browse/ARROW-8963



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] github-actions[bot] commented on pull request #7287: ARROW-8771: [C++] Add boost/process library to build support

2020-05-27 Thread GitBox


github-actions[bot] commented on pull request #7287:
URL: https://github.com/apache/arrow/pull/7287#issuecomment-634544583


   https://issues.apache.org/jira/browse/ARROW-8771



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] hn5092 opened a new pull request #7288: Optimize parquet cpp reader

2020-05-27 Thread GitBox


hn5092 opened a new pull request #7288:
URL: https://github.com/apache/arrow/pull/7288


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] liyafan82 opened a new pull request #7287: ARROW-8771: [C++] Add boost/process library to build support

2020-05-27 Thread GitBox


liyafan82 opened a new pull request #7287:
URL: https://github.com/apache/arrow/pull/7287


   Some of our test source code requires the process.hpp file (and its 
dependent libraries). Our current build support does not include these files, 
causing build failures like:
   
   `fatal error: boost/process.hpp: No such file or directory`
   
   In this PR, we add required header files which are sufficient to make flight 
build successful. 
   The added header files are verified in my local machine. However, it seems 
the script is supposed to run from the server side, so we cannot test it. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] rdettai commented on pull request #6935: ARROW-8455: [Rust] Parquet Arrow column read on partially compatible files

2020-05-27 Thread GitBox


rdettai commented on pull request #6935:
URL: https://github.com/apache/arrow/pull/6935#issuecomment-634495368


   @maxburke there is indeed an issue with struct fields. @sunchao I don't know 
how you usually operate: should we reverse the merge then re-apply it with the 
fix or create a new PR with the fix directly ?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] kiszk commented on a change in pull request #7277: ARROW-8957: [FlightRPC][C++] directly use IpcWriteOptions

2020-05-27 Thread GitBox


kiszk commented on a change in pull request #7277:
URL: https://github.com/apache/arrow/pull/7277#discussion_r430857565



##
File path: cpp/src/arrow/flight/server.cc
##
@@ -352,7 +352,7 @@ class DoExchangeMessageWriter : public FlightMessageWriter {
   }
 
   grpc::ServerReaderWriter* stream_;
-  ipc::IpcOptions ipc_options_;
+  ::arrow::ipc::IpcOptions ipc_options_;

Review comment:
   @lidavidm  How about `::arrow::ipc::IpcWriteOptions ipc_options_;`? It 
can succeed to compile arrow-flight with and without 
`-DARROW_NO_DEPRECATED_API=ON`.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] github-actions[bot] commented on pull request #7286: ARROW-8962: [C++] Add explicit implementation for junk values

2020-05-27 Thread GitBox


github-actions[bot] commented on pull request #7286:
URL: https://github.com/apache/arrow/pull/7286#issuecomment-634463719


   https://issues.apache.org/jira/browse/ARROW-8962



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] xhochy opened a new pull request #7286: ARROW-8962: [C++] Add explicit implementation for junk values

2020-05-27 Thread GitBox


xhochy opened a new pull request #7286:
URL: https://github.com/apache/arrow/pull/7286


   When linking the tests with `clang-4.0`, I get
   
   ```
   Undefined symbols for architecture x86_64:
 "arrow::internal::(anonymous 
namespace)::StringToFloatConverterImpl::main_junk_value_", referenced from:
 arrow::internal::StringToFloat(char const*, unsigned long, float*) in 
libarrow.a(value_parsing.cc.o)
 arrow::internal::StringToFloat(char const*, unsigned long, double*) in 
libarrow.a(value_parsing.cc.o)
 "arrow::internal::(anonymous 
namespace)::StringToFloatConverterImpl::fallback_junk_value_", referenced from:
 arrow::internal::StringToFloat(char const*, unsigned long, float*) in 
libarrow.a(value_parsing.cc.o)
 arrow::internal::StringToFloat(char const*, unsigned long, double*) in 
libarrow.a(value_parsing.cc.o)
   ld: symbol(s) not found for architecture x86_64
   ```
   
   These older clang versions need an explicit implementation definition.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] liyafan82 commented on pull request #6425: ARROW-6111: [Java] Support LargeVarChar and LargeBinary types

2020-05-27 Thread GitBox


liyafan82 commented on pull request #6425:
URL: https://github.com/apache/arrow/pull/6425#issuecomment-634447500


   > Great job getting integration tests passing @liyafan82 ! I took a quick 
look and LGTM, will do a more thorough pass hopefully tomorrow before merging.
   
   @BryanCutler Please take your time. Thanks a lot for your effort. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org