[GitHub] [arrow] cyb70289 opened a new pull request #7476: ARROW-9168: [C++][Flight] configure TCP connection sharing in benchmark

2020-06-17 Thread GitBox
cyb70289 opened a new pull request #7476: URL: https://github.com/apache/arrow/pull/7476 The Flight benchmark performs worse as the number of worker threads increases. It can be improved by setting the gRPC option GRPC_ARG_USE_LOCAL_SUBCHANNEL_POOL to make each client create its own TCP connection to

[GitHub] [arrow] liyafan82 commented on pull request #7287: ARROW-8771: [C++] Add boost/process library to build support

2020-06-17 Thread GitBox
liyafan82 commented on pull request #7287: URL: https://github.com/apache/arrow/pull/7287#issuecomment-645762806 > That's odd. I wonder what's different about your build setup from the jobs we run on CI because I haven't seen that before. Do you think you could add a crossbow job that

[GitHub] [arrow] liyafan82 commented on a change in pull request #7347: ARROW-8230: [Java] Remove netty dependency from arrow-memory

2020-06-17 Thread GitBox
liyafan82 commented on a change in pull request #7347: URL: https://github.com/apache/arrow/pull/7347#discussion_r441931433 ## File path: java/memory/src/main/java/org/apache/arrow/memory/rounding/DefaultRoundingPolicy.java ## @@ -31,19 +28,18 @@ public final long

[GitHub] [arrow] liyafan82 commented on a change in pull request #7347: ARROW-8230: [Java] Remove netty dependency from arrow-memory

2020-06-17 Thread GitBox
liyafan82 commented on a change in pull request #7347: URL: https://github.com/apache/arrow/pull/7347#discussion_r441930746 ## File path: java/memory/src/main/java/org/apache/arrow/memory/util/MemoryUtil.java ## @@ -78,6 +77,55 @@ public Object run() { Field

[GitHub] [arrow] liyafan82 commented on a change in pull request #7347: ARROW-8230: [Java] Remove netty dependency from arrow-memory

2020-06-17 Thread GitBox
liyafan82 commented on a change in pull request #7347: URL: https://github.com/apache/arrow/pull/7347#discussion_r441929412 ## File path: java/memory/src/main/java/io/netty/buffer/NettyArrowBuf.java ## @@ -404,7 +407,7 @@ protected int _getUnsignedMediumLE(int index) {

[GitHub] [arrow] kou commented on pull request #7396: ARROW-9092: [C++][TRIAGE] Do not enable TestRoundFunctions when using LLVM 9 until gandiva-decimal-test is fixed

2020-06-17 Thread GitBox
kou commented on pull request #7396: URL: https://github.com/apache/arrow/pull/7396#issuecomment-645726115 Sorry. I couldn't reproduce this on my local environment including `docker-compose run conda-cpp` on my local machine:

[GitHub] [arrow] kou commented on a change in pull request #7459: ARROW-6800: [C++] Support building libraries targeting C++14 or higher, disable GNU CXX extensions

2020-06-17 Thread GitBox
kou commented on a change in pull request #7459: URL: https://github.com/apache/arrow/pull/7459#discussion_r441221973 ## File path: cpp/cmake_modules/SetupCxxFlags.cmake ## @@ -62,12 +62,15 @@ endif() # Support C11 set(CMAKE_C_STANDARD 11) -# This ensures that things like

[GitHub] [arrow] kou commented on pull request #7459: ARROW-6800: [C++] Support building libraries targeting C++14 or higher, disable GNU CXX extensions

2020-06-17 Thread GitBox
kou commented on pull request #7459: URL: https://github.com/apache/arrow/pull/7459#issuecomment-645720173 Should we link to ARROW-6848 instead of ARROW-6800? This is an automated message from the Apache Git Service. To

[GitHub] [arrow] ursabot commented on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-17 Thread GitBox
ursabot commented on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-645714003 [AMD64 Ubuntu 18.04 C++ Benchmark (#113134)](https://ci.ursalabs.org/#builders/73/builds/84) builder failed with an exception. Revision:

[GitHub] [arrow] github-actions[bot] commented on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-17 Thread GitBox
github-actions[bot] commented on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-645710548 https://issues.apache.org/jira/browse/ARROW-8500

[GitHub] [arrow] wesm closed pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-17 Thread GitBox
wesm closed pull request #7461: URL: https://github.com/apache/arrow/pull/7461

[GitHub] [arrow] wesm commented on pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-17 Thread GitBox
wesm commented on pull request #7461: URL: https://github.com/apache/arrow/pull/7461#issuecomment-645710004 I gave up on trying to have e.g. a common "64-bit" kernel for Equals/NotEquals Int64/UInt64/Timestamp/etc. The sticking point is scalar unboxing. We might need to fashion a common

[GitHub] [arrow] wesm opened a new pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-17 Thread GitBox
wesm opened a new pull request #7475: URL: https://github.com/apache/arrow/pull/7475 Since I changed Filter on RecordBatch to transform the filter to indices and use Take, I wanted to have a benchmark to compare the before/after performance so this can also be monitored over time. These
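The approach described in this pull request (convert the boolean filter to a list of indices once, then apply Take to every column) can be sketched in plain Python. This is only an illustration under stated assumptions, not Arrow's C++ implementation: the names `filter_to_indices`, `take`, and `filter_record_batch` are hypothetical, a record batch is modeled as a dict of equal-length lists, and null (`None`) filter slots are assumed to be dropped.

```python
from typing import Dict, List, Optional, Sequence


def filter_to_indices(mask: Sequence[Optional[bool]]) -> List[int]:
    """Return the positions where the mask is True.

    Assumption for this sketch: False and None (null) slots are both dropped.
    """
    return [i for i, keep in enumerate(mask) if keep is True]


def take(column: Sequence, indices: Sequence[int]) -> list:
    """Gather the values of one column at the given positions."""
    return [column[i] for i in indices]


def filter_record_batch(
    batch: Dict[str, Sequence], mask: Sequence[Optional[bool]]
) -> Dict[str, list]:
    """Filter every column of a batch (hypothetical dict-of-lists model).

    The mask is converted to indices once, and the same index list is
    reused for each column instead of re-scanning the mask per column.
    """
    if any(len(col) != len(mask) for col in batch.values()):
        raise ValueError("mask length must match every column length")
    indices = filter_to_indices(mask)
    return {name: take(col, indices) for name, col in batch.items()}


batch = {"a": [1, 2, 3, 4], "b": ["w", "x", "y", "z"]}
result = filter_record_batch(batch, [True, False, None, True])
```

The point of the design is that the cost of scanning the filter is paid once per batch rather than once per column, which is what a benchmark comparing the before/after behavior would measure.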

[GitHub] [arrow] wesm commented on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-17 Thread GitBox
wesm commented on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-645709018 @ursabot benchmark --benchmark-filter=FilterRecordBatch 22f374102

[GitHub] [arrow] wesm commented on pull request #7449: ARROW-9133: [C++] Add utf8_upper and utf8_lower

2020-06-17 Thread GitBox
wesm commented on pull request #7449: URL: https://github.com/apache/arrow/pull/7449#issuecomment-645698551 I just merged my changes for the ASCII kernels making those work on sliced arrays

[GitHub] [arrow] wesm closed pull request #7465: PARQUET-1877: [C++] Reconcile thrift limits

2020-06-17 Thread GitBox
wesm closed pull request #7465: URL: https://github.com/apache/arrow/pull/7465

[GitHub] [arrow] wesm closed pull request #7458: ARROW-9122: [C++] Properly handle sliced arrays in ascii_lower, ascii_upper kernels

2020-06-17 Thread GitBox
wesm closed pull request #7458: URL: https://github.com/apache/arrow/pull/7458

[GitHub] [arrow] wesm commented on pull request #7458: ARROW-9122: [C++] Properly handle sliced arrays in ascii_lower, ascii_upper kernels

2020-06-17 Thread GitBox
wesm commented on pull request #7458: URL: https://github.com/apache/arrow/pull/7458#issuecomment-645697268 the test failure is an s390x flake

[GitHub] [arrow] wesm closed pull request #7463: ARROW-9145: [C++] Implement BooleanArray::true_count and false_count, add Python bindings

2020-06-17 Thread GitBox
wesm closed pull request #7463: URL: https://github.com/apache/arrow/pull/7463

[GitHub] [arrow] wesm commented on pull request #7463: ARROW-9145: [C++] Implement BooleanArray::true_count and false_count, add Python bindings

2020-06-17 Thread GitBox
wesm commented on pull request #7463: URL: https://github.com/apache/arrow/pull/7463#issuecomment-645696772 +1

[GitHub] [arrow] wesm commented on pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-17 Thread GitBox
wesm commented on pull request #7461: URL: https://github.com/apache/arrow/pull/7461#issuecomment-645696688 +1. I'm going to merge this to help avoid conflicts caused by the stuff I just renamed. I welcome further comments and I will work to address them in follow ups

[GitHub] [arrow] wesm commented on a change in pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-17 Thread GitBox
wesm commented on a change in pull request #7461: URL: https://github.com/apache/arrow/pull/7461#discussion_r441902495 ## File path: cpp/src/arrow/compute/kernels/codegen_internal.h ## @@ -485,18 +502,44 @@ struct ScalarUnaryNotNullStateful { struct ArrayExec> { static

[GitHub] [arrow] bkietz commented on a change in pull request #7156: ARROW-8074: [C++][Dataset][Python] FileFragments from buffers and NativeFiles

2020-06-17 Thread GitBox
bkietz commented on a change in pull request #7156: URL: https://github.com/apache/arrow/pull/7156#discussion_r441836943 ## File path: python/pyarrow/_dataset.pyx ## @@ -43,6 +44,51 @@ def _forbid_instantiation(klass, subclasses_instead=True): raise TypeError(msg)

[GitHub] [arrow] wesm commented on pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-17 Thread GitBox
wesm commented on pull request #7461: URL: https://github.com/apache/arrow/pull/7461#issuecomment-645623392 I'm revamping the documentation about these codegen functions which I'm dubbing "Generator-Dispatchers" (GDs) for short. I'll add "Generate" to their name. Stay tuned

[GitHub] [arrow] wesm commented on pull request #7473: ARROW-9162: [Python] Expose Add/Subtract/Multiply arithmetic kernels

2020-06-17 Thread GitBox
wesm commented on pull request #7473: URL: https://github.com/apache/arrow/pull/7473#issuecomment-645615268 https://issues.apache.org/jira/browse/ARROW-9164

[GitHub] [arrow] kszucs edited a comment on pull request #7471: ARROW-9109: [Python][Packaging] Enable S3 support in manylinux wheels

2020-06-17 Thread GitBox
kszucs edited a comment on pull request #7471: URL: https://github.com/apache/arrow/pull/7471#issuecomment-645614037 Quickly went through it and it LGTM. I can give it a more thorough review tomorrow.

[GitHub] [arrow] kszucs commented on pull request #7471: ARROW-9109: [Python][Packaging] Enable S3 support in manylinux wheels

2020-06-17 Thread GitBox
kszucs commented on pull request #7471: URL: https://github.com/apache/arrow/pull/7471#issuecomment-645614037 Quickly went through it and LGTM. I can give it a more thorough review tomorrow.

[GitHub] [arrow] wesm commented on pull request #7471: ARROW-9109: [Python][Packaging] Enable S3 support in manylinux wheels

2020-06-17 Thread GitBox
wesm commented on pull request #7471: URL: https://github.com/apache/arrow/pull/7471#issuecomment-645612814 Very reasonable. Thanks @pitrou!

[GitHub] [arrow] kszucs commented on pull request #7473: ARROW-9162: [Python] Expose Add/Subtract/Multiply arithmetic kernels

2020-06-17 Thread GitBox
kszucs commented on pull request #7473: URL: https://github.com/apache/arrow/pull/7473#issuecomment-645612711 I was thinking of the same, so I eventually deferred the docstrings which I can add in my other PR where we need to add `check_overflow` flag to these functions.

[GitHub] [arrow] wesm closed pull request #7473: ARROW-9162: [Python] Expose Add/Subtract/Multiply arithmetic kernels

2020-06-17 Thread GitBox
wesm closed pull request #7473: URL: https://github.com/apache/arrow/pull/7473

[GitHub] [arrow] wesm commented on pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm commented on pull request #7442: URL: https://github.com/apache/arrow/pull/7442#issuecomment-645600060 +1. Thanks all for the comments

[GitHub] [arrow] wesm closed pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm closed pull request #7442: URL: https://github.com/apache/arrow/pull/7442

[GitHub] [arrow] fsaintjacques commented on pull request #7474: ARROW-8802: [C++][Dataset] Preserve dataset schema's metadata on column projection

2020-06-17 Thread GitBox
fsaintjacques commented on pull request #7474: URL: https://github.com/apache/arrow/pull/7474#issuecomment-645595500 @jorisvandenbossche I'm puzzled by the error. Could it be an upstream pandas change?

[GitHub] [arrow] kszucs edited a comment on pull request #7439: ARROW-4309: [Documentation] Add a docker-compose entry which builds the documentation with CUDA enabled

2020-06-17 Thread GitBox
kszucs edited a comment on pull request #7439: URL: https://github.com/apache/arrow/pull/7439#issuecomment-645593482 Please verify it with the following command: ```bash $ archery docker run --no-pull ubuntu-cuda-cpp $ archery docker run --no-pull ubuntu-cuda-python $

[GitHub] [arrow] kszucs commented on pull request #7439: ARROW-4309: [Documentation] Add a docker-compose entry which builds the documentation with CUDA enabled

2020-06-17 Thread GitBox
kszucs commented on pull request #7439: URL: https://github.com/apache/arrow/pull/7439#issuecomment-645593482 Please verify it with the following command: ```bash $ archery docker run --no-pull ubuntu-cuda-cpp $ archery docker run --no-pull ubuntu-cuda-python $ archery

[GitHub] [arrow] ursabot commented on pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
ursabot commented on pull request #7442: URL: https://github.com/apache/arrow/pull/7442#issuecomment-645588035 [AMD64 Ubuntu 18.04 C++ Benchmark (#113048)](https://ci.ursalabs.org/#builders/73/builds/83) builder has been succeeded. Revision: 54bb83848d391477c5ded222fd4401acbe08c6c7

[GitHub] [arrow] github-actions[bot] commented on pull request #7474: ARROW-8802: [C++][Dataset] Preserve dataset schema's metadata on column projection

2020-06-17 Thread GitBox
github-actions[bot] commented on pull request #7474: URL: https://github.com/apache/arrow/pull/7474#issuecomment-645570471 https://issues.apache.org/jira/browse/ARROW-8802

[GitHub] [arrow] wesm commented on pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm commented on pull request #7442: URL: https://github.com/apache/arrow/pull/7442#issuecomment-645569875 @ursabot benchmark --benchmark-filter=Filter 04006ff

[GitHub] [arrow] fsaintjacques opened a new pull request #7474: ARROW-8802: [C++][Dataset] Preserve Dataset's schema metadata on column projection

2020-06-17 Thread GitBox
fsaintjacques opened a new pull request #7474: URL: https://github.com/apache/arrow/pull/7474 Scanner does not preserve the original schema metadata when columns are projected.

[GitHub] [arrow] pitrou commented on pull request #7471: ARROW-9109: [Python][Packaging] Enable S3 support in manylinux wheels

2020-06-17 Thread GitBox
pitrou commented on pull request #7471: URL: https://github.com/apache/arrow/pull/7471#issuecomment-645562851 This grows the manylinux wheel sizes by about 2-3 MB (from ~13 MB to ~15-16 MB). I think it's reasonable.

[GitHub] [arrow] pitrou commented on pull request #7471: ARROW-9109: [Python][Packaging] Enable S3 support in manylinux wheels

2020-06-17 Thread GitBox
pitrou commented on pull request #7471: URL: https://github.com/apache/arrow/pull/7471#issuecomment-645560128 It seems like everything is working fine here.

[GitHub] [arrow] wesm commented on pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm commented on pull request #7442: URL: https://github.com/apache/arrow/pull/7442#issuecomment-645556193 So these "readability" improvements made performance worse so I'll revert them

[GitHub] [arrow] ursabot commented on pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
ursabot commented on pull request #7442: URL: https://github.com/apache/arrow/pull/7442#issuecomment-645545062 [AMD64 Ubuntu 18.04 C++ Benchmark (#112989)](https://ci.ursalabs.org/#builders/73/builds/82) builder has been succeeded. Revision: 21227cc7530e59a481f7e3c0aae8d351b4226e9d

[GitHub] [arrow] fsaintjacques commented on a change in pull request #7156: ARROW-8074: [C++][Dataset][Python] FileFragments from buffers and NativeFiles

2020-06-17 Thread GitBox
fsaintjacques commented on a change in pull request #7156: URL: https://github.com/apache/arrow/pull/7156#discussion_r441740202 ## File path: python/pyarrow/_dataset.pyx ## @@ -43,6 +44,51 @@ def _forbid_instantiation(klass, subclasses_instead=True): raise TypeError(msg)

[GitHub] [arrow] kszucs commented on pull request #7449: ARROW-9133: [C++] Add utf8_upper and utf8_lower

2020-06-17 Thread GitBox
kszucs commented on pull request #7449: URL: https://github.com/apache/arrow/pull/7449#issuecomment-645539612 Added `libutf8proc` dependency to the ursabot builders, same could be done for the docker-compose images. The tests are failing though.

[GitHub] [arrow] kszucs commented on pull request #7449: ARROW-9133: [C++] Add utf8_upper and utf8_lower

2020-06-17 Thread GitBox
kszucs commented on pull request #7449: URL: https://github.com/apache/arrow/pull/7449#issuecomment-645538795 @ursabot build

[GitHub] [arrow] wesm commented on pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm commented on pull request #7442: URL: https://github.com/apache/arrow/pull/7442#issuecomment-645526968 @ursabot benchmark --benchmark-filter=Filter 04006ff

[GitHub] [arrow] github-actions[bot] commented on pull request #7471: ARROW-9109: [Python][Packaging] Enable S3 support in manylinux wheels

2020-06-17 Thread GitBox
github-actions[bot] commented on pull request #7471: URL: https://github.com/apache/arrow/pull/7471#issuecomment-645526238 Revision: f28701736f39939dc7eafb3db2e093612ed3c10c Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] pitrou commented on pull request #7471: ARROW-9109: [Python][Packaging] Enable S3 support in manylinux wheels

2020-06-17 Thread GitBox
pitrou commented on pull request #7471: URL: https://github.com/apache/arrow/pull/7471#issuecomment-645525436 @github-actions crossbow submit -g wheel

[GitHub] [arrow] wesm commented on pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm commented on pull request #7442: URL: https://github.com/apache/arrow/pull/7442#issuecomment-645521577 Something weird with the commit history, I'm not sure those benchmarks are right. I'll rebase things again and rerun

[GitHub] [arrow] ursabot commented on pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
ursabot commented on pull request #7442: URL: https://github.com/apache/arrow/pull/7442#issuecomment-645518289 [AMD64 Ubuntu 18.04 C++ Benchmark (#112952)](https://ci.ursalabs.org/#builders/73/builds/81) builder has been succeeded. Revision: f50b39e54c50e8a53606eda486c88e6ec51d7006

[GitHub] [arrow] github-actions[bot] commented on pull request #7471: ARROW-9109: [Python][Packaging] Enable S3 support in manylinux wheels

2020-06-17 Thread GitBox
github-actions[bot] commented on pull request #7471: URL: https://github.com/apache/arrow/pull/7471#issuecomment-645507606 Revision: 282de25836ec86c707c34340c5a0f4907392e178 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] pitrou commented on pull request #7471: ARROW-9109: [Python][Packaging] Enable S3 support in manylinux wheels

2020-06-17 Thread GitBox
pitrou commented on pull request #7471: URL: https://github.com/apache/arrow/pull/7471#issuecomment-645502945 @github-actions crossbow submit -g wheel

[GitHub] [arrow] pitrou commented on a change in pull request #7469: ARROW-8832: [Python] Provide better error message when S3/HDFS is not enabled in installation

2020-06-17 Thread GitBox
pitrou commented on a change in pull request #7469: URL: https://github.com/apache/arrow/pull/7469#discussion_r441596426 ## File path: python/pyarrow/fs.py ## @@ -35,14 +36,31 @@ # For backward compatibility. FileStats = FileInfo +_not_imported = [] + try: from

[GitHub] [arrow] nealrichardson closed pull request #7415: ARROW-7028: [R] Date roundtrip results in different R storage mode

2020-06-17 Thread GitBox
nealrichardson closed pull request #7415: URL: https://github.com/apache/arrow/pull/7415

[GitHub] [arrow] wesm commented on pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm commented on pull request #7442: URL: https://github.com/apache/arrow/pull/7442#issuecomment-645498297 I think I improved some of the readability problems and addressed the other comments. I'd like to merge this soon once CI is green

[GitHub] [arrow] wesm edited a comment on pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm edited a comment on pull request #7442: URL: https://github.com/apache/arrow/pull/7442#issuecomment-645498297 I think I improved some of the readability problems and addressed the other comments. I'd like to merge this soon once CI is green

[GitHub] [arrow] wesm commented on pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm commented on pull request #7442: URL: https://github.com/apache/arrow/pull/7442#issuecomment-645497918 @ursabot benchmark --benchmark-filter=Filter c4f425768

[GitHub] [arrow] nevi-me closed pull request #7399: ARROW-9095: [Rust] Spec-compliant NullArray

2020-06-17 Thread GitBox
nevi-me closed pull request #7399: URL: https://github.com/apache/arrow/pull/7399

[GitHub] [arrow] github-actions[bot] commented on pull request #7473: ARROW-9162: [Python] Expose Add/Subtract/Multiply arithmetic kernels

2020-06-17 Thread GitBox
github-actions[bot] commented on pull request #7473: URL: https://github.com/apache/arrow/pull/7473#issuecomment-645490171 https://issues.apache.org/jira/browse/ARROW-9162

[GitHub] [arrow] kszucs opened a new pull request #7473: ARROW-9162: [Python] Expose Add/Subtract/Multiply arithmetic kernels

2020-06-17 Thread GitBox
kszucs opened a new pull request #7473: URL: https://github.com/apache/arrow/pull/7473

[GitHub] [arrow] wesm commented on a change in pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm commented on a change in pull request #7442: URL: https://github.com/apache/arrow/pull/7442#discussion_r441674808 ## File path: cpp/src/arrow/compute/kernels/vector_selection.cc ## @@ -0,0 +1,1816 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or

[GitHub] [arrow] wesm commented on a change in pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm commented on a change in pull request #7442: URL: https://github.com/apache/arrow/pull/7442#discussion_r441668163 ## File path: cpp/src/arrow/compute/api_vector.h ## @@ -64,6 +67,24 @@ Result Filter(const Datum& values, const Datum& filter, const

[GitHub] [arrow] wesm commented on a change in pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm commented on a change in pull request #7442: URL: https://github.com/apache/arrow/pull/7442#discussion_r441662165 ## File path: cpp/src/arrow/compute/kernels/vector_selection.cc ## @@ -0,0 +1,1816 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or

[GitHub] [arrow] wesm commented on a change in pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm commented on a change in pull request #7442: URL: https://github.com/apache/arrow/pull/7442#discussion_r441661392 ## File path: cpp/src/arrow/compute/kernels/vector_selection.cc ## @@ -0,0 +1,1816 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or

[GitHub] [arrow] wesm commented on a change in pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm commented on a change in pull request #7442: URL: https://github.com/apache/arrow/pull/7442#discussion_r441660992 ## File path: cpp/src/arrow/compute/kernels/vector_selection.cc ## @@ -0,0 +1,1816 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or

[GitHub] [arrow] wesm commented on a change in pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm commented on a change in pull request #7442: URL: https://github.com/apache/arrow/pull/7442#discussion_r441660523 ## File path: cpp/src/arrow/compute/kernels/vector_selection.cc ## @@ -0,0 +1,1816 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or

[GitHub] [arrow] wesm commented on a change in pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm commented on a change in pull request #7442: URL: https://github.com/apache/arrow/pull/7442#discussion_r441659681 ## File path: cpp/src/arrow/compute/kernels/vector_selection.cc ## @@ -0,0 +1,1816 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or

[GitHub] [arrow] wesm commented on a change in pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm commented on a change in pull request #7442: URL: https://github.com/apache/arrow/pull/7442#discussion_r441658427 ## File path: cpp/src/arrow/testing/random.cc ## @@ -84,7 +84,7 @@ std::shared_ptr RandomArrayGenerator::Boolean(int64_t size, double probab

[GitHub] [arrow] wesm commented on a change in pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm commented on a change in pull request #7442: URL: https://github.com/apache/arrow/pull/7442#discussion_r441656051 ## File path: cpp/src/arrow/compute/kernels/util_internal.h ## @@ -0,0 +1,50 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more

[GitHub] [arrow] github-actions[bot] commented on pull request #7471: ARROW-9109: [Python][Packaging] Enable S3 support in manylinux wheels

2020-06-17 Thread GitBox
github-actions[bot] commented on pull request #7471: URL: https://github.com/apache/arrow/pull/7471#issuecomment-645462522 Revision: 0517d82791270284f4004ffe7812f08f2bb35235 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] kszucs closed pull request #6512: ARROW-8430: [CI] Configure self-hosted runners for Github Actions

2020-06-17 Thread GitBox
kszucs closed pull request #6512: URL: https://github.com/apache/arrow/pull/6512

[GitHub] [arrow] pitrou commented on pull request #7471: ARROW-9109: [Python][Packaging] Enable S3 support in manylinux wheels

2020-06-17 Thread GitBox
pitrou commented on pull request #7471: URL: https://github.com/apache/arrow/pull/7471#issuecomment-645461657 @github-actions crossbow submit -g wheel

[GitHub] [arrow] pitrou opened a new pull request #7471: ARROW-9109: [Python][Packaging] Enable S3 support in manylinux wheels

2020-06-17 Thread GitBox
pitrou opened a new pull request #7471: URL: https://github.com/apache/arrow/pull/7471

[GitHub] [arrow] wesm commented on a change in pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm commented on a change in pull request #7442: URL: https://github.com/apache/arrow/pull/7442#discussion_r441643717 ## File path: cpp/src/arrow/compute/kernels/vector_selection.cc ## @@ -0,0 +1,1816 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or

[GitHub] [arrow] fsaintjacques commented on a change in pull request #7468: ARROW-8283: [Python] Limit FileSystemDataset constructor from fragments/paths, no filesystem interaction

2020-06-17 Thread GitBox
fsaintjacques commented on a change in pull request #7468: URL: https://github.com/apache/arrow/pull/7468#discussion_r441638604 ## File path: python/pyarrow/_dataset.pyx ## @@ -407,42 +407,82 @@ cdef class UnionDataset(Dataset): cdef class FileSystemDataset(Dataset): -

[GitHub] [arrow] fsaintjacques commented on a change in pull request #7437: ARROW-8943: [C++][Python][Dataset] Add partitioning support to ParquetDatasetFactory

2020-06-17 Thread GitBox
fsaintjacques commented on a change in pull request #7437: URL: https://github.com/apache/arrow/pull/7437#discussion_r441633483 ## File path: python/pyarrow/tests/test_dataset.py ## @@ -1522,19 +1522,20 @@ def _create_parquet_dataset_partitioned(root_path):

[GitHub] [arrow] fsaintjacques closed pull request #7438: ARROW-9105: [C++][Dataset][Python] Pass an explicit schema to split_by_row_groups

2020-06-17 Thread GitBox
fsaintjacques closed pull request #7438: URL: https://github.com/apache/arrow/pull/7438

[GitHub] [arrow] bkietz commented on a change in pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
bkietz commented on a change in pull request #7442: URL: https://github.com/apache/arrow/pull/7442#discussion_r440923999 ## File path: cpp/src/arrow/compute/api_vector.h ## @@ -64,6 +67,24 @@ Result Filter(const Datum& values, const Datum& filter, const

[GitHub] [arrow] wesm commented on pull request #7396: ARROW-9092: [C++][TRIAGE] Do not enable TestRoundFunctions when using LLVM 9 until gandiva-decimal-test is fixed

2020-06-17 Thread GitBox
wesm commented on pull request #7396: URL: https://github.com/apache/arrow/pull/7396#issuecomment-645434797 It occurs for me every time. My toolchain environment is somewhat complex (because I'm using some conda packages) so I'll see if I can provide a reproduction for you (it could take

[GitHub] [arrow] fsaintjacques commented on a change in pull request #7437: ARROW-8943: [C++][Python][Dataset] Add partitioning support to ParquetDatasetFactory

2020-06-17 Thread GitBox
fsaintjacques commented on a change in pull request #7437: URL: https://github.com/apache/arrow/pull/7437#discussion_r441621088 ## File path: cpp/src/arrow/dataset/file_parquet.h ## @@ -215,6 +215,34 @@ class ARROW_DS_EXPORT ParquetFileFragment : public FileFragment {

[GitHub] [arrow] nealrichardson commented on pull request #7287: ARROW-8771: [C++] Add boost/process library to build support

2020-06-17 Thread GitBox
nealrichardson commented on pull request #7287: URL: https://github.com/apache/arrow/pull/7287#issuecomment-645432605 That's odd. I wonder what's different about your build setup from the jobs we run on CI because I haven't seen that before. Do you think you could add a crossbow job that

[GitHub] [arrow] wesm commented on a change in pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm commented on a change in pull request #7442: URL: https://github.com/apache/arrow/pull/7442#discussion_r441613340 ## File path: cpp/src/arrow/compute/kernels/util_internal.cc ## @@ -0,0 +1,61 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more

[GitHub] [arrow] wesm commented on a change in pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm commented on a change in pull request #7442: URL: https://github.com/apache/arrow/pull/7442#discussion_r441613579 ## File path: cpp/src/arrow/compute/kernels/vector_selection_test.cc ## @@ -0,0 +1,1637 @@ +// Licensed to the Apache Software Foundation (ASF) under one +//

[GitHub] [arrow] wesm commented on a change in pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
wesm commented on a change in pull request #7442: URL: https://github.com/apache/arrow/pull/7442#discussion_r441612459 ## File path: cpp/src/arrow/compute/kernels/vector_selection.cc ## @@ -0,0 +1,1816 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or

[GitHub] [arrow] wesm commented on a change in pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-17 Thread GitBox
wesm commented on a change in pull request #7461: URL: https://github.com/apache/arrow/pull/7461#discussion_r441609892 ## File path: cpp/src/arrow/compute/kernels/codegen_internal.h ## @@ -807,11 +885,30 @@ ArrayKernelExec SignedInteger(detail::GetTypeId get_id) { } }

[GitHub] [arrow] fsaintjacques commented on a change in pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-17 Thread GitBox
fsaintjacques commented on a change in pull request #7442: URL: https://github.com/apache/arrow/pull/7442#discussion_r441606010 ## File path: cpp/src/arrow/compute/kernels/vector_selection_test.cc ## @@ -0,0 +1,1637 @@ +// Licensed to the Apache Software Foundation (ASF) under

[GitHub] [arrow] wesm commented on a change in pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-17 Thread GitBox
wesm commented on a change in pull request #7461: URL: https://github.com/apache/arrow/pull/7461#discussion_r441606788 ## File path: cpp/src/arrow/compute/kernels/codegen_internal.h ## @@ -485,18 +502,44 @@ struct ScalarUnaryNotNullStateful { struct ArrayExec> { static

[GitHub] [arrow] wesm commented on a change in pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-17 Thread GitBox
wesm commented on a change in pull request #7461: URL: https://github.com/apache/arrow/pull/7461#discussion_r441605587 ## File path: cpp/src/arrow/compute/kernels/scalar_compare.cc ## @@ -54,72 +56,106 @@ struct GreaterEqual { } }; -struct Less { - template - static

[GitHub] [arrow] wesm commented on a change in pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-17 Thread GitBox
wesm commented on a change in pull request #7461: URL: https://github.com/apache/arrow/pull/7461#discussion_r441605115 ## File path: cpp/src/arrow/type.h ## @@ -1251,6 +1256,7 @@ class ARROW_EXPORT TimestampType : public TemporalType, public ParametricType { static

[GitHub] [arrow] wesm commented on a change in pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-17 Thread GitBox
wesm commented on a change in pull request #7461: URL: https://github.com/apache/arrow/pull/7461#discussion_r441604605 ## File path: cpp/src/arrow/scalar.h ## @@ -237,19 +246,17 @@ struct ARROW_EXPORT FixedSizeBinaryScalar : public BinaryScalar { explicit

[GitHub] [arrow] itamarst commented on pull request #7169: ARROW-5359: [Python] Support non-nanosecond out-of-range timestamps in conversion to pandas

2020-06-17 Thread GitBox
itamarst commented on pull request #7169: URL: https://github.com/apache/arrow/pull/7169#issuecomment-645414175 Great, ready to merge then?

[GitHub] [arrow] bkietz commented on a change in pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-17 Thread GitBox
bkietz commented on a change in pull request #7461: URL: https://github.com/apache/arrow/pull/7461#discussion_r441584777 ## File path: cpp/src/arrow/compute/kernels/codegen_internal.h ## @@ -807,11 +885,30 @@ ArrayKernelExec SignedInteger(detail::GetTypeId get_id) { } }

[GitHub] [arrow] itamarst commented on a change in pull request #7169: ARROW-5359: [Python] Support non-nanosecond out-of-range timestamps in conversion to pandas

2020-06-17 Thread GitBox
itamarst commented on a change in pull request #7169: URL: https://github.com/apache/arrow/pull/7169#discussion_r441594967 ## File path: cpp/src/arrow/python/arrow_to_pandas.cc ## @@ -1688,8 +1698,12 @@ static Status GetPandasWriterType(const ChunkedArray& data, const

[GitHub] [arrow] wesm commented on pull request #7449: ARROW-9133: [C++] Add utf8_upper and utf8_lower

2020-06-17 Thread GitBox
wesm commented on pull request #7449: URL: https://github.com/apache/arrow/pull/7449#issuecomment-645413587 I also agree with inlining the utf8proc functions until utf8proc can be patched to have better performance. I doubt that these optimizations will meaningfully impact the

[GitHub] [arrow] wesm commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-17 Thread GitBox
wesm commented on a change in pull request #7410: URL: https://github.com/apache/arrow/pull/7410#discussion_r441588468 ## File path: cpp/src/arrow/compute/kernels/scalar_validity.cc ## @@ -0,0 +1,107 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more

[GitHub] [arrow] bkietz commented on a change in pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-17 Thread GitBox
bkietz commented on a change in pull request #7461: URL: https://github.com/apache/arrow/pull/7461#discussion_r441579699 ## File path: cpp/src/arrow/compute/kernels/codegen_internal.h ## @@ -787,6 +830,41 @@ ArrayKernelExec Integer(detail::GetTypeId get_id) { } }

[GitHub] [arrow] xhochy commented on pull request #7449: ARROW-9133: [C++] Add utf8_upper and utf8_lower

2020-06-17 Thread GitBox
xhochy commented on pull request #7449: URL: https://github.com/apache/arrow/pull/7449#issuecomment-645401816 > Would a lookup table in the order of 256kb (generated at runtime, not in the binary) per case mapping be acceptable for Arrow? I would find that acceptable if the mapping
