[GitHub] [arrow] github-actions[bot] commented on pull request #7465: PARQUET-1877: [C++] Reconcile thrift limits

2020-06-16 Thread GitBox
github-actions[bot] commented on pull request #7465: URL: https://github.com/apache/arrow/pull/7465#issuecomment-645152229 https://issues.apache.org/jira/browse/PARQUET-1877 This is an automated message from the Apache Git

[GitHub] [arrow] emkornfield edited a comment on pull request #7465: PARQUET-1877: [C++] Reconcile thrift limits

2020-06-16 Thread GitBox
emkornfield edited a comment on pull request #7465: URL: https://github.com/apache/arrow/pull/7465#issuecomment-645150494 CC @wesm @pitrou I would assume 1MM elements is still sufficient for any reasonable parquet file, but let me know if you think differently.

[GitHub] [arrow] emkornfield commented on pull request #7465: PARQUET-1877: [C++] Reconcile thrift limits

2020-06-16 Thread GitBox
emkornfield commented on pull request #7465: URL: https://github.com/apache/arrow/pull/7465#issuecomment-645150494 CC @wesm @pitrou

[GitHub] [arrow] emkornfield opened a new pull request #7465: PARQUET-1877: [C++] Reconcile thrift limits

2020-06-16 Thread GitBox
emkornfield opened a new pull request #7465: URL: https://github.com/apache/arrow/pull/7465 Sets container size limit to have an upper bound memory footprint at the same order of magnitude as string size limits. This is an

[GitHub] [arrow] github-actions[bot] commented on pull request #7464: ARROW-9157: [Rust][Datafusion] create_physical_plan should take self as immutable reference

2020-06-16 Thread GitBox
github-actions[bot] commented on pull request #7464: URL: https://github.com/apache/arrow/pull/7464#issuecomment-645143002 https://issues.apache.org/jira/browse/ARROW-9157

[GitHub] [arrow] dhirschfeld commented on pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-16 Thread GitBox
dhirschfeld commented on pull request #7461: URL: https://github.com/apache/arrow/pull/7461#issuecomment-645141127 > Not to beat a dead horse about ARROW-9155 The bot is fine - I guess it links to whatever JIRA is listed in the title. That doesn't help if someone mentions a JIRA in

[GitHub] [arrow] houqp opened a new pull request #7464: ARROW-9157: [Rust][Datafusion] create_physical_plan should take self as immutable reference

2020-06-16 Thread GitBox
houqp opened a new pull request #7464: URL: https://github.com/apache/arrow/pull/7464 Since it's not mutating self, mutable reference is not necessary.

[GitHub] [arrow] ursabot commented on pull request #7314: ARROW-8996: [C++] runtime support for aggregate sum dense kernel

2020-06-16 Thread GitBox
ursabot commented on pull request #7314: URL: https://github.com/apache/arrow/pull/7314#issuecomment-645135876 [AMD64 Ubuntu 18.04 C++ Benchmark (#112762)](https://ci.ursalabs.org/#builders/73/builds/79) builder has been succeeded. Revision: 525caea882fe49c0248932fff77df6bcd3f2f477

[GitHub] [arrow] jianxind commented on pull request #7314: ARROW-8996: [C++] runtime support for aggregate sum dense kernel

2020-06-16 Thread GitBox
jianxind commented on pull request #7314: URL: https://github.com/apache/arrow/pull/7314#issuecomment-645130206 @ursabot benchmark --suite-filter=arrow-compute-aggregate-benchmark

[GitHub] [arrow] wesm commented on pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-16 Thread GitBox
wesm commented on pull request #7461: URL: https://github.com/apache/arrow/pull/7461#issuecomment-645129920 Well my theory about greater/less didn't hold. The other relevant change was moving things into the anonymous namespace. It's possible that anonymous namespaces impact inlining

[GitHub] [arrow] ursabot commented on pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-16 Thread GitBox
ursabot commented on pull request #7461: URL: https://github.com/apache/arrow/pull/7461#issuecomment-645120596 [AMD64 Ubuntu 18.04 C++ Benchmark (#112729)](https://ci.ursalabs.org/#builders/73/builds/78) builder has been succeeded. Revision: 74caaae25e3bd95d57f3f6d9b835c2610639ab41

[GitHub] [arrow] github-actions[bot] commented on pull request #7463: ARROW-9145: [C++] Implement BooleanArray::true_count and false_count, add Python bindings

2020-06-16 Thread GitBox
github-actions[bot] commented on pull request #7463: URL: https://github.com/apache/arrow/pull/7463#issuecomment-645116913 https://issues.apache.org/jira/browse/ARROW-9145

[GitHub] [arrow] wesm commented on pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-16 Thread GitBox
wesm commented on pull request #7461: URL: https://github.com/apache/arrow/pull/7461#issuecomment-645116862 Well we have these bot comments, is it not sufficient? https://github.com/apache/arrow/pull/7461#issuecomment-645086851

[GitHub] [arrow] wesm commented on pull request #7462: ARROW-7068: [C++] Add ListArray::offsets and LargeListArray::offsets returning boxed version of offsets as Int32Array/Int64Array

2020-06-16 Thread GitBox
wesm commented on pull request #7462: URL: https://github.com/apache/arrow/pull/7462#issuecomment-645116463 I'm a bit stumped on the MinGW failure ``` [ 64%] Linking CXX executable ../../release/arrow-array-test.exe ```

[GitHub] [arrow] dhirschfeld commented on pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-16 Thread GitBox
dhirschfeld commented on pull request #7461: URL: https://github.com/apache/arrow/pull/7461#issuecomment-645115775 To help follow along it would be handy if references to JIRA could be autolinked:

[GitHub] [arrow] wesm commented on pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-16 Thread GitBox
wesm commented on pull request #7461: URL: https://github.com/apache/arrow/pull/7461#issuecomment-645113493 Not to beat a dead horse about ARROW-9155, but the turnaround time for simple benchmarks isn't great

[GitHub] [arrow] wesm commented on pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-16 Thread GitBox
wesm commented on pull request #7461: URL: https://github.com/apache/arrow/pull/7461#issuecomment-645113285 @ursabot benchmark --benchmark-filter=Greater 18e559b

[GitHub] [arrow] wesm commented on pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-16 Thread GitBox
wesm commented on pull request #7461: URL: https://github.com/apache/arrow/pull/7461#issuecomment-645111941 Ah! It's because Greater is not implemented using Less. Let me switch things around

[GitHub] [arrow] wesm edited a comment on pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-16 Thread GitBox
wesm edited a comment on pull request #7461: URL: https://github.com/apache/arrow/pull/7461#issuecomment-645111941 Ah! It's because Greater is now implemented using Less. Let me switch things around so things are based on Greater/GreaterEqual instead

[GitHub] [arrow] ursabot commented on pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-16 Thread GitBox
ursabot commented on pull request #7461: URL: https://github.com/apache/arrow/pull/7461#issuecomment-645111207 [AMD64 Ubuntu 18.04 C++ Benchmark (#112703)](https://ci.ursalabs.org/#builders/73/builds/77) builder has been succeeded. Revision: 301ffa539e634f2c464ca072cd5c543f1407f1f7

[GitHub] [arrow] wesm commented on pull request #7463: ARROW-9145: [C++] Implement BooleanArray::true_count and false_count, add Python bindings

2020-06-16 Thread GitBox
wesm commented on pull request #7463: URL: https://github.com/apache/arrow/pull/7463#issuecomment-645110679 FWIW `BooleanArray::true_count()` should be what's used for the `sum(boolean)` kernel

[GitHub] [arrow] wesm opened a new pull request #7463: ARROW-9145: [C++] Implement BooleanArray::true_count and false_count, add Python bindings

2020-06-16 Thread GitBox
wesm opened a new pull request #7463: URL: https://github.com/apache/arrow/pull/7463 This seemed like a reasonable place to put this, and it seems like it may come in handy.

[GitHub] [arrow] wesm commented on a change in pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-16 Thread GitBox
wesm commented on a change in pull request #7461: URL: https://github.com/apache/arrow/pull/7461#discussion_r441240680 ## File path: cpp/src/arrow/compute/kernels/codegen_internal.h ## @@ -121,18 +123,34 @@ struct ArrayIterator> { template struct ArrayIterator> { -

[GitHub] [arrow] wesm commented on pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-16 Thread GitBox
wesm commented on pull request #7461: URL: https://github.com/apache/arrow/pull/7461#issuecomment-645104527 @ursabot benchmark --benchmark-filter=Greater 18e559b

[GitHub] [arrow] liyafan82 commented on pull request #7287: ARROW-8771: [C++] Add boost/process library to build support

2020-06-16 Thread GitBox
liyafan82 commented on pull request #7287: URL: https://github.com/apache/arrow/pull/7287#issuecomment-645103962 > @liyafan82 I rebuilt the boost bundle and uploaded to bintray. Can you re-run whichever tests you have that failed because of this before and see if they work now? If they're

[GitHub] [arrow] wesm commented on pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-16 Thread GitBox
wesm commented on pull request #7461: URL: https://github.com/apache/arrow/pull/7461#issuecomment-645102793 There weren't any C++ unit tests for comparisons of primitive types so I addressed that, and also added comparisons for Time and Duration types (which on account of this patch are

[GitHub] [arrow] github-actions[bot] commented on pull request #7462: ARROW-7068: [C++] Add ListArray::offsets and LargeListArray::offsets returning boxed version of offsets as Int32Array/Int64Array

2020-06-16 Thread GitBox
github-actions[bot] commented on pull request #7462: URL: https://github.com/apache/arrow/pull/7462#issuecomment-645098523 https://issues.apache.org/jira/browse/ARROW-7068

[GitHub] [arrow] wesm opened a new pull request #7462: ARROW-7068: [C++] Add ListArray::offsets and LargeListArray::offsets returning boxed version of offsets as Int32Array/Int64Array

2020-06-16 Thread GitBox
wesm opened a new pull request #7462: URL: https://github.com/apache/arrow/pull/7462

[GitHub] [arrow] ursabot commented on pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-16 Thread GitBox
ursabot commented on pull request #7461: URL: https://github.com/apache/arrow/pull/7461#issuecomment-645087761 [AMD64 Ubuntu 18.04 C++ Benchmark (#112653)](https://ci.ursalabs.org/#builders/73/builds/76) builder has been succeeded. Revision: 53671af32c338fcca1edc40732c4c5fd1ad7585e

[GitHub] [arrow] github-actions[bot] commented on pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-16 Thread GitBox
github-actions[bot] commented on pull request #7461: URL: https://github.com/apache/arrow/pull/7461#issuecomment-645086851 https://issues.apache.org/jira/browse/ARROW-8969

[GitHub] [arrow] kou closed pull request #7430: ARROW-9126: [C++] Fix building trimmed Boost bundle on Windows

2020-06-16 Thread GitBox
kou closed pull request #7430: URL: https://github.com/apache/arrow/pull/7430

[GitHub] [arrow] kou commented on pull request #7430: ARROW-9126: [C++] Fix building trimmed Boost bundle on Windows

2020-06-16 Thread GitBox
kou commented on pull request #7430: URL: https://github.com/apache/arrow/pull/7430#issuecomment-645084134 > Done. Thanks! Now, I'm a member of https://bintray.com/ursalabs . (Please ignore my meaningless join request...) > We could I guess, though I don't think I have

[GitHub] [arrow] wesm commented on pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-16 Thread GitBox
wesm commented on pull request #7461: URL: https://github.com/apache/arrow/pull/7461#issuecomment-645083255 @ursabot benchmark --benchmark-filter=Greater 18e559b

[GitHub] [arrow] wesm commented on pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-16 Thread GitBox
wesm commented on pull request #7461: URL: https://github.com/apache/arrow/pull/7461#issuecomment-645083142 @ursabot benchmark --benchmark_filter=Greater 18e559b

[GitHub] [arrow] ursabot commented on pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-16 Thread GitBox
ursabot commented on pull request #7461: URL: https://github.com/apache/arrow/pull/7461#issuecomment-645083146 ``` no such option: --benchmark_filter ```

[GitHub] [arrow] wesm opened a new pull request #7461: ARROW-8969: [C++] Reduce binary size of kernels/scalar_compare.cc.o by reusing more kernels between types, operators

2020-06-16 Thread GitBox
wesm opened a new pull request #7461: URL: https://github.com/apache/arrow/pull/7461 With clang-8 on Linux this unit is now 654KB down from 1257KB. A few strategies: * Use same binary code for Less/Greater and LessEqual/GreaterEqual with arguments flipped * Reuse kernels
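The first strategy in the message above (one compiled loop serving both Less and Greater, with the arguments flipped) can be sketched roughly as follows. This is only an illustration of the idea, not the PR's code: `GreaterKernel` and `LessKernel` are hypothetical names, and plain `std::vector` stands in for Arrow arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a compiled comparison kernel. This is the only
// element-wise loop that exists for this type.
std::vector<bool> GreaterKernel(const std::vector<int64_t>& left,
                                const std::vector<int64_t>& right) {
  assert(left.size() == right.size());
  std::vector<bool> out(left.size());
  for (std::size_t i = 0; i < left.size(); ++i) {
    out[i] = left[i] > right[i];
  }
  return out;
}

// Less(a, b) is the same as Greater(b, a), so Less just swaps the inputs
// and reuses the Greater loop instead of compiling a second one.
std::vector<bool> LessKernel(const std::vector<int64_t>& left,
                             const std::vector<int64_t>& right) {
  return GreaterKernel(right, left);
}
```

The same swap would presumably apply to LessEqual and GreaterEqual; the saving comes from dispatching to an existing loop rather than instantiating a new one per operator.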

[GitHub] [arrow] github-actions[bot] commented on pull request #7460: ARROW-9154: [Developer] Use GitHub issue templates better

2020-06-16 Thread GitBox
github-actions[bot] commented on pull request #7460: URL: https://github.com/apache/arrow/pull/7460#issuecomment-645058491 https://issues.apache.org/jira/browse/ARROW-9154

[GitHub] [arrow] nealrichardson opened a new pull request #7460: ARROW-9154: [Developer] Use GitHub issue templates better

2020-06-16 Thread GitBox
nealrichardson opened a new pull request #7460: URL: https://github.com/apache/arrow/pull/7460 To check it out on my fork, go to https://github.com/nealrichardson/arrow/issues and click New Issue

[GitHub] [arrow] github-actions[bot] commented on pull request #7459: ARROW-6800: [C++] Support building libraries targeting C++14 or higher, disable GNU CXX extensions

2020-06-16 Thread GitBox
github-actions[bot] commented on pull request #7459: URL: https://github.com/apache/arrow/pull/7459#issuecomment-645049497 https://issues.apache.org/jira/browse/ARROW-6800

[GitHub] [arrow] wesm commented on pull request #7459: ARROW-6800: [C++] Support building libraries targeting C++14 or higher, disable GNU CXX extensions

2020-06-16 Thread GitBox
wesm commented on pull request #7459: URL: https://github.com/apache/arrow/pull/7459#issuecomment-645049112 OK I guess I'm a bit out of my depth on the CMAKE_CXX_EXTENSIONS issue. If anyone has ideas about what to do let me know

[GitHub] [arrow] wesm commented on pull request #7459: ARROW-6800: [C++] Support building libraries targeting C++14 or higher, disable GNU CXX extensions

2020-06-16 Thread GitBox
wesm commented on pull request #7459: URL: https://github.com/apache/arrow/pull/7459#issuecomment-645047309 Well this is weird ``` CMake Error in src/plasma/CMakeLists.txt: Target "plasma-external-store-tests" requires the language dialect "CXXOFF" , but CMake does not ```

[GitHub] [arrow] wesm commented on pull request #7459: ARROW-6800: [C++] Support building libraries targeting C++14 or higher, disable GNU CXX extensions

2020-06-16 Thread GitBox
wesm commented on pull request #7459: URL: https://github.com/apache/arrow/pull/7459#issuecomment-645047223 @kou @brills @kszucs could use your advice on possible concerns regarding the -std=gnu++11 to -std=c++11 change when using gcc

[GitHub] [arrow] wesm opened a new pull request #7459: ARROW-6800: [C++] Support building libraries targeting C++14 or higher, disable GNU CXX extensions

2020-06-16 Thread GitBox
wesm opened a new pull request #7459: URL: https://github.com/apache/arrow/pull/7459 This seemed pretty simple. The C++ standard targeted by default is still C++11 but some users might want to target a higher standard (e.g. to get C++17 `std::string_view`). I also disabled use of

[GitHub] [arrow] nealrichardson closed pull request #7455: ARROW-9031: [R] Implement conversion from Type::UINT64 to R vector

2020-06-16 Thread GitBox
nealrichardson closed pull request #7455: URL: https://github.com/apache/arrow/pull/7455

[GitHub] [arrow] nealrichardson commented on pull request #7455: ARROW-9031: [R] Implement conversion from Type::UINT64 to R vector

2020-06-16 Thread GitBox
nealrichardson commented on pull request #7455: URL: https://github.com/apache/arrow/pull/7455#issuecomment-645045336 Deferring to https://jira.apache.org/jira/browse/ARROW-9083 questions of whether there are better ways to convert. This at least no longer errors and gives something

[GitHub] [arrow] nealrichardson edited a comment on pull request #7287: ARROW-8771: [C++] Add boost/process library to build support

2020-06-16 Thread GitBox
nealrichardson edited a comment on pull request #7287: URL: https://github.com/apache/arrow/pull/7287#issuecomment-645042129 @liyafan82 I rebuilt the boost bundle and uploaded to bintray. Can you re-run whichever tests you have that failed because of this before and see if they work now?

[GitHub] [arrow] nealrichardson commented on a change in pull request #7287: ARROW-8771: [C++] Add boost/process library to build support

2020-06-16 Thread GitBox
nealrichardson commented on a change in pull request #7287: URL: https://github.com/apache/arrow/pull/7287#discussion_r441176530 ## File path: cpp/build-support/trim-boost.sh ## @@ -32,12 +32,12 @@ set -eu :

[GitHub] [arrow] nealrichardson commented on pull request #7287: ARROW-8771: [C++] Add boost/process library to build support

2020-06-16 Thread GitBox
nealrichardson commented on pull request #7287: URL: https://github.com/apache/arrow/pull/7287#issuecomment-645042129 @liyafan82 I rebuilt the boost bundle and uploaded to bintray. Can you re-run whichever tests you have that failed because of this before and see if they work now?

[GitHub] [arrow] kou commented on a change in pull request #7436: ARROW-9094: [Python] Bump versions of compiled dependencies in manylinux wheels

2020-06-16 Thread GitBox
kou commented on a change in pull request #7436: URL: https://github.com/apache/arrow/pull/7436#discussion_r441174856 ## File path: python/manylinux1/scripts/build_boost.sh ## @@ -16,12 +16,12 @@ # specific language governing permissions and limitations # under the License.

[GitHub] [arrow] nealrichardson commented on pull request #7430: ARROW-9126: [C++] Fix building trimmed Boost bundle on Windows

2020-06-16 Thread GitBox
nealrichardson commented on pull request #7430: URL: https://github.com/apache/arrow/pull/7430#issuecomment-645041382 Uploaded and re-triggered the failing build for confirmation that it's working now.

[GitHub] [arrow] nealrichardson commented on pull request #7430: ARROW-9126: [C++] Fix building trimmed Boost bundle on Windows

2020-06-16 Thread GitBox
nealrichardson commented on pull request #7430: URL: https://github.com/apache/arrow/pull/7430#issuecomment-645041249 @github-actions crossbow submit test-r-rstudio-r-base-3.6-bionic

[GitHub] [arrow] github-actions[bot] commented on pull request #7458: ARROW-9122: [C++] Properly handle sliced arrays in ascii_lower, ascii_upper kernels

2020-06-16 Thread GitBox
github-actions[bot] commented on pull request #7458: URL: https://github.com/apache/arrow/pull/7458#issuecomment-645039862 https://issues.apache.org/jira/browse/ARROW-9122

[GitHub] [arrow] wesm closed pull request #7445: ARROW-8583: [C++][Doc] Undocumented parameter in Dataset namespace

2020-06-16 Thread GitBox
wesm closed pull request #7445: URL: https://github.com/apache/arrow/pull/7445

[GitHub] [arrow] wesm opened a new pull request #7458: ARROW-9122: [C++] Properly handle sliced arrays in ascii_lower, ascii_upper kernels

2020-06-16 Thread GitBox
wesm opened a new pull request #7458: URL: https://github.com/apache/arrow/pull/7458 We must both take into account the referenced range of the data buffer and shift the value offsets if the array slice offset is nonzero.
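The two adjustments described above can be sketched on a simplified string layout: an offsets vector with one more entry than there are values, plus a single contiguous data buffer. This is an assumption-laden illustration rather than the PR's implementation; `LoweredStrings` and `AsciiLowerSlice` are hypothetical names, and `std::vector`/`std::string` stand in for Arrow buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical output container: offsets (num_values + 1 entries) and data.
struct LoweredStrings {
  std::vector<int32_t> offsets;
  std::string data;
};

// Lowercases the values [slice_offset, slice_offset + slice_length) of a
// string column described by `offsets` and `data`.
LoweredStrings AsciiLowerSlice(const std::vector<int32_t>& offsets,
                               const std::string& data,
                               std::size_t slice_offset,
                               std::size_t slice_length) {
  assert(slice_offset + slice_length < offsets.size());
  // Only this range of the data buffer is referenced by the slice.
  const int32_t first = offsets[slice_offset];
  const int32_t last = offsets[slice_offset + slice_length];

  LoweredStrings out;
  out.data.reserve(static_cast<std::size_t>(last - first));
  for (int32_t pos = first; pos < last; ++pos) {
    const char c = data[static_cast<std::size_t>(pos)];
    out.data.push_back(c >= 'A' && c <= 'Z' ? static_cast<char>(c - 'A' + 'a')
                                            : c);
  }

  // Shift the value offsets so they index into the new, shorter buffer.
  out.offsets.reserve(slice_length + 1);
  for (std::size_t i = 0; i <= slice_length; ++i) {
    out.offsets.push_back(offsets[slice_offset + i] - first);
  }
  return out;
}
```

With a zero slice offset the shift is a no-op, which is consistent with the bug only showing up for sliced arrays.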

[GitHub] [arrow] wesm commented on pull request #7315: ARROW-7605: [C++] Bundle jemalloc into static libarrow

2020-06-16 Thread GitBox
wesm commented on pull request #7315: URL: https://github.com/apache/arrow/pull/7315#issuecomment-645038858 Will this approach work on Windows? This doesn't look like it's going to work for all our dependencies that might be built in bundled mode. I still think the static lib splicing

[GitHub] [arrow] wesm edited a comment on pull request #7315: ARROW-7605: [C++] Bundle jemalloc into static libarrow

2020-06-16 Thread GitBox
wesm edited a comment on pull request #7315: URL: https://github.com/apache/arrow/pull/7315#issuecomment-645038858 Will this approach work on Windows? This doesn't look like it's going to work for all our dependencies that might be built in bundled mode. I still think the static lib

[GitHub] [arrow] wesm closed pull request #7435: ARROW-8779: [R] Implement conversion to List

2020-06-16 Thread GitBox
wesm closed pull request #7435: URL: https://github.com/apache/arrow/pull/7435

[GitHub] [arrow] wesm commented on pull request #7435: ARROW-8779: [R] Implement conversion to List

2020-06-16 Thread GitBox
wesm commented on pull request #7435: URL: https://github.com/apache/arrow/pull/7435#issuecomment-645037889 thanks @romainfrancois!

[GitHub] [arrow] wesm commented on a change in pull request #7435: ARROW-8779: [R] Implement conversion to List

2020-06-16 Thread GitBox
wesm commented on a change in pull request #7435: URL: https://github.com/apache/arrow/pull/7435#discussion_r441170772 ## File path: r/src/array_from_vector.cpp ## @@ -201,6 +202,67 @@ struct VectorToArrayConverter { return Status::OK(); } + template +

[GitHub] [arrow] wesm closed pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-16 Thread GitBox
wesm closed pull request #7410: URL: https://github.com/apache/arrow/pull/7410

[GitHub] [arrow] wesm commented on pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-16 Thread GitBox
wesm commented on pull request #7410: URL: https://github.com/apache/arrow/pull/7410#issuecomment-645034827 Merging

[GitHub] [arrow] kszucs commented on pull request #6512: ARROW-8430: [CI] Configure self-hosted runners for Github Actions

2020-06-16 Thread GitBox
kszucs commented on pull request #6512: URL: https://github.com/apache/arrow/pull/6512#issuecomment-645027704 Tried but it didn’t work. I can further investigate it, but passing “-j” fixed the resource issues for now.

[GitHub] [arrow] kou commented on a change in pull request #6512: ARROW-8430: [CI] Configure self-hosted runners for Github Actions

2020-06-16 Thread GitBox
kou commented on a change in pull request #6512: URL: https://github.com/apache/arrow/pull/6512#discussion_r441156336 ## File path: ci/docker/ubuntu-14.04-cpp.dockerfile ## @@ -68,7 +68,7 @@ ENV ARROW_BUILD_TESTS=ON \ ARROW_WITH_BROTLI=ON \ ARROW_WITH_BZ2=ON \

[GitHub] [arrow] wesm commented on pull request #7449: ARROW-9133: [C++] Add utf8_upper and utf8_lower

2020-06-16 Thread GitBox
wesm commented on pull request #7449: URL: https://github.com/apache/arrow/pull/7449#issuecomment-645023981 I went ahead and asked https://github.com/ufal/unilib/issues/2

[GitHub] [arrow] nealrichardson commented on pull request #7430: ARROW-9126: [C++] Fix building trimmed Boost bundle on Windows

2020-06-16 Thread GitBox
nealrichardson commented on pull request #7430: URL: https://github.com/apache/arrow/pull/7430#issuecomment-645021672 > OK. Could you update the bundle and then re-run the build? Yes, will do > And could you add me ( https://bintray.com/kou ) to https://bintray.com/ursalabs

[GitHub] [arrow] wesm commented on a change in pull request #7449: ARROW-9133: [C++] Add utf8_upper and utf8_lower

2020-06-16 Thread GitBox
wesm commented on a change in pull request #7449: URL: https://github.com/apache/arrow/pull/7449#discussion_r441146931 ## File path: cpp/src/arrow/compute/kernels/scalar_string.cc ## @@ -30,6 +31,21 @@ namespace internal { namespace { +// Code units in the range [a-z] can

[GitHub] [arrow] kou commented on pull request #6729: ARROW-8229: [Java] Move ArrowBuf into the Arrow package

2020-06-16 Thread GitBox
kou commented on pull request #6729: URL: https://github.com/apache/arrow/pull/6729#issuecomment-645016478 Thanks!

[GitHub] [arrow] kou commented on pull request #7430: ARROW-9126: [C++] Fix building trimmed Boost bundle on Windows

2020-06-16 Thread GitBox
kou commented on pull request #7430: URL: https://github.com/apache/arrow/pull/7430#issuecomment-645015432 OK. Could you update the bundle and then re-run the build? And could you add me ( https://bintray.com/kou ) to https://bintray.com/ursalabs 's members? I may want to upload

[GitHub] [arrow] wesm commented on pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-16 Thread GitBox
wesm commented on pull request #7410: URL: https://github.com/apache/arrow/pull/7410#issuecomment-645011421 @bkietz if you want to disable the ascii_* tests that are failing please go ahead

[GitHub] [arrow] wesm commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-16 Thread GitBox
wesm commented on a change in pull request #7410: URL: https://github.com/apache/arrow/pull/7410#discussion_r441139760 ## File path: cpp/src/arrow/compute/kernels/scalar_validity.cc ## @@ -0,0 +1,107 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more

[GitHub] [arrow] wesm edited a comment on pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-16 Thread GitBox
wesm edited a comment on pull request #7442: URL: https://github.com/apache/arrow/pull/7442#issuecomment-645004792 I'll have to deal with the string optimization in a follow up PR, so I'm going to leave this for review as is. It would be good to get this merged sooner rather than later.

[GitHub] [arrow] wesm commented on pull request #7442: ARROW-9075: [C++] Optimized Filter implementation: faster performance + compilation, smaller code size

2020-06-16 Thread GitBox
wesm commented on pull request #7442: URL: https://github.com/apache/arrow/pull/7442#issuecomment-645004792 I'll have to deal with the string optimization in a follow up PR, so I'm going to leave this for review as is. It would be good to get this merged sooner rather than later

[GitHub] [arrow] nealrichardson closed pull request #7441: ARROW-3446: [R] Document mapping of Arrow <-> R types

2020-06-16 Thread GitBox
nealrichardson closed pull request #7441: URL: https://github.com/apache/arrow/pull/7441

[GitHub] [arrow] nealrichardson closed pull request #7457: ARROW-9151: [R][CI] Fix Rtools 4.0 build: pacman sync

2020-06-16 Thread GitBox
nealrichardson closed pull request #7457: URL: https://github.com/apache/arrow/pull/7457

[GitHub] [arrow] bkietz commented on pull request #7156: ARROW-8074: [C++][Dataset][Python] FileFragments from buffers and NativeFiles

2020-06-16 Thread GitBox
bkietz commented on pull request #7156: URL: https://github.com/apache/arrow/pull/7156#issuecomment-644973548 Okay, I'll start trimming

[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-16 Thread GitBox
bkietz commented on a change in pull request #7410: URL: https://github.com/apache/arrow/pull/7410#discussion_r441096806 ## File path: cpp/src/arrow/compute/kernels/scalar_validity.cc ## @@ -0,0 +1,110 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or

[GitHub] [arrow] github-actions[bot] commented on pull request #7457: ARROW-9151: [R][CI] Fix Rtools 4.0 build: pacman sync

2020-06-16 Thread GitBox
github-actions[bot] commented on pull request #7457: URL: https://github.com/apache/arrow/pull/7457#issuecomment-644969431 https://issues.apache.org/jira/browse/ARROW-9151

[GitHub] [arrow] fsaintjacques commented on pull request #7156: ARROW-8074: [C++][Dataset][Python] FileFragments from buffers and NativeFiles

2020-06-16 Thread GitBox
fsaintjacques commented on pull request #7156: URL: https://github.com/apache/arrow/pull/7156#issuecomment-644969405 Yes, we can already create FileFragment from any [FileSource](https://github.com/apache/arrow/blob/master/cpp/src/arrow/dataset/file_base.h#L153-L157). You make a valid

[GitHub] [arrow] nealrichardson closed pull request #7453: ARROW-9141: [R] Update cross-package documentation links

2020-06-16 Thread GitBox
nealrichardson closed pull request #7453: URL: https://github.com/apache/arrow/pull/7453

[GitHub] [arrow] jorisvandenbossche commented on pull request #7156: ARROW-8074: [C++][Dataset][Python] FileFragments from buffers and NativeFiles

2020-06-16 Thread GitBox
jorisvandenbossche commented on pull request #7156: URL: https://github.com/apache/arrow/pull/7156#issuecomment-644966741 Note that it is not *only* for testing. We for sure use it for testing in pyarrow, but in pandas 1.0.4, we accidentally broke reading parquet files from file-like

[GitHub] [arrow] jorisvandenbossche edited a comment on pull request #7156: ARROW-8074: [C++][Dataset][Python] FileFragments from buffers and NativeFiles

2020-06-16 Thread GitBox
jorisvandenbossche edited a comment on pull request #7156: URL: https://github.com/apache/arrow/pull/7156#issuecomment-644966084 Taking a step back: wouldn't it be possible to, e.g., "just" allow creating a Fragment from a buffer instead of from a file? In practice, I think we only need

[GitHub] [arrow] nealrichardson closed pull request #7451: ARROW-8769: [C++][R] Add convenience accessor for StructScalar fields

2020-06-16 Thread GitBox
nealrichardson closed pull request #7451: URL: https://github.com/apache/arrow/pull/7451

[GitHub] [arrow] nealrichardson commented on pull request #7451: ARROW-8769: [C++][R] Add convenience accessor for StructScalar fields

2020-06-16 Thread GitBox
nealrichardson commented on pull request #7451: URL: https://github.com/apache/arrow/pull/7451#issuecomment-644966570 Rtools build failure is ARROW-9151. I'll merge.

[GitHub] [arrow] nealrichardson opened a new pull request #7457: ARROW-9151: [R][CI] Fix Rtools 4.0 build: pacman sync

2020-06-16 Thread GitBox
nealrichardson opened a new pull request #7457: URL: https://github.com/apache/arrow/pull/7457

[GitHub] [arrow] jorisvandenbossche commented on pull request #7156: ARROW-8074: [C++][Dataset][Python] FileFragments from buffers and NativeFiles

2020-06-16 Thread GitBox
jorisvandenbossche commented on pull request #7156: URL: https://github.com/apache/arrow/pull/7156#issuecomment-644966084 Taking a step back: wouldn't it be possible to, e.g., "just" allow creating a Fragment from a buffer instead of from a file? In practice, I think we only need to

[GitHub] [arrow] fsaintjacques commented on a change in pull request #7451: ARROW-8769: [C++][R] Add convenience accessor for StructScalar fields

2020-06-16 Thread GitBox
fsaintjacques commented on a change in pull request #7451: URL: https://github.com/apache/arrow/pull/7451#discussion_r441071413 ## File path: cpp/src/arrow/scalar_test.cc ## @@ -433,4 +433,24 @@ TYPED_TEST(TestNumericScalar, Cast) { } } +TEST(TestStructScalar,

[GitHub] [arrow] nealrichardson commented on a change in pull request #7441: ARROW-3446: [R] Document mapping of Arrow <-> R types

2020-06-16 Thread GitBox
nealrichardson commented on a change in pull request #7441: URL: https://github.com/apache/arrow/pull/7441#discussion_r441079053 ## File path: r/src/array_to_vector.cpp ## @@ -418,6 +418,7 @@ class Converter_Struct : public Converter { std::vector> converters; }; +//

[GitHub] [arrow] fsaintjacques commented on a change in pull request #7441: ARROW-3446: [R] Document mapping of Arrow <-> R types

2020-06-16 Thread GitBox
fsaintjacques commented on a change in pull request #7441: URL: https://github.com/apache/arrow/pull/7441#discussion_r440846641 ## File path: r/vignettes/arrow.Rmd ## @@ -86,7 +88,73 @@ to other applications and services that use Arrow. One example is Spark: the move data to

[GitHub] [arrow] nealrichardson commented on a change in pull request #7441: ARROW-3446: [R] Document mapping of Arrow <-> R types

2020-06-16 Thread GitBox
nealrichardson commented on a change in pull request #7441: URL: https://github.com/apache/arrow/pull/7441#discussion_r441074458 ## File path: r/vignettes/arrow.Rmd ## @@ -86,7 +88,73 @@ to other applications and services that use Arrow. One example is Spark: the move data

[GitHub] [arrow] maartenbreddels commented on pull request #7452: ARROW-8961: [C++] Add utf8proc library to toolchain

2020-06-16 Thread GitBox
maartenbreddels commented on pull request #7452: URL: https://github.com/apache/arrow/pull/7452#issuecomment-644951837 @xhochy do you want me to rebase https://github.com/apache/arrow/pull/7449 on this? So we can see if it's all working?

[GitHub] [arrow] maartenbreddels commented on pull request #7449: ARROW-9133: [C++] Add utf8_upper and utf8_lower

2020-06-16 Thread GitBox
maartenbreddels commented on pull request #7449: URL: https://github.com/apache/arrow/pull/7449#issuecomment-644949978 It's not *that* slow, it was 40% of Vaex' performance (single threaded), so I think there is a bit more to be gained still. But I have added an optimization that tries

[GitHub] [arrow] github-actions[bot] commented on pull request #7453: ARROW-9141: [R] Update cross-package documentation links

2020-06-16 Thread GitBox
github-actions[bot] commented on pull request #7453: URL: https://github.com/apache/arrow/pull/7453#issuecomment-644946345 Revision: bdaec94d2b1c88216a4b9e0395c98438b5cf5c5e Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] nealrichardson commented on pull request #7453: ARROW-9141: [R] Update cross-package documentation links

2020-06-16 Thread GitBox
nealrichardson commented on pull request #7453: URL: https://github.com/apache/arrow/pull/7453#issuecomment-644945052 @github-actions crossbow submit *as-cran*

[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-16 Thread GitBox
bkietz commented on a change in pull request #7410: URL: https://github.com/apache/arrow/pull/7410#discussion_r441019388 ## File path: cpp/src/arrow/compute/kernels/scalar_validity.cc ## @@ -0,0 +1,110 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or

[GitHub] [arrow] fsaintjacques edited a comment on pull request #7156: ARROW-8074: [C++][Dataset][Python] FileFragments from buffers and NativeFiles

2020-06-16 Thread GitBox
fsaintjacques edited a comment on pull request #7156: URL: https://github.com/apache/arrow/pull/7156#issuecomment-644939437 I feel 0.5 on this PR in general, the functionality it adds is initially for testing and it introduces debt. I'm not keen on the change on FileSystemFactory since

[GitHub] [arrow] nealrichardson commented on pull request #7455: ARROW-9031: [R] Implement conversion from Type::UINT64 to R vector

2020-06-16 Thread GitBox
nealrichardson commented on pull request #7455: URL: https://github.com/apache/arrow/pull/7455#issuecomment-644940220 > Why does `int64` use `Converter_Int64` but the other types are handled differently? Historical reasons I suppose. See also this discussion:

[GitHub] [arrow] fsaintjacques commented on pull request #7156: ARROW-8074: [C++][Dataset][Python] FileFragments from buffers and NativeFiles

2020-06-16 Thread GitBox
fsaintjacques commented on pull request #7156: URL: https://github.com/apache/arrow/pull/7156#issuecomment-644939437 I feel 0.5 on this PR in general, the functionality it adds is initially for testing and it introduces debt. I'm not keen on the change on FileSystemFactory since this

[GitHub] [arrow] rymurr commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

2020-06-16 Thread GitBox
rymurr commented on a change in pull request #7290: URL: https://github.com/apache/arrow/pull/7290#discussion_r441052320 ## File path: java/vector/src/main/codegen/templates/UnionVector.java ## @@ -325,12 +361,45 @@ private void allocateTypeBuffer() {

[GitHub] [arrow] rymurr commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

2020-06-16 Thread GitBox
rymurr commented on a change in pull request #7290: URL: https://github.com/apache/arrow/pull/7290#discussion_r441050322 ## File path: java/vector/src/main/codegen/templates/UnionVector.java ## @@ -586,7 +686,9 @@ public ValueVector getVectorByType(int typeId) { }

[GitHub] [arrow] rymurr commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

2020-06-16 Thread GitBox
rymurr commented on a change in pull request #7290: URL: https://github.com/apache/arrow/pull/7290#discussion_r441047163 ## File path: java/vector/src/main/codegen/templates/UnionVector.java ## @@ -493,6 +576,19 @@ public void splitAndTransfer(int startIndex, int length) {
