[GitHub] [arrow] kou closed pull request #7564: ARROW-9255: [C++] Use CMake to build bundled Protobuf with CMake >= 3.7

2020-06-27 Thread GitBox
kou closed pull request #7564: URL: https://github.com/apache/arrow/pull/7564 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [arrow] kou commented on pull request #7564: ARROW-9255: [C++] Use CMake to build bundled Protobuf with CMake >= 3.7

2020-06-27 Thread GitBox
kou commented on pull request #7564: URL: https://github.com/apache/arrow/pull/7564#issuecomment-650695347 +1

[GitHub] [arrow] tianchen92 commented on pull request #6156: ARROW-7539: [Java] FieldVector getFieldBuffers API should not set reader/writer indices

2020-06-27 Thread GitBox
tianchen92 commented on pull request #6156: URL: https://github.com/apache/arrow/pull/6156#issuecomment-650685609 > Does this impact IPC? seems not, IPC used getFieldBuffers which has the right buffer order, this PR is going to replace getFieldBuffers with getBuffers (getBuffers has

[GitHub] [arrow] liyafan82 commented on pull request #7347: ARROW-8230: [Java] Remove netty dependency from arrow-memory

2020-06-27 Thread GitBox
liyafan82 commented on pull request #7347: URL: https://github.com/apache/arrow/pull/7347#issuecomment-650685726 @rymurr Thanks for your effort. I will make another pass today.

[GitHub] [arrow] liyafan82 commented on a change in pull request #7347: ARROW-8230: [Java] Remove netty dependency from arrow-memory

2020-06-27 Thread GitBox
liyafan82 commented on a change in pull request #7347: URL: https://github.com/apache/arrow/pull/7347#discussion_r446595068 ## File path: java/memory/src/main/java/org/apache/arrow/memory/rounding/DefaultRoundingPolicy.java ## @@ -17,33 +17,107 @@ package

[GitHub] [arrow] liyafan82 commented on a change in pull request #7347: ARROW-8230: [Java] Remove netty dependency from arrow-memory

2020-06-27 Thread GitBox
liyafan82 commented on a change in pull request #7347: URL: https://github.com/apache/arrow/pull/7347#discussion_r446594755 ## File path: java/memory/src/main/java/org/apache/arrow/memory/ArrowBuf.java ## @@ -227,13 +207,28 @@ public ArrowBuf slice(long index, long length) {

[GitHub] [arrow] github-actions[bot] commented on pull request #7564: ARROW-9255: [C++] Use CMake to build bundled Protobuf with CMake >= 3.7

2020-06-27 Thread GitBox
github-actions[bot] commented on pull request #7564: URL: https://github.com/apache/arrow/pull/7564#issuecomment-650682223 Revision: a86f3649bdce9c5b2f58174488615725883b1f5b Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] kou commented on pull request #7564: ARROW-9255: [C++] Use CMake to build bundled Protobuf with CMake >= 3.7

2020-06-27 Thread GitBox
kou commented on pull request #7564: URL: https://github.com/apache/arrow/pull/7564#issuecomment-650681862 @github-actions crossbow submit -g linux -g wheel

[GitHub] [arrow] github-actions[bot] commented on pull request #7564: ARROW-9255: [C++] Use CMake to build bundled Protobuf with CMake >= 3.7

2020-06-27 Thread GitBox
github-actions[bot] commented on pull request #7564: URL: https://github.com/apache/arrow/pull/7564#issuecomment-650676307 https://issues.apache.org/jira/browse/ARROW-9255

[GitHub] [arrow] kou opened a new pull request #7564: ARROW-9255: [C++] Use CMake to build bundled Protobuf with CMake >= 3.7

2020-06-27 Thread GitBox
kou opened a new pull request #7564: URL: https://github.com/apache/arrow/pull/7564

[GitHub] [arrow] wesm commented on pull request #7315: ARROW-7605: [C++] Bundle jemalloc into static libarrow

2020-06-27 Thread GitBox
wesm commented on pull request #7315: URL: https://github.com/apache/arrow/pull/7315#issuecomment-650658746 I'm going to close this for now and attempt to pursue the static library splicing solution for 1.0.0

[GitHub] [arrow] wesm closed pull request #7315: ARROW-7605: [C++] Bundle jemalloc into static libarrow

2020-06-27 Thread GitBox
wesm closed pull request #7315: URL: https://github.com/apache/arrow/pull/7315

[GitHub] [arrow] github-actions[bot] commented on pull request #7563: ARROW-8888: [Python] Do not use thread pool when converting pandas columns that are definitely zero-copyable

2020-06-27 Thread GitBox
github-actions[bot] commented on pull request #7563: URL: https://github.com/apache/arrow/pull/7563#issuecomment-650656331 https://issues.apache.org/jira/browse/ARROW-

[GitHub] [arrow] wesm opened a new pull request #7563: ARROW-8888: [Python] Do not use thread pool when converting pandas columns that are definitely zero-copyable

2020-06-27 Thread GitBox
wesm opened a new pull request #7563: URL: https://github.com/apache/arrow/pull/7563 The ThreadPoolExecutor has a good amount of per-column overhead

[GitHub] [arrow] github-actions[bot] commented on pull request #7562: ARROW-7273: [Python][C++][Parquet] Do not permit constructing a non-nullable null field in Python, catch this case in Arrow->Parqu

2020-06-27 Thread GitBox
github-actions[bot] commented on pull request #7562: URL: https://github.com/apache/arrow/pull/7562#issuecomment-650652746 https://issues.apache.org/jira/browse/ARROW-7273

[GitHub] [arrow] wesm commented on pull request #7560: ARROW-9252: [Integration] Factor out IPC integration tests into script, add back 0.14.1 "gold" files

2020-06-27 Thread GitBox
wesm commented on pull request #7560: URL: https://github.com/apache/arrow/pull/7560#issuecomment-650650589 Looks like the int64 tests must be removed from the "gold" corpus as the JSON files cannot be parsed anymore

[GitHub] [arrow] wesm opened a new pull request #7562: ARROW-7273: [Python][C++][Parquet] Do not permit constructing a non-nullable null field in Python, catch this case in Arrow->Parquet schema conve

2020-06-27 Thread GitBox
wesm opened a new pull request #7562: URL: https://github.com/apache/arrow/pull/7562 This was the simplest triage I could think of.

[GitHub] [arrow] wesm commented on pull request #7556: ARROW-9188: [C++] Use Brotli shared libraries if they are available

2020-06-27 Thread GitBox
wesm commented on pull request #7556: URL: https://github.com/apache/arrow/pull/7556#issuecomment-650648681 @nealrichardson I figure this might impact the R packages also

[GitHub] [arrow] github-actions[bot] commented on pull request #7561: ARROW-9254: [C++] Split out CastNumberToNumberUnsafe function from scalar_cast_numeric, add data()/mutable_data() functions for ac

2020-06-27 Thread GitBox
github-actions[bot] commented on pull request #7561: URL: https://github.com/apache/arrow/pull/7561#issuecomment-650647251 https://issues.apache.org/jira/browse/ARROW-9254

[GitHub] [arrow] wesm opened a new pull request #7561: ARROW-9254: [C++] Split out CastNumberToNumberUnsafe function from scalar_cast_numeric, add data()/mutable_data() functions for accessing primiti

2020-06-27 Thread GitBox
wesm opened a new pull request #7561: URL: https://github.com/apache/arrow/pull/7561 This is some preparatory work for ARROW-9196. I also addressed some prior uncleanliness related to unboxing temporal scalars based on C types. By adding these `data()` and `mutable_data()` functions we

[GitHub] [arrow] github-actions[bot] commented on pull request #7560: ARROW-9252: [Integration] Factor out IPC integration tests into script, add back 0.14.1 "gold" files

2020-06-27 Thread GitBox
github-actions[bot] commented on pull request #7560: URL: https://github.com/apache/arrow/pull/7560#issuecomment-650643241 https://issues.apache.org/jira/browse/ARROW-9252

[GitHub] [arrow] wesm opened a new pull request #7560: ARROW-9252: [Integration] Factor out IPC integration tests into script, add back 0.14.1 "gold" files

2020-06-27 Thread GitBox
wesm opened a new pull request #7560: URL: https://github.com/apache/arrow/pull/7560

[GitHub] [arrow] github-actions[bot] commented on pull request #7559: ARROW-9247: [Python] Expose total_values_length functions on BinaryArray, LargeBinaryArray

2020-06-27 Thread GitBox
github-actions[bot] commented on pull request #7559: URL: https://github.com/apache/arrow/pull/7559#issuecomment-650639293 https://issues.apache.org/jira/browse/ARROW-9247

[GitHub] [arrow] wesm commented on pull request #7559: ARROW-9247: [Python] Expose total_values_length functions on BinaryArray, LargeBinaryArray

2020-06-27 Thread GitBox
wesm commented on pull request #7559: URL: https://github.com/apache/arrow/pull/7559#issuecomment-650638222 cc @brills

[GitHub] [arrow] wesm opened a new pull request #7559: ARROW-9247: [Python] Expose total_values_length functions on BinaryArray, LargeBinaryArray

2020-06-27 Thread GitBox
wesm opened a new pull request #7559: URL: https://github.com/apache/arrow/pull/7559

[GitHub] [arrow] github-actions[bot] commented on pull request #7558: ARROW-9250: [C++] Instantiate fewer templates in IsIn, Match kernel implementations

2020-06-27 Thread GitBox
github-actions[bot] commented on pull request #7558: URL: https://github.com/apache/arrow/pull/7558#issuecomment-650637397 https://issues.apache.org/jira/browse/ARROW-9250

[GitHub] [arrow] wesm commented on a change in pull request #7558: ARROW-9250: [C++] Instantiate fewer templates in IsIn, Match kernel implementations

2020-06-27 Thread GitBox
wesm commented on a change in pull request #7558: URL: https://github.com/apache/arrow/pull/7558#discussion_r446571732 ## File path: cpp/src/arrow/type.h ## @@ -900,7 +902,7 @@ class ARROW_EXPORT LargeStringType : public LargeBinaryType { public: static constexpr

[GitHub] [arrow] wesm opened a new pull request #7558: ARROW-9250: [C++] Instantiate fewer templates in IsIn, Match kernel implementations

2020-06-27 Thread GitBox
wesm opened a new pull request #7558: URL: https://github.com/apache/arrow/pull/7558 This yields a 150KB reduction in code for me on Linux. Since this may become a common pattern (using e.g. a single `uint32_t`-based function to process both int32/uint32), some of this may be

[GitHub] [arrow] github-actions[bot] commented on pull request #7557: ARROW-9251: [C++] Relocate integration testing JSON code implementation to src/arrow/testing

2020-06-27 Thread GitBox
github-actions[bot] commented on pull request #7557: URL: https://github.com/apache/arrow/pull/7557#issuecomment-650634874 https://issues.apache.org/jira/browse/ARROW-9251

[GitHub] [arrow] wesm commented on pull request #7556: ARROW-9188: [C++] Use Brotli shared libraries if they are available

2020-06-27 Thread GitBox
wesm commented on pull request #7556: URL: https://github.com/apache/arrow/pull/7556#issuecomment-650633720 It appears that the Brotli shared libraries are in the manylinux1 image even though `-DBUILD_SHARED_LIBS=OFF`

[GitHub] [arrow] wesm opened a new pull request #7557: ARROW-9251: [C++] Relocate integration testing JSON code implementation to src/arrow/testing

2020-06-27 Thread GitBox
wesm opened a new pull request #7557: URL: https://github.com/apache/arrow/pull/7557 While this code is not being shipped in any packages, I think it would be better for it to live in the testing directory so that its purpose is clear. I think there may be potentially some value in

[GitHub] [arrow] github-actions[bot] commented on pull request #7556: ARROW-9188: [C++] Use Brotli shared libraries if they are available

2020-06-27 Thread GitBox
github-actions[bot] commented on pull request #7556: URL: https://github.com/apache/arrow/pull/7556#issuecomment-650627708 Revision: f675cd913b83c56bdbbe24ecc074059dfb382fd0 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] wesm commented on pull request #7556: ARROW-9188: [C++] Use Brotli shared libraries if they are available

2020-06-27 Thread GitBox
wesm commented on pull request #7556: URL: https://github.com/apache/arrow/pull/7556#issuecomment-650627218 @github-actions crossbow submit -g linux -g wheel -g conda

[GitHub] [arrow] wesm commented on pull request #7556: ARROW-9188: [C++] Use Brotli shared libraries if they are available

2020-06-27 Thread GitBox
wesm commented on pull request #7556: URL: https://github.com/apache/arrow/pull/7556#issuecomment-650618145 Thanks, will look into this. I'm guessing these changes will break some of the Python wheel builds so we may need a flag to indicate a preference of shared vs static

[GitHub] [arrow] wesm commented on pull request #7449: ARROW-9133: [C++] Add utf8_upper and utf8_lower

2020-06-27 Thread GitBox
wesm commented on pull request #7449: URL: https://github.com/apache/arrow/pull/7449#issuecomment-650616427 Ok thanks, that's much appreciated

[GitHub] [arrow] andygrove closed pull request #7494: ARROW-9184: [Rust][Datafusion] table scan without projection should return all columns

2020-06-27 Thread GitBox
andygrove closed pull request #7494: URL: https://github.com/apache/arrow/pull/7494

[GitHub] [arrow] pitrou commented on pull request #7449: ARROW-9133: [C++] Add utf8_upper and utf8_lower

2020-06-27 Thread GitBox
pitrou commented on pull request #7449: URL: https://github.com/apache/arrow/pull/7449#issuecomment-650606456 @wesm I can also take this since you already have quite a bit on your plate.

[GitHub] [arrow] pitrou edited a comment on pull request #7449: ARROW-9133: [C++] Add utf8_upper and utf8_lower

2020-06-27 Thread GitBox
pitrou edited a comment on pull request #7449: URL: https://github.com/apache/arrow/pull/7449#issuecomment-650606456 @wesm I can also take this since you already have quite a bit on your plate for 1.0.

[GitHub] [arrow] kiszk commented on pull request #7556: ARROW-9188: [C++] Use Brotli shared libraries if they are available

2020-06-27 Thread GitBox
kiszk commented on pull request #7556: URL: https://github.com/apache/arrow/pull/7556#issuecomment-650588895 Looks good except one minor comment. LZ4 and ZSTD also use the dynamic library at first if available.

[GitHub] [arrow] kiszk commented on a change in pull request #7556: ARROW-9188: [C++] Use Brotli shared libraries if they are available

2020-06-27 Thread GitBox
kiszk commented on a change in pull request #7556: URL: https://github.com/apache/arrow/pull/7556#discussion_r446547216 ## File path: cpp/cmake_modules/FindBrotli.cmake ## @@ -17,29 +17,29 @@ # # find_package(Brotli) -# Favour static libraries over dynamic libraries, and

[GitHub] [arrow] github-actions[bot] commented on pull request #7556: ARROW-9188: [C++] Use Brotli shared libraries if they are available

2020-06-27 Thread GitBox
github-actions[bot] commented on pull request #7556: URL: https://github.com/apache/arrow/pull/7556#issuecomment-650583053 https://issues.apache.org/jira/browse/ARROW-9188

[GitHub] [arrow] wesm opened a new pull request #7556: ARROW-9188: [C++] Use Brotli shared libraries if they are available

2020-06-27 Thread GitBox
wesm opened a new pull request #7556: URL: https://github.com/apache/arrow/pull/7556 If both shared and static Brotli libraries are available, the static ones were being selected, causing ~750KB of code to be statically linked into libarrow.so on Linux. This is not consistent with our

[GitHub] [arrow] wesm closed pull request #7551: ARROW-9132: [C++] Support Unique and ValueCounts on dictionary data with non-changing dictionaries, add ChunkedArray::Make validating constructor

2020-06-27 Thread GitBox
wesm closed pull request #7551: URL: https://github.com/apache/arrow/pull/7551

[GitHub] [arrow] wesm commented on pull request #7551: ARROW-9132: [C++] Support Unique and ValueCounts on dictionary data with non-changing dictionaries, add ChunkedArray::Make validating constructor

2020-06-27 Thread GitBox
wesm commented on pull request #7551: URL: https://github.com/apache/arrow/pull/7551#issuecomment-650580242 +1. If anyone desires refinements of `ChunkedArray::Make` please let me know and I will make them

[GitHub] [arrow] wesm commented on pull request #7449: ARROW-9133: [C++] Add utf8_upper and utf8_lower

2020-06-27 Thread GitBox
wesm commented on pull request #7449: URL: https://github.com/apache/arrow/pull/7449#issuecomment-650579924 @maartenbreddels let me know if I can help with anything to get this merge-ready -- I want to make the utf8proc-depending code optional so I will need to make a small refactor after

[GitHub] [arrow] wesm commented on a change in pull request #7449: ARROW-9133: [C++] Add utf8_upper and utf8_lower

2020-06-27 Thread GitBox
wesm commented on a change in pull request #7449: URL: https://github.com/apache/arrow/pull/7449#discussion_r446541086 ## File path: cpp/src/arrow/compute/kernels/scalar_string.cc ## @@ -39,6 +158,121 @@ struct AsciiLength { } }; +template class Derived> +struct

[GitHub] [arrow] wesm closed pull request #7321: ARROW-8985: [Format] Add Decimal::bitWidth field with default value of 128 for forward compatibility

2020-06-27 Thread GitBox
wesm closed pull request #7321: URL: https://github.com/apache/arrow/pull/7321

[GitHub] [arrow] wesm commented on pull request #7321: ARROW-8985: [Format] Add Decimal::bitWidth field with default value of 128 for forward compatibility

2020-06-27 Thread GitBox
wesm commented on pull request #7321: URL: https://github.com/apache/arrow/pull/7321#issuecomment-650575218 +1

[GitHub] [arrow] github-actions[bot] commented on pull request #7555: ARROW-9238: [C++][CI][FlightRPC] increase test coverage of round-robin under IPC and Flight

2020-06-27 Thread GitBox
github-actions[bot] commented on pull request #7555: URL: https://github.com/apache/arrow/pull/7555#issuecomment-650570060 https://issues.apache.org/jira/browse/ARROW-9238

[GitHub] [arrow] kiszk opened a new pull request #7555: ARROW-9238: [C++][CI][FlightRPC] increase test coverage of round-robin under IPC and Flight

2020-06-27 Thread GitBox
kiszk opened a new pull request #7555: URL: https://github.com/apache/arrow/pull/7555 This PR increases test coverage of round-robin under IPC and Flight. Before this PR, round-robin tests for primitive data under IPC used only int32 (and boolean in some cases). This PR adds other primitive

[GitHub] [arrow] Demetrio92 commented on issue #1688: Possible to read categoricals back into Pandas from Parquet using Pyarrow?

2020-06-27 Thread GitBox
Demetrio92 commented on issue #1688: URL: https://github.com/apache/arrow/issues/1688#issuecomment-650559676 @wesm yeah, sorry, guys, you're awesome. I thought this was pandas repo...

[GitHub] [arrow] github-actions[bot] commented on pull request #7554: ARROW-9236: [Rust] CSV WriterBuilder never writes header

2020-06-27 Thread GitBox
github-actions[bot] commented on pull request #7554: URL: https://github.com/apache/arrow/pull/7554#issuecomment-650521333 https://issues.apache.org/jira/browse/ARROW-9236

[GitHub] [arrow] ritchie46 opened a new pull request #7554: ARROW-9236: [Rust] CSV WriterBuilder never writes header

2020-06-27 Thread GitBox
ritchie46 opened a new pull request #7554: URL: https://github.com/apache/arrow/pull/7554

[GitHub] [arrow] scampi commented on a change in pull request #6402: ARROW-7831: [Java] do not allocate a new offset buffer if the slice starts at 0 since the relative offset pointer would be unchange

2020-06-27 Thread GitBox
scampi commented on a change in pull request #6402: URL: https://github.com/apache/arrow/pull/6402#discussion_r446497720 ## File path: java/vector/src/main/java/org/apache/arrow/vector/BaseVariableWidthVector.java ## @@ -751,55 +757,57 @@ private void