[GitHub] [arrow] pitrou commented on pull request #7469: ARROW-8832: [Python] Provide better error message when S3/HDFS is not enabled in installation

2020-06-18 Thread GitBox
pitrou commented on pull request #7469: URL: https://github.com/apache/arrow/pull/7469#issuecomment-645927818 Yes, it's ok if it's not tested, IMHO. This is an automated message from the Apache Git Service. To respond to the

[GitHub] [arrow] jorisvandenbossche commented on pull request #7474: ARROW-8802: [C++][Dataset] Preserve dataset schema's metadata on column projection

2020-06-18 Thread GitBox
jorisvandenbossche commented on pull request #7474: URL: https://github.com/apache/arrow/pull/7474#issuecomment-645904783 So this was actually a "time bomb" in the extension type testing code: we were using a test extension type with the same name as an extension type that in the

[GitHub] [arrow] jorisvandenbossche edited a comment on pull request #7474: ARROW-8802: [C++][Dataset] Preserve dataset schema's metadata on column projection

2020-06-18 Thread GitBox
jorisvandenbossche edited a comment on pull request #7474: URL: https://github.com/apache/arrow/pull/7474#issuecomment-645904783 So this was actually a "time bomb" in the extension type testing code: we were using a test extension type with the same name as an extension type that in the

[GitHub] [arrow] jorisvandenbossche commented on pull request #7469: ARROW-8832: [Python] Provide better error message when S3/HDFS is not enabled in installation

2020-06-18 Thread GitBox
jorisvandenbossche commented on pull request #7469: URL: https://github.com/apache/arrow/pull/7469#issuecomment-645916538 Updated. Of course, now you added S3 support in wheels, this change will be less needed.

[GitHub] [arrow] xhochy commented on a change in pull request #7452: ARROW-8961: [C++] Add utf8proc library to toolchain

2020-06-18 Thread GitBox
xhochy commented on a change in pull request #7452: URL: https://github.com/apache/arrow/pull/7452#discussion_r442159234 ## File path: cpp/cmake_modules/Findutf8proc.cmake ## @@ -0,0 +1,51 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor

[GitHub] [arrow] ursabot commented on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-18 Thread GitBox
ursabot commented on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-645968685 [AMD64 Ubuntu 18.04 C++ Benchmark (#113245)](https://ci.ursalabs.org/#builders/73/builds/85) builder failed with an exception. Revision:

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #7468: ARROW-8283: [Python] Limit FileSystemDataset constructor from fragments/paths, no filesystem interaction

2020-06-18 Thread GitBox
jorisvandenbossche commented on a change in pull request #7468: URL: https://github.com/apache/arrow/pull/7468#discussion_r442117037 ## File path: python/pyarrow/_dataset.pyx ## @@ -407,42 +407,82 @@ cdef class UnionDataset(Dataset): cdef class FileSystemDataset(Dataset):

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #7156: ARROW-8074: [C++][Dataset][Python] FileFragments from buffers and NativeFiles

2020-06-18 Thread GitBox
jorisvandenbossche commented on a change in pull request #7156: URL: https://github.com/apache/arrow/pull/7156#discussion_r442172970 ## File path: python/pyarrow/parquet.py ## @@ -1390,6 +1390,26 @@ def __init__(self, path_or_paths, filesystem=None, filters=None,

[GitHub] [arrow] jorisvandenbossche commented on pull request #7156: ARROW-8074: [C++][Dataset][Python] FileFragments from buffers and NativeFiles

2020-06-18 Thread GitBox
jorisvandenbossche commented on pull request #7156: URL: https://github.com/apache/arrow/pull/7156#issuecomment-645971102 The travis failure is an unrelated Flight failure

[GitHub] [arrow] pitrou closed pull request #7469: ARROW-8832: [Python] Provide better error message when S3/HDFS is not enabled in installation

2020-06-18 Thread GitBox
pitrou closed pull request #7469: URL: https://github.com/apache/arrow/pull/7469

[GitHub] [arrow] jorisvandenbossche commented on pull request #7156: ARROW-8074: [C++][Dataset][Python] FileFragments from buffers and NativeFiles

2020-06-18 Thread GitBox
jorisvandenbossche commented on pull request #7156: URL: https://github.com/apache/arrow/pull/7156#issuecomment-645976313 @bkietz did you already open some follow-up JIRAs? (eg for https://github.com/apache/arrow/pull/7156#discussion_r439503475) I will handle my comment at

[GitHub] [arrow] lidavidm commented on pull request #7476: ARROW-9168: [C++][Flight] configure TCP connection sharing in benchmark

2020-06-18 Thread GitBox
lidavidm commented on pull request #7476: URL: https://github.com/apache/arrow/pull/7476#issuecomment-645984765 > @lidavidm Looks we need another level of abstraction above flight client `generic_options`. Otherwise, we have to expose grpc constants(backend implementation) to client code.

[GitHub] [arrow] pitrou commented on a change in pull request #7452: ARROW-8961: [C++] Add utf8proc library to toolchain

2020-06-18 Thread GitBox
pitrou commented on a change in pull request #7452: URL: https://github.com/apache/arrow/pull/7452#discussion_r442127449 ## File path: cpp/cmake_modules/Findutf8proc.cmake ## @@ -0,0 +1,51 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #7474: ARROW-8802: [C++][Dataset] Preserve dataset schema's metadata on column projection

2020-06-18 Thread GitBox
jorisvandenbossche commented on a change in pull request #7474: URL: https://github.com/apache/arrow/pull/7474#discussion_r442102463 ## File path: python/pyarrow/tests/test_dataset.py ## @@ -1566,3 +1566,21 @@ def test_parquet_dataset_factory_partitioned(tempdir): result

[GitHub] [arrow] pitrou commented on pull request #7476: ARROW-9168: [C++][Flight] configure TCP connection sharing in benchmark

2020-06-18 Thread GitBox
pitrou commented on pull request #7476: URL: https://github.com/apache/arrow/pull/7476#issuecomment-645928473 I think we could simply enable this option by default in the Flight client. Sharing the same TCP connection for all Flight clients doesn't look like a good idea at all.

[GitHub] [arrow] pitrou commented on pull request #7469: ARROW-8832: [Python] Provide better error message when S3/HDFS is not enabled in installation

2020-06-18 Thread GitBox
pitrou commented on pull request #7469: URL: https://github.com/apache/arrow/pull/7469#issuecomment-645916830 Only Linux wheels ;-)

[GitHub] [arrow] jorisvandenbossche commented on pull request #7467: ARROW-9159: [Python] Enable Array.isnull method

2020-06-18 Thread GitBox
jorisvandenbossche commented on pull request #7467: URL: https://github.com/apache/arrow/pull/7467#issuecomment-645933837 Added `isvalid` as well. Two more questions: - Do we rather prefer the underscore version (`is_valid`, `is_null`)? Since the existing `isnull` wasn't working

[GitHub] [arrow] jorisvandenbossche closed pull request #7156: ARROW-8074: [C++][Dataset][Python] FileFragments from buffers and NativeFiles

2020-06-18 Thread GitBox
jorisvandenbossche closed pull request #7156: URL: https://github.com/apache/arrow/pull/7156

[GitHub] [arrow] jorisvandenbossche commented on pull request #7469: ARROW-8832: [Python] Provide better error message when S3/HDFS is not enabled in installation

2020-06-18 Thread GitBox
jorisvandenbossche commented on pull request #7469: URL: https://github.com/apache/arrow/pull/7469#issuecomment-645924215 Ah, yes ;) Certainly still useful then. Are you fine with not testing this on CI? (I tested it locally by disabling S3, and in the end it is only a different

[GitHub] [arrow] chairmank commented on issue #3491: parquet lz4 interop with spark appears broken

2020-06-18 Thread GitBox
chairmank commented on issue #3491: URL: https://github.com/apache/arrow/issues/3491#issuecomment-646030004 I created https://issues.apache.org/jira/browse/PARQUET-1878.

[GitHub] [arrow] kszucs commented on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-18 Thread GitBox
kszucs commented on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-646042154 Another guess is `FilterRecordBatch` doesn't exist in the contender.

[GitHub] [arrow] ursabot removed a comment on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-18 Thread GitBox
ursabot removed a comment on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-646036977 [AMD64 Ubuntu 18.04 C++ Benchmark (#113280)](https://ci.ursalabs.org/#builders/73/builds/86) builder failed with an exception. Revision:

[GitHub] [arrow] Demetrio92 edited a comment on issue #1688: Possible to read categoricals back into Pandas from Parquet using Pyarrow?

2020-06-18 Thread GitBox
Demetrio92 edited a comment on issue #1688: URL: https://github.com/apache/arrow/issues/1688#issuecomment-646026392 Stumbled across this bug again. `pyarrow` preserves `category` as `dtype`, `fastparquet` **does not**. Docs don't mention it. They even kinda mislead the users:

[GitHub] [arrow] kszucs edited a comment on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-18 Thread GitBox
kszucs edited a comment on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-646025902 Buildbot parses a specific stdio format from the archery command which was a bit different for this invocation, my guess is passing a specific commit makes the output

[GitHub] [arrow] kszucs commented on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-18 Thread GitBox
kszucs commented on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-646026066 @ursabot benchmark --benchmark-filter=FilterRecordBatch

[GitHub] [arrow] Demetrio92 commented on issue #1688: Possible to read categoricals back into Pandas from Parquet using Pyarrow?

2020-06-18 Thread GitBox
Demetrio92 commented on issue #1688: URL: https://github.com/apache/arrow/issues/1688#issuecomment-646026392 Stumbled across this bug again. `pyarrow` preserves `category` as `dtype`, `fastparquet` **does not**. Docs don't mention it. They even kinda mislead the users:

[GitHub] [arrow] github-actions[bot] commented on pull request #7479: ARROW-7607: [C++] Example of using Arrow as a dependency of another CMake project

2020-06-18 Thread GitBox
github-actions[bot] commented on pull request #7479: URL: https://github.com/apache/arrow/pull/7479#issuecomment-646047855 https://issues.apache.org/jira/browse/ARROW-7607

[GitHub] [arrow] pitrou commented on a change in pull request #7478: ARROW-9055: [C++] Add sum/mean kernels for Boolean type

2020-06-18 Thread GitBox
pitrou commented on a change in pull request #7478: URL: https://github.com/apache/arrow/pull/7478#discussion_r442274099 ## File path: cpp/src/arrow/compute/kernels/aggregate_test.cc ## @@ -127,6 +127,41 @@ void ValidateSum(const Array& array) { ValidateSum(array,

[GitHub] [arrow] wesm commented on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-18 Thread GitBox
wesm commented on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-646078371 Note: those Filter benchmarks are garbage because they don't include the RandomArrayGenerator::Boolean bugfix

[GitHub] [arrow] fsaintjacques closed pull request #7468: ARROW-8283: [Python] Limit FileSystemDataset constructor from fragments/paths, no filesystem interaction

2020-06-18 Thread GitBox
fsaintjacques closed pull request #7468: URL: https://github.com/apache/arrow/pull/7468

[GitHub] [arrow] fsaintjacques closed pull request #7474: ARROW-8802: [C++][Dataset] Preserve dataset schema's metadata on column projection

2020-06-18 Thread GitBox
fsaintjacques closed pull request #7474: URL: https://github.com/apache/arrow/pull/7474

[GitHub] [arrow] wesm commented on issue #3491: parquet lz4 interop with spark appears broken

2020-06-18 Thread GitBox
wesm commented on issue #3491: URL: https://github.com/apache/arrow/issues/3491#issuecomment-646023497 Can you please open a JIRA issue?

[GitHub] [arrow] kszucs commented on a change in pull request #7478: ARROW-9055: [C++] Add sum/mean kernels for Boolean type

2020-06-18 Thread GitBox
kszucs commented on a change in pull request #7478: URL: https://github.com/apache/arrow/pull/7478#discussion_r442233159 ## File path: cpp/src/arrow/compute/kernels/aggregate_basic.cc ## @@ -255,6 +255,28 @@ struct SumState { } }; +template <> +struct SumState { + using

[GitHub] [arrow] wesm commented on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-18 Thread GitBox
wesm commented on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-646022727 @ursabot benchmark --benchmark-filter=FilterRecordBatch 22f3741

[GitHub] [arrow] chairmank commented on issue #3491: parquet lz4 interop with spark appears broken

2020-06-18 Thread GitBox
chairmank commented on issue #3491: URL: https://github.com/apache/arrow/issues/3491#issuecomment-646015745 I believe that [PARQUET-1241](https://issues.apache.org/jira/browse/PARQUET-1241) ("[C++] Use LZ4 frame format") does not directly address the issue that was reported here,

[GitHub] [arrow] kszucs commented on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-18 Thread GitBox
kszucs commented on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-646025902 Buildbot parses a specific stdio format from the archery command which was a bit different for this invocation, my guess is passing a specific commit makes the output format

[GitHub] [arrow] kszucs edited a comment on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-18 Thread GitBox
kszucs edited a comment on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-646025902 Buildbot parses a specific stdio format from the archery command which was a bit different for this invocation, my guess is passing a specific commit makes the output

[GitHub] [arrow] kszucs commented on a change in pull request #7478: ARROW-9055: [C++] Add sum/mean kernels for Boolean type

2020-06-18 Thread GitBox
kszucs commented on a change in pull request #7478: URL: https://github.com/apache/arrow/pull/7478#discussion_r442233159 ## File path: cpp/src/arrow/compute/kernels/aggregate_basic.cc ## @@ -255,6 +255,28 @@ struct SumState { } }; +template <> +struct SumState { + using

[GitHub] [arrow] pitrou commented on a change in pull request #7478: ARROW-9055: [C++] Add sum/mean kernels for Boolean type

2020-06-18 Thread GitBox
pitrou commented on a change in pull request #7478: URL: https://github.com/apache/arrow/pull/7478#discussion_r442285685 ## File path: cpp/src/arrow/compute/kernels/aggregate_test.cc ## @@ -127,6 +127,41 @@ void ValidateSum(const Array& array) { ValidateSum(array,

[GitHub] [arrow] wesm commented on a change in pull request #7478: ARROW-9055: [C++] Add sum/mean kernels for Boolean type

2020-06-18 Thread GitBox
wesm commented on a change in pull request #7478: URL: https://github.com/apache/arrow/pull/7478#discussion_r442290638 ## File path: cpp/src/arrow/compute/kernels/aggregate_basic.cc ## @@ -255,6 +255,28 @@ struct SumState { } }; +template <> +struct SumState { + using

[GitHub] [arrow] kszucs opened a new pull request #7478: ARROW-9055: [C++] Add sum/mean kernels for Boolean type

2020-06-18 Thread GitBox
kszucs opened a new pull request #7478: URL: https://github.com/apache/arrow/pull/7478

[GitHub] [arrow] wesm commented on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-18 Thread GitBox
wesm commented on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-646022860 @fsaintjacques @kszucs any idea what went wrong with buildbot?

[GitHub] [arrow] ursabot commented on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-18 Thread GitBox
ursabot commented on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-646036977 [AMD64 Ubuntu 18.04 C++ Benchmark (#113280)](https://ci.ursalabs.org/#builders/73/builds/86) builder failed with an exception. Revision:

[GitHub] [arrow] pitrou opened a new pull request #7479: ARROW-7607: [C++] Example of using Arrow as a dependency of another CMake project

2020-06-18 Thread GitBox
pitrou opened a new pull request #7479: URL: https://github.com/apache/arrow/pull/7479

[GitHub] [arrow] ursabot commented on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-18 Thread GitBox
ursabot commented on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-646062913 [AMD64 Ubuntu 18.04 C++ Benchmark (#113289)](https://ci.ursalabs.org/#builders/73/builds/87) builder has been succeeded. Revision: 999865b042c3131920b52b40a2387535168f3a08

[GitHub] [arrow] pitrou commented on a change in pull request #7478: ARROW-9055: [C++] Add sum/mean kernels for Boolean type

2020-06-18 Thread GitBox
pitrou commented on a change in pull request #7478: URL: https://github.com/apache/arrow/pull/7478#discussion_r442286764 ## File path: cpp/src/arrow/compute/kernels/aggregate_test.cc ## @@ -127,6 +127,41 @@ void ValidateSum(const Array& array) { ValidateSum(array,

[GitHub] [arrow] pitrou commented on a change in pull request #7478: ARROW-9055: [C++] Add sum/mean kernels for Boolean type

2020-06-18 Thread GitBox
pitrou commented on a change in pull request #7478: URL: https://github.com/apache/arrow/pull/7478#discussion_r442286604 ## File path: cpp/src/arrow/compute/kernels/aggregate_test.cc ## @@ -127,6 +127,41 @@ void ValidateSum(const Array& array) { ValidateSum(array,

[GitHub] [arrow] pitrou commented on a change in pull request #7478: ARROW-9055: [C++] Add sum/mean kernels for Boolean type

2020-06-18 Thread GitBox
pitrou commented on a change in pull request #7478: URL: https://github.com/apache/arrow/pull/7478#discussion_r442286180 ## File path: cpp/src/arrow/compute/kernels/aggregate_test.cc ## @@ -127,6 +127,41 @@ void ValidateSum(const Array& array) { ValidateSum(array,

[GitHub] [arrow] github-actions[bot] commented on pull request #7478: ARROW-9055: [C++] Add sum/mean kernels for Boolean type

2020-06-18 Thread GitBox
github-actions[bot] commented on pull request #7478: URL: https://github.com/apache/arrow/pull/7478#issuecomment-646027414 https://issues.apache.org/jira/browse/ARROW-9055

[GitHub] [arrow] kszucs removed a comment on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-18 Thread GitBox
kszucs removed a comment on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-646026066 @ursabot benchmark --benchmark-filter=FilterRecordBatch

[GitHub] [arrow] kszucs commented on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-18 Thread GitBox
kszucs commented on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-646041235 @ursabot benchmark --benchmark-filter=Filter 22f3741

[GitHub] [arrow] wesm commented on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-18 Thread GitBox
wesm commented on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-646076369 @kszucs oh right, that would do it

[GitHub] [arrow] pitrou closed pull request #7476: ARROW-9168: [C++][Flight] Don't share TCP connection among clients

2020-06-18 Thread GitBox
pitrou closed pull request #7476: URL: https://github.com/apache/arrow/pull/7476

[GitHub] [arrow] pitrou commented on pull request #7476: ARROW-9168: [C++][Flight] Don't share TCP connection among clients

2020-06-18 Thread GitBox
pitrou commented on pull request #7476: URL: https://github.com/apache/arrow/pull/7476#issuecomment-646094205 Thank you @cyb70289 !

[GitHub] [arrow] fsaintjacques closed pull request #7437: ARROW-8943: [C++][Python][Dataset] Add partitioning support to ParquetDatasetFactory

2020-06-18 Thread GitBox
fsaintjacques closed pull request #7437: URL: https://github.com/apache/arrow/pull/7437

[GitHub] [arrow] github-actions[bot] commented on pull request #7483: [Go] [ARROW-9174]: Fix table panic on 386

2020-06-18 Thread GitBox
github-actions[bot] commented on pull request #7483: URL: https://github.com/apache/arrow/pull/7483#issuecomment-646189795 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then

[GitHub] [arrow] emkornfield commented on a change in pull request #6213: ARROW-7592: [C++] Fix crashes on corrupt IPC input

2020-06-18 Thread GitBox
emkornfield commented on a change in pull request #6213: URL: https://github.com/apache/arrow/pull/6213#discussion_r442434501 ## File path: cpp/src/arrow/type.cc ## @@ -501,20 +501,35 @@ Status Decimal128Type::Make(int32_t precision, int32_t scale, //

[GitHub] [arrow] github-actions[bot] commented on pull request #7480: ARROW-9176: [Rust] Fix for memory leaks in Arrow allocator

2020-06-18 Thread GitBox
github-actions[bot] commented on pull request #7480: URL: https://github.com/apache/arrow/pull/7480#issuecomment-646148492 https://issues.apache.org/jira/browse/ARROW-9176

[GitHub] [arrow] github-actions[bot] commented on pull request #7483: ARROW-9174: [Go] Fix table panic on 386

2020-06-18 Thread GitBox
github-actions[bot] commented on pull request #7483: URL: https://github.com/apache/arrow/pull/7483#issuecomment-646203360 https://issues.apache.org/jira/browse/ARROW-9174

[GitHub] [arrow] rgsl888prabhu commented on a change in pull request #6213: ARROW-7592: [C++] Fix crashes on corrupt IPC input

2020-06-18 Thread GitBox
rgsl888prabhu commented on a change in pull request #6213: URL: https://github.com/apache/arrow/pull/6213#discussion_r442430921 ## File path: cpp/src/arrow/type.cc ## @@ -501,20 +501,35 @@ Status Decimal128Type::Make(int32_t precision, int32_t scale, //

[GitHub] [arrow] wesm commented on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-18 Thread GitBox
wesm commented on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-646179658 +1, awaiting CI

[GitHub] [arrow] pitrou opened a new pull request #7482: ARROW-9173: [C++][Doc] Document how to use Arrow from a third-party CMake project

2020-06-18 Thread GitBox
pitrou opened a new pull request #7482: URL: https://github.com/apache/arrow/pull/7482

[GitHub] [arrow] pitrou commented on a change in pull request #7481: ARROW-9175: [FlightRPC][C++] Expose peer to server

2020-06-18 Thread GitBox
pitrou commented on a change in pull request #7481: URL: https://github.com/apache/arrow/pull/7481#discussion_r442376353 ## File path: python/pyarrow/_flight.pyx ## @@ -1413,6 +1413,10 @@ cdef class ServerCallContext: """ return

[GitHub] [arrow] wesm commented on pull request #7479: ARROW-7607: [C++] Example of using Arrow as a dependency of another CMake project

2020-06-18 Thread GitBox
wesm commented on pull request #7479: URL: https://github.com/apache/arrow/pull/7479#issuecomment-646182037 I'll await @kszucs to have a look at the GHA / Docker configuration before merging

[GitHub] [arrow] raulbocanegra commented on pull request #7284: ARROW-7409: [C++][Python] Windows link error LNK1104: cannot open file 'python37_d.lib'

2020-06-18 Thread GitBox
raulbocanegra commented on pull request #7284: URL: https://github.com/apache/arrow/pull/7284#issuecomment-646182178 Hi! First of all, sorry for the delay guys. I tried the `-DPYTHON_EXECUTABLE=python_d` but it then raised an error related to some numpy library. I am really busy this

[GitHub] [arrow] wesm commented on pull request #7467: ARROW-9159: [Python] Implement Array.isnull/isvalid methods

2020-06-18 Thread GitBox
wesm commented on pull request #7467: URL: https://github.com/apache/arrow/pull/7467#issuecomment-646182878 Let's do the underscores version. Maybe people will curse us but "a foolish consistency is the hobgoblin of little minds"

[GitHub] [arrow] wesm closed pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-18 Thread GitBox
wesm closed pull request #7475: URL: https://github.com/apache/arrow/pull/7475

[GitHub] [arrow] fsaintjacques commented on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-18 Thread GitBox
fsaintjacques commented on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-646243118 I confirm locally with taxi dataset, runtime for a low selectivity (total_amount > 200$, 120k / 1.5b rows) goes from 9s to 3s. Niice improvement.

[GitHub] [arrow] wesm commented on pull request #7475: ARROW-8500: [C++] Add benchmark for using Filter on RecordBatch

2020-06-18 Thread GitBox
wesm commented on pull request #7475: URL: https://github.com/apache/arrow/pull/7475#issuecomment-646264314 Good stuff.

[GitHub] [arrow] lidavidm commented on pull request #7481: ARROW-9175: [FlightRPC][C++] Expose peer to server

2020-06-18 Thread GitBox
lidavidm commented on pull request #7481: URL: https://github.com/apache/arrow/pull/7481#issuecomment-646185813 > This looks like a good idea. What kind of addresses does grpc give out? Looks like this: ``` ipv6:[::1]:37626 ``` So probably most useful within a

[GitHub] [arrow] lidavidm commented on a change in pull request #7481: ARROW-9175: [FlightRPC][C++] Expose peer to server

2020-06-18 Thread GitBox
lidavidm commented on a change in pull request #7481: URL: https://github.com/apache/arrow/pull/7481#discussion_r442380286 ## File path: python/pyarrow/_flight.pyx ## @@ -1413,6 +1413,10 @@ cdef class ServerCallContext: """ return

[GitHub] [arrow] rgsl888prabhu commented on a change in pull request #6213: ARROW-7592: [C++] Fix crashes on corrupt IPC input

2020-06-18 Thread GitBox
rgsl888prabhu commented on a change in pull request #6213: URL: https://github.com/apache/arrow/pull/6213#discussion_r442419187 ## File path: cpp/src/arrow/type.cc ## @@ -501,20 +501,35 @@ Status Decimal128Type::Make(int32_t precision, int32_t scale, //

[GitHub] [arrow] pitrou commented on a change in pull request #6213: ARROW-7592: [C++] Fix crashes on corrupt IPC input

2020-06-18 Thread GitBox
pitrou commented on a change in pull request #6213: URL: https://github.com/apache/arrow/pull/6213#discussion_r442423611 ## File path: cpp/src/arrow/type.cc ## @@ -501,20 +501,35 @@ Status Decimal128Type::Make(int32_t precision, int32_t scale, //

[GitHub] [arrow] raulbocanegra edited a comment on pull request #7284: ARROW-7409: [C++][Python] Windows link error LNK1104: cannot open file 'python37_d.lib'

2020-06-18 Thread GitBox
raulbocanegra edited a comment on pull request #7284: URL: https://github.com/apache/arrow/pull/7284#issuecomment-646182178 Hi! First of all, sorry for the delay guys. I tried the `-DPYTHON_EXECUTABLE=python_d` suggested by @kou but it then raised an error related to some numpy

[GitHub] [arrow] wesm commented on a change in pull request #6213: ARROW-7592: [C++] Fix crashes on corrupt IPC input

2020-06-18 Thread GitBox
wesm commented on a change in pull request #6213: URL: https://github.com/apache/arrow/pull/6213#discussion_r442461673 ## File path: cpp/src/arrow/type.cc ## @@ -501,20 +501,35 @@ Status Decimal128Type::Make(int32_t precision, int32_t scale, //

[GitHub] [arrow] wesm commented on pull request #7454: ARROW-5744: [C++] Allow Table::CombineChunks to leave string columns chunked

2020-06-18 Thread GitBox
wesm commented on pull request #7454: URL: https://github.com/apache/arrow/pull/7454#issuecomment-646345142 Made the small change so can get this merged

[GitHub] [arrow] brills commented on pull request #7459: ARROW-6848: [C++] Support building libraries targeting C++14 or higher

2020-06-18 Thread GitBox
brills commented on pull request #7459: URL: https://github.com/apache/arrow/pull/7459#issuecomment-646345325 We ([tfx-bsl](https://github.com/tensorflow/tfx-bsl)) have been building against the shared library shipped with pyarrow wheels (currently built with gnu++11) without a problem

[GitHub] [arrow] wesm closed pull request #7420: ARROW-9022: [C++] Add/Sub/Mul arithmetic kernels with overflow check

2020-06-18 Thread GitBox
wesm closed pull request #7420: URL: https://github.com/apache/arrow/pull/7420

[GitHub] [arrow] pitrou commented on a change in pull request #7456: ARROW-9106: [Python] Allow specifying CSV file encoding

2020-06-18 Thread GitBox
pitrou commented on a change in pull request #7456: URL: https://github.com/apache/arrow/pull/7456#discussion_r442545757 ## File path: python/pyarrow/tests/test_io.py ## @@ -1289,6 +1290,56 @@ def test_compressed_recordbatch_stream(compression): assert got_table == table

[GitHub] [arrow] wesm commented on a change in pull request #7420: ARROW-9022: [C++] Add/Sub/Mul arithmetic kernels with overflow check

2020-06-18 Thread GitBox
wesm commented on a change in pull request #7420: URL: https://github.com/apache/arrow/pull/7420#discussion_r442545107 ## File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc ## @@ -60,6 +68,42 @@ struct Add { } }; +struct AddChecked { +#if
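
The diff excerpt above is cut off right after `struct AddChecked {`, so the actual kernel body is not visible here. As a rough illustration of what an "overflow check" on addition means, the following Python sketch rejects a sum that does not fit in a signed 32-bit integer. The function name `add_checked_int32` and the choice of `OverflowError` are assumptions made for this example only; they are not taken from the Arrow sources.

```python
# Illustrative sketch only: not the Arrow C++ implementation.
# Bounds of a signed 32-bit integer.
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1


def add_checked_int32(a: int, b: int) -> int:
    """Add two values that are each within int32 range.

    Python integers never overflow, so the exact sum is computed first and
    then compared against the int32 bounds. A fixed-width implementation
    would have to detect the overflow without relying on a wider result.
    """
    for value in (a, b):
        if not INT32_MIN <= value <= INT32_MAX:
            raise ValueError(f"operand {value} is outside the int32 range")
    result = a + b
    if not INT32_MIN <= result <= INT32_MAX:
        raise OverflowError(f"{a} + {b} does not fit in int32")
    return result
```

An unchecked kernel would typically wrap around silently on such inputs; the checked variant trades a comparison per element for an explicit error.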

[GitHub] [arrow] wesm commented on pull request #7420: ARROW-9022: [C++] Add/Sub/Mul arithmetic kernels with overflow check

2020-06-18 Thread GitBox
wesm commented on pull request #7420: URL: https://github.com/apache/arrow/pull/7420#issuecomment-646346895 @kszucs, yes, let's have benchmarks for the overflow operators, but can be done in a follow up PR This is an

[GitHub] [arrow] wesm closed pull request #7481: ARROW-9175: [FlightRPC][C++] Expose peer to server

2020-06-18 Thread GitBox
wesm closed pull request #7481: URL: https://github.com/apache/arrow/pull/7481 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [arrow] wesm commented on pull request #7488: ARROW-8522: [Release][Developer] Add option to bootstrap NPM when running release verification script

2020-06-18 Thread GitBox
wesm commented on pull request #7488: URL: https://github.com/apache/arrow/pull/7488#issuecomment-646352051 cc @nealrichardson This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [arrow] wesm opened a new pull request #7488: ARROW-8522: [Release][Developer] Add option to bootstrap NPM when running release verification script

2020-06-18 Thread GitBox
wesm opened a new pull request #7488: URL: https://github.com/apache/arrow/pull/7488 I'd like to make this on by default but don't want to break the Crossbow release verification stuff This is an automated message from the

[GitHub] [arrow] github-actions[bot] commented on pull request #7489: ARROW-8762: [C++] Use arrow::internal::BitmapAnd directly in Gandiva

2020-06-18 Thread GitBox
github-actions[bot] commented on pull request #7489: URL: https://github.com/apache/arrow/pull/7489#issuecomment-646355625 https://issues.apache.org/jira/browse/ARROW-8762 This is an automated message from the Apache Git

[GitHub] [arrow] github-actions[bot] commented on pull request #7488: ARROW-8522: [Release][Developer] Add option to bootstrap NPM when running release verification script

2020-06-18 Thread GitBox
github-actions[bot] commented on pull request #7488: URL: https://github.com/apache/arrow/pull/7488#issuecomment-646355626 https://issues.apache.org/jira/browse/ARROW-8522 This is an automated message from the Apache Git

[GitHub] [arrow] github-actions[bot] commented on pull request #7490: ARROW-9181: [C++] Instantiate fewer templates for cast kernels

2020-06-18 Thread GitBox
github-actions[bot] commented on pull request #7490: URL: https://github.com/apache/arrow/pull/7490#issuecomment-646365248 https://issues.apache.org/jira/browse/ARROW-9181 This is an automated message from the Apache Git

[GitHub] [arrow] kou commented on pull request #7459: ARROW-6848: [C++] Support building libraries targeting C++14 or higher

2020-06-18 Thread GitBox
kou commented on pull request #7459: URL: https://github.com/apache/arrow/pull/7459#issuecomment-646377635 @github-actions crossbow submit centos-7-aarch64 ubuntu-focal-arm64 This is an automated message from the Apache Git

[GitHub] [arrow] github-actions[bot] commented on pull request #7459: ARROW-6848: [C++] Support building libraries targeting C++14 or higher

2020-06-18 Thread GitBox
github-actions[bot] commented on pull request #7459: URL: https://github.com/apache/arrow/pull/7459#issuecomment-646378008 Revision: 5d0b5a30ad0fd905f34eef4fce391590b9cb9b0f Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] github-actions[bot] commented on pull request #7459: ARROW-6848: [C++] Support building libraries targeting C++14 or higher

2020-06-18 Thread GitBox
github-actions[bot] commented on pull request #7459: URL: https://github.com/apache/arrow/pull/7459#issuecomment-646382147 Revision: b63e674a54ae24a751d426b2adb0a68fa7bcad60 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] github-actions[bot] commented on pull request #7459: ARROW-6848: [C++] Support building libraries targeting C++14 or higher

2020-06-18 Thread GitBox
github-actions[bot] commented on pull request #7459: URL: https://github.com/apache/arrow/pull/7459#issuecomment-646381994 Revision: b63e674a54ae24a751d426b2adb0a68fa7bcad60 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] kou commented on a change in pull request #7420: ARROW-9022: [C++] Add/Sub/Mul arithmetic kernels with overflow check

2020-06-18 Thread GitBox
kou commented on a change in pull request #7420: URL: https://github.com/apache/arrow/pull/7420#discussion_r442586794 ## File path: cpp/src/arrow/compute/api_scalar.h ## @@ -43,7 +48,9 @@ namespace compute { /// \param[in] ctx the function execution context, optional ///

[GitHub] [arrow] wesm commented on pull request #7479: ARROW-7607: [C++] Example of using Arrow as a dependency of another CMake project

2020-06-18 Thread GitBox
wesm commented on pull request #7479: URL: https://github.com/apache/arrow/pull/7479#issuecomment-646343461 I'm gonna go ahead and merge this. Thanks @pitrou ! This is an automated message from the Apache Git Service. To

[GitHub] [arrow] wesm closed pull request #7479: ARROW-7607: [C++] Example of using Arrow as a dependency of another CMake project

2020-06-18 Thread GitBox
wesm closed pull request #7479: URL: https://github.com/apache/arrow/pull/7479 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [arrow] wesm commented on a change in pull request #7484: ARROW-9180: [Developer] Remove usage of whitelist, blacklist, slave, etc.

2020-06-18 Thread GitBox
wesm commented on a change in pull request #7484: URL: https://github.com/apache/arrow/pull/7484#discussion_r442547174 ## File path: cpp/build-support/cpplint.py ## @@ -4008,9 +4008,9 @@ def CheckTrailingSemicolon(filename, clean_lines, linenum, error): # Block bodies

[GitHub] [arrow] brills edited a comment on pull request #7459: ARROW-6848: [C++] Support building libraries targeting C++14 or higher

2020-06-18 Thread GitBox
brills edited a comment on pull request #7459: URL: https://github.com/apache/arrow/pull/7459#issuecomment-646345325 We ([tfx-bsl](https://github.com/tensorflow/tfx-bsl)) have been building against the shared library shipped with pyarrow wheels (currently built with gnu++11) without a

[GitHub] [arrow] brills edited a comment on pull request #7459: ARROW-6848: [C++] Support building libraries targeting C++14 or higher

2020-06-18 Thread GitBox
brills edited a comment on pull request #7459: URL: https://github.com/apache/arrow/pull/7459#issuecomment-646345325 We ([tfx-bsl](https://github.com/tensorflow/tfx-bsl)) have been building against the shared library shipped with pyarrow wheels (currently built with gnu++11) without a

[GitHub] [arrow] wesm opened a new pull request #7490: ARROW-9181: [C++] Instantiate fewer templates for cast kernels

2020-06-18 Thread GitBox
wesm opened a new pull request #7490: URL: https://github.com/apache/arrow/pull/7490 I discovered this unnecessary template instantiation by looking at the symbol sizes in object files with `nm --print-size --size-sort $OBJECT_FILE`. This trims about 200K from libarrow.so in release

[GitHub] [arrow] wesm opened a new pull request #7491: ARROW-9182: [C++] Use "applicator" namespace for some kernel execution functors. Streamline some applicator implementations

2020-06-18 Thread GitBox
wesm opened a new pull request #7491: URL: https://github.com/apache/arrow/pull/7491 In addition to using the "applicator" name, I am moving the argument unboxing a level higher in several places This is an automated

[GitHub] [arrow] kou commented on pull request #7459: ARROW-6848: [C++] Support building libraries targeting C++14 or higher

2020-06-18 Thread GitBox
kou commented on pull request #7459: URL: https://github.com/apache/arrow/pull/7459#issuecomment-646381474 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use
