[GitHub] [arrow] wesm closed pull request #7605: ARROW-9283: [Python] Expose build info

2020-07-11 Thread GitBox
wesm closed pull request #7605: URL: https://github.com/apache/arrow/pull/7605 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [arrow] bkietz commented on a change in pull request #7704: ARROW-9297: [C++][Parquet] Support chunked row groups in RowGroupRecordBatchReader

2020-07-11 Thread GitBox
bkietz commented on a change in pull request #7704: URL: https://github.com/apache/arrow/pull/7704#discussion_r453241420 ## File path: cpp/src/parquet/arrow/arrow_schema_test.cc ## @@ -644,6 +649,78 @@ TEST_F(TestConvertParquetSchema, ParquetRepeatedNestedSchema) {

[GitHub] [arrow] wesm closed pull request #7643: ARROW-9331: [C++] Improve the performance of Tensor-to-SparseTensor conversion

2020-07-11 Thread GitBox
wesm closed pull request #7643: URL: https://github.com/apache/arrow/pull/7643

[GitHub] [arrow] wesm closed pull request #7715: ARROW-9389: [C++] Add binary metafunctions for the set lookup kernels isin and match that can be called with CallFunction

2020-07-11 Thread GitBox
wesm closed pull request #7715: URL: https://github.com/apache/arrow/pull/7715

[GitHub] [arrow] wesm commented on pull request #7715: ARROW-9389: [C++] Add binary metafunctions for the set lookup kernels isin and match that can be called with CallFunction

2020-07-11 Thread GitBox
wesm commented on pull request #7715: URL: https://github.com/apache/arrow/pull/7715#issuecomment-657145211 +1. I will rebase the R PR where this is needed

[GitHub] [arrow] wesm commented on pull request #7656: ARROW-9268: [C++] add string_is{alpnum,alpha...,upper} kernels

2020-07-11 Thread GitBox
wesm commented on pull request #7656: URL: https://github.com/apache/arrow/pull/7656#issuecomment-657174882 +1. Here are the benchmarks with gcc-8 comparing the last commit with 3739e6681 which is right before I started making changes ``` benchmark baseline

[GitHub] [arrow] github-actions[bot] commented on pull request #7717: PARQUET-1839: Set values read for required column

2020-07-11 Thread GitBox
github-actions[bot] commented on pull request #7717: URL: https://github.com/apache/arrow/pull/7717#issuecomment-657174860 https://issues.apache.org/jira/browse/PARQUET-1839

[GitHub] [arrow] wesm closed pull request #7656: ARROW-9268: [C++] add string_is{alpnum,alpha...,upper} kernels

2020-07-11 Thread GitBox
wesm closed pull request #7656: URL: https://github.com/apache/arrow/pull/7656

[GitHub] [arrow] wesm commented on a change in pull request #7589: ARROW-9276: [Dev] Enable ARROW_CUDA when generating API documentations

2020-07-11 Thread GitBox
wesm commented on a change in pull request #7589: URL: https://github.com/apache/arrow/pull/7589#discussion_r453241713 ## File path: dev/release/post-09-docs.sh ## @@ -42,20 +42,20 @@ popd pushd "${ARROW_DIR}" git checkout "${release_tag}" -docker-compose build ubuntu-cpp

[GitHub] [arrow] wesm opened a new pull request #7715: ARROW-9389: [C++] Add binary metafunctions for the set lookup kernels isin and match that can be called with CallFunction

2020-07-11 Thread GitBox
wesm opened a new pull request #7715: URL: https://github.com/apache/arrow/pull/7715 This improves the usability of these kernels in language bindings, reducing the need to create bindings for `SetLookupOptions`. This is

[GitHub] [arrow] wesm closed pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

2020-07-11 Thread GitBox
wesm closed pull request #7290: URL: https://github.com/apache/arrow/pull/7290

[GitHub] [arrow] wesm commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

2020-07-11 Thread GitBox
wesm commented on pull request #7290: URL: https://github.com/apache/arrow/pull/7290#issuecomment-657142013 Tests pass so I'm merging. Hooray!

[GitHub] [arrow] wesm commented on a change in pull request #7684: ARROW-9374: [C++][Python] Expose MakeArrayFromScalar [WIP]

2020-07-11 Thread GitBox
wesm commented on a change in pull request #7684: URL: https://github.com/apache/arrow/pull/7684#discussion_r453243667 ## File path: python/pyarrow/tests/test_scalars.py ## @@ -458,6 +458,7 @@ def test_map(): assert len(s) == 2 assert isinstance(s, pa.MapScalar)

[GitHub] [arrow] wesm commented on pull request #7668: ARROW-6982: [R] Add bindings for compare and boolean kernels

2020-07-11 Thread GitBox
wesm commented on pull request #7668: URL: https://github.com/apache/arrow/pull/7668#issuecomment-657146231 I rebased. You should be able to support `%in%` by calling `isin_meta_binary` (https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/scalar_set_lookup.cc#L398)

[GitHub] [arrow] wesm commented on pull request #7656: ARROW-9268: [C++] add string_is{alpnum,alpha...,upper} kernels

2020-07-11 Thread GitBox
wesm commented on pull request #7656: URL: https://github.com/apache/arrow/pull/7656#issuecomment-657159308 I simplified many of the templates to not require an Arrow type parameter and do less inlining (which doesn't seem to make much of a difference performance wise and it inflates

[GitHub] [arrow] kiszk opened a new pull request #7716: ARROW-9417: [C++] Write length in IPC message by using little-endian

2020-07-11 Thread GitBox
kiszk opened a new pull request #7716: URL: https://github.com/apache/arrow/pull/7716 This PR forces metadata_length and footer_length in IPC messages to be written in little-endian to follow [the
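
The idea in the entry above, writing a length field in a fixed byte order regardless of the host, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `frame_message` and `read_length` are hypothetical helper names, the 4-byte signed length prefix is a simplified framing chosen for the example, and none of this is the Arrow C++ code touched by the PR or the full Arrow IPC layout.

```python
import struct


def frame_message(metadata: bytes) -> bytes:
    """Prefix `metadata` with its length as a 4-byte little-endian signed int.

    Hypothetical helper for illustration; not part of any Arrow API.
    """
    # "<i" selects little-endian explicitly, so the bytes written are the
    # same on little-endian hosts and on big-endian hosts such as s390x.
    return struct.pack("<i", len(metadata)) + metadata


def read_length(framed: bytes) -> int:
    """Read the little-endian length prefix back from a framed message."""
    (length,) = struct.unpack_from("<i", framed, 0)
    return length


framed = frame_message(b"schema")
print(framed[:4].hex())     # 06000000
print(read_length(framed))  # 6
```

Using the native format (`"=i"` or `"@i"`) instead would produce `00000006` on a big-endian host, which is the kind of mismatch a fixed on-wire byte order avoids.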

[GitHub] [arrow] github-actions[bot] commented on pull request #7715: ARROW-9389: [C++] Add binary metafunctions for the set lookup kernels isin and match that can be called with CallFunction

2020-07-11 Thread GitBox
github-actions[bot] commented on pull request #7715: URL: https://github.com/apache/arrow/pull/7715#issuecomment-657140234 https://issues.apache.org/jira/browse/ARROW-9389

[GitHub] [arrow] wesm closed pull request #7589: ARROW-9276: [Dev] Enable ARROW_CUDA when generating API documentations

2020-07-11 Thread GitBox
wesm closed pull request #7589: URL: https://github.com/apache/arrow/pull/7589

[GitHub] [arrow] wesm commented on pull request #6587: ARROW-8081: [Plasma] Fix memory limit bug & improve code

2020-07-11 Thread GitBox
wesm commented on pull request #6587: URL: https://github.com/apache/arrow/pull/6587#issuecomment-657142416 @suquark could you take a look?

[GitHub] [arrow] wesm commented on pull request #6208: ARROW-7533: [Java] Move ArrowBufPointer out of the java the memory package

2020-07-11 Thread GitBox
wesm commented on pull request #6208: URL: https://github.com/apache/arrow/pull/6208#issuecomment-657142270 This needs to be rebased

[GitHub] [arrow] wesm commented on pull request #6758: ARROW-7960: [C++][Parquet][WIP] Add schema translation for missing logicalTypes

2020-07-11 Thread GitBox
wesm commented on pull request #6758: URL: https://github.com/apache/arrow/pull/6758#issuecomment-657142370 I'm closing this until it can be picked up again

[GitHub] [arrow] wesm closed pull request #6758: ARROW-7960: [C++][Parquet][WIP] Add schema translation for missing logicalTypes

2020-07-11 Thread GitBox
wesm closed pull request #6758: URL: https://github.com/apache/arrow/pull/6758

[GitHub] [arrow] wesm closed pull request #7702: ARROW-9395: [Python] allow configuring MetadataVersion

2020-07-11 Thread GitBox
wesm closed pull request #7702: URL: https://github.com/apache/arrow/pull/7702

[GitHub] [arrow] emkornfield commented on pull request #6156: ARROW-7539: [Java] FieldVector getFieldBuffers API should not set reader/writer indices

2020-07-11 Thread GitBox
emkornfield commented on pull request #6156: URL: https://github.com/apache/arrow/pull/6156#issuecomment-657172845 > I think 2-4 need more conversation about what we want to expose. I'd definitely avoid introducing a new method (3 on your list) until we figure out what sets of

[GitHub] [arrow] kiszk commented on pull request #7716: ARROW-9417: [C++] Write length in IPC message by using little-endian

2020-07-11 Thread GitBox
kiszk commented on pull request #7716: URL: https://github.com/apache/arrow/pull/7716#issuecomment-657174179 Tests on s390x have passed at https://travis-ci.org/github/apache/arrow/jobs/707299972 ``` 100% tests passed, 0 tests failed out of 65 ```

[GitHub] [arrow] wesm commented on pull request #7545: ARROW-9139: [Python] Switch parquet.read_table to use new datasets API by default

2020-07-11 Thread GitBox
wesm commented on pull request #7545: URL: https://github.com/apache/arrow/pull/7545#issuecomment-657140613 Where does this PR stand? It needs a rebase

[GitHub] [arrow] wesm commented on pull request #7656: ARROW-9268: [C++] add string_is{alpnum,alpha...,upper} kernels

2020-07-11 Thread GitBox
wesm commented on pull request #7656: URL: https://github.com/apache/arrow/pull/7656#issuecomment-657153974 The ASAN/UBSAN test failure is caused by outdated utf8proc (Ubuntu 18.04 has 2.1.0 which has test failures), so I'm switching it to use the BUNDLED version. I'll refactor the

[GitHub] [arrow] wesm commented on pull request #7656: ARROW-9268: [C++] add string_is{alpnum,alpha...,upper} kernels

2020-07-11 Thread GitBox
wesm commented on pull request #7656: URL: https://github.com/apache/arrow/pull/7656#issuecomment-657161836 ah I think I figured out the mystery of the failed compilation -- it's caused by the missing utf8proc.

[GitHub] [arrow] github-actions[bot] commented on pull request #7716: ARROW-9417: [C++] Write length in IPC message by using little-endian

2020-07-11 Thread GitBox
github-actions[bot] commented on pull request #7716: URL: https://github.com/apache/arrow/pull/7716#issuecomment-657169564 https://issues.apache.org/jira/browse/ARROW-9417

[GitHub] [arrow] emkornfield opened a new pull request #7717: PARQUET-1839: Set values read for required column

2020-07-11 Thread GitBox
emkornfield opened a new pull request #7717: URL: https://github.com/apache/arrow/pull/7717 I think this might be dead code, so I'm not sure a unit test is useful, but I can add one if necessary.

[GitHub] [arrow] emkornfield commented on pull request #7717: PARQUET-1839: Set values read for required column

2020-07-11 Thread GitBox
emkornfield commented on pull request #7717: URL: https://github.com/apache/arrow/pull/7717#issuecomment-657174236 CC @wesm

[GitHub] [arrow] maartenbreddels commented on pull request #7656: ARROW-9268: [C++] add string_is{alpnum,alpha...,upper} kernels

2020-07-11 Thread GitBox
maartenbreddels commented on pull request #7656: URL: https://github.com/apache/arrow/pull/7656#issuecomment-657000143 Indeed, you're right. It seems to hit https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65333 but that reports GCC 5, and is quite old, looking at the logs it seems to be

[GitHub] [arrow] kiszk edited a comment on pull request #7711: ARROW-9415: [C++] Arrow does not compile on Power9

2020-07-11 Thread GitBox
kiszk edited a comment on pull request #7711: URL: https://github.com/apache/arrow/pull/7711#issuecomment-657079885 Just curious. Does this happen with gcc x.xx or clang x.xx ? Or both?

[GitHub] [arrow] kiszk commented on pull request #7711: ARROW-9415: [C++] Arrow does not compile on Power9

2020-07-11 Thread GitBox
kiszk commented on pull request #7711: URL: https://github.com/apache/arrow/pull/7711#issuecomment-657079885 Just curious. Does this happen with gcc or clang? Or both?

[GitHub] [arrow] dota17 opened a new pull request #7712: ARROW-9416: [Go]add testcases for some datatypes

2020-07-11 Thread GitBox
dota17 opened a new pull request #7712: URL: https://github.com/apache/arrow/pull/7712

[GitHub] [arrow] github-actions[bot] commented on pull request #7712: ARROW-9416: [Go]add testcases for some datatypes

2020-07-11 Thread GitBox
github-actions[bot] commented on pull request #7712: URL: https://github.com/apache/arrow/pull/7712#issuecomment-657006538 https://issues.apache.org/jira/browse/ARROW-9416

[GitHub] [arrow] jorisvandenbossche commented on pull request #7604: ARROW-9223: [Python] Propagate timezone information in pandas conversion

2020-07-11 Thread GitBox
jorisvandenbossche commented on pull request #7604: URL: https://github.com/apache/arrow/pull/7604#issuecomment-657057892 Lately, we have been running integration tests against both Spark master and Spark's 3.0 branch. But yes, @BryanCutler indicated that a fix for those branches could be possible, so

[GitHub] [arrow] maartenbreddels commented on pull request #7656: ARROW-9268: [C++] add string_is{alpnum,alpha...,upper} kernels

2020-07-11 Thread GitBox
maartenbreddels commented on pull request #7656: URL: https://github.com/apache/arrow/pull/7656#issuecomment-657009236 I tried to isolate the issue, by creating gccbug.cc: ```c++ #include #include #include class KernelContext; class Datum; class FunctionRegistry;

[GitHub] [arrow] wesm closed pull request #7604: ARROW-9223: [Python] Propagate timezone information in pandas conversion

2020-07-11 Thread GitBox
wesm closed pull request #7604: URL: https://github.com/apache/arrow/pull/7604

[GitHub] [arrow] wesm closed pull request #7709: ARROW-9411: [Rust] Update dependencies

2020-07-11 Thread GitBox
wesm closed pull request #7709: URL: https://github.com/apache/arrow/pull/7709

[GitHub] [arrow] paddyhoran commented on pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

2020-07-11 Thread GitBox
paddyhoran commented on pull request #7666: URL: https://github.com/apache/arrow/pull/7666#issuecomment-657110158 I adopted `SchemaRef` throughout the codebase also. I figure if we are going to use `SchemaRef` in some places we might as well adopt it universally.

[GitHub] [arrow] paddyhoran commented on pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

2020-07-11 Thread GitBox
paddyhoran commented on pull request #7666: URL: https://github.com/apache/arrow/pull/7666#issuecomment-657109850 Ok, I think we are all in agreement. Thanks.

[GitHub] [arrow] bkietz commented on a change in pull request #7704: ARROW-9297: [C++][Parquet] Support chunked row groups in RowGroupRecordBatchReader

2020-07-11 Thread GitBox
bkietz commented on a change in pull request #7704: URL: https://github.com/apache/arrow/pull/7704#discussion_r453223708 ## File path: cpp/src/parquet/arrow/schema.h ## @@ -163,24 +165,28 @@ struct PARQUET_EXPORT SchemaManifest { return it->second; } - bool

[GitHub] [arrow] bkietz commented on a change in pull request #7704: ARROW-9297: [C++][Parquet] Support chunked row groups in RowGroupRecordBatchReader

2020-07-11 Thread GitBox
bkietz commented on a change in pull request #7704: URL: https://github.com/apache/arrow/pull/7704#discussion_r453223663 ## File path: cpp/src/parquet/arrow/reader.cc ## @@ -301,62 +302,50 @@ class FileReaderImpl : public FileReader { class RowGroupRecordBatchReader :

[GitHub] [arrow] bkietz commented on a change in pull request #7704: ARROW-9297: [C++][Parquet] Support chunked row groups in RowGroupRecordBatchReader

2020-07-11 Thread GitBox
bkietz commented on a change in pull request #7704: URL: https://github.com/apache/arrow/pull/7704#discussion_r453224969 ## File path: cpp/src/parquet/arrow/reader.cc ## @@ -781,6 +770,9 @@ Status FileReaderImpl::GetRecordBatchReader(const std::vector& row_group_in

[GitHub] [arrow] bkietz commented on a change in pull request #7704: ARROW-9297: [C++][Parquet] Support chunked row groups in RowGroupRecordBatchReader

2020-07-11 Thread GitBox
bkietz commented on a change in pull request #7704: URL: https://github.com/apache/arrow/pull/7704#discussion_r453224951 ## File path: cpp/src/parquet/arrow/schema.h ## @@ -163,24 +165,28 @@ struct PARQUET_EXPORT SchemaManifest { return it->second; } - bool

[GitHub] [arrow] bkietz commented on a change in pull request #7704: ARROW-9297: [C++][Parquet] Support chunked row groups in RowGroupRecordBatchReader

2020-07-11 Thread GitBox
bkietz commented on a change in pull request #7704: URL: https://github.com/apache/arrow/pull/7704#discussion_r453225253 ## File path: cpp/src/parquet/arrow/reader.cc ## @@ -301,62 +302,50 @@ class FileReaderImpl : public FileReader { class RowGroupRecordBatchReader :

[GitHub] [arrow] emkornfield commented on a change in pull request #7704: ARROW-9297: [C++][Parquet] Support chunked row groups in RowGroupRecordBatchReader

2020-07-11 Thread GitBox
emkornfield commented on a change in pull request #7704: URL: https://github.com/apache/arrow/pull/7704#discussion_r453227795 ## File path: cpp/src/parquet/arrow/reader.cc ## @@ -781,6 +770,9 @@ Status FileReaderImpl::GetRecordBatchReader(const std::vector& row_group_in

[GitHub] [arrow] bkietz commented on a change in pull request #7704: ARROW-9297: [C++][Parquet] Support chunked row groups in RowGroupRecordBatchReader

2020-07-11 Thread GitBox
bkietz commented on a change in pull request #7704: URL: https://github.com/apache/arrow/pull/7704#discussion_r453228008 ## File path: cpp/src/parquet/arrow/schema.h ## @@ -163,24 +165,28 @@ struct PARQUET_EXPORT SchemaManifest { return it->second; } - bool

[GitHub] [arrow] emkornfield commented on pull request #7704: ARROW-9297: [C++][Parquet] Support chunked row groups in RowGroupRecordBatchReader

2020-07-11 Thread GitBox
emkornfield commented on pull request #7704: URL: https://github.com/apache/arrow/pull/7704#issuecomment-657119271 I prefer checks to assertions for 'in the wild' errors. Especially around code that is acting on passed data. On Saturday, July 11, 2020, Benjamin Kietzman

[GitHub] [arrow] jacques-n commented on pull request #6156: ARROW-7539: [Java] FieldVector getFieldBuffers API should not set reader/writer indices

2020-07-11 Thread GitBox
jacques-n commented on pull request #6156: URL: https://github.com/apache/arrow/pull/6156#issuecomment-657095875 > @tianchen92 rereading, after rereading all the comments. I think we should > > 1. Remove setReaderWriterIndeces in getFieldBuffers > 2. Deprecate getBuffers > 3.

[GitHub] [arrow] github-actions[bot] commented on pull request #7713: ARROW-8261 [Rust-DataFusion] Made limit accept integers and no longer accept expressions.

2020-07-11 Thread GitBox
github-actions[bot] commented on pull request #7713: URL: https://github.com/apache/arrow/pull/7713#issuecomment-657095725 https://issues.apache.org/jira/browse/ARROW-8261

[GitHub] [arrow] jacques-n commented on pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-07-11 Thread GitBox
jacques-n commented on pull request #7030: URL: https://github.com/apache/arrow/pull/7030#issuecomment-657096664 > @jacques-n > > I think we have several choices: we can try implement a c++ memory pool which is backed by a Java Allocator, actually I have a PoC branch for that:

[GitHub] [arrow] jorgecarleitao opened a new pull request #7713: ARROW-8261 [Rust-DataFusion] Made limit accept integers and no longer accept expressions.

2020-07-11 Thread GitBox
jorgecarleitao opened a new pull request #7713: URL: https://github.com/apache/arrow/pull/7713 Also made the argument consistent across the project (=usize). This change is backward incompatible. The rationale for this change is that limit() is almost never called with an

[GitHub] [arrow] lidavidm commented on pull request #7702: ARROW-9395: [Python] allow configuring MetadataVersion

2020-07-11 Thread GitBox
lidavidm commented on pull request #7702: URL: https://github.com/apache/arrow/pull/7702#issuecomment-657099127 I built Spark 3.0 with Arrow 0.15.1 (the default) -> Python tests pass with PyArrow 0.17.1 -> Python tests pass with PyArrow from this PR with the environment variable set

[GitHub] [arrow] wesm opened a new pull request #7714: ARROW-9407: [Python] Recognize more pandas null sentinels in sequence type inference when converting to Arrow

2020-07-11 Thread GitBox
wesm opened a new pull request #7714: URL: https://github.com/apache/arrow/pull/7714 Null sentinel objects other than `NaN` were not being considered in the type inference. This is follow up to the prior patch ARROW-842 and wasn't fully implemented/tested there.

[GitHub] [arrow] github-actions[bot] commented on pull request #7714: ARROW-9407: [Python] Recognize more pandas null sentinels in sequence type inference when converting to Arrow

2020-07-11 Thread GitBox
github-actions[bot] commented on pull request #7714: URL: https://github.com/apache/arrow/pull/7714#issuecomment-657123180 https://issues.apache.org/jira/browse/ARROW-9407

[GitHub] [arrow] wesm edited a comment on pull request #7711: ARROW-9415: [C++] Arrow does not compile on Power9

2020-07-11 Thread GitBox
wesm edited a comment on pull request #7711: URL: https://github.com/apache/arrow/pull/7711#issuecomment-657123476 @jglaser can you move the undef into arrow/util/hashing.h until the problem is fixed in xxhash? We can remove it whenever we update the vendored code in the future

[GitHub] [arrow] wesm commented on pull request #7711: ARROW-9415: [C++] Arrow does not compile on Power9

2020-07-11 Thread GitBox
wesm commented on pull request #7711: URL: https://github.com/apache/arrow/pull/7711#issuecomment-657123476 @jglaser can you move the undef into arrow/util/hashing.h until the problem is patched in xxhash? We can remove it whenever we update the vendored code in the future

[GitHub] [arrow] wesm commented on pull request #7711: ARROW-9415: [C++] Arrow does not compile on Power9

2020-07-11 Thread GitBox
wesm commented on pull request #7711: URL: https://github.com/apache/arrow/pull/7711#issuecomment-657124420 I went ahead and did it.

[GitHub] [arrow] wesm closed pull request #7708: ARROW-9292: [Doc] Remove Rust from feature matrix

2020-07-11 Thread GitBox
wesm closed pull request #7708: URL: https://github.com/apache/arrow/pull/7708

[GitHub] [arrow] jglaser commented on pull request #7711: ARROW-9415: [C++] Arrow does not compile on Power9

2020-07-11 Thread GitBox
jglaser commented on pull request #7711: URL: https://github.com/apache/arrow/pull/7711#issuecomment-657124634 Thank you, @wesm !

[GitHub] [arrow] wesm commented on pull request #7708: ARROW-9292: [Doc] Remove Rust from feature matrix

2020-07-11 Thread GitBox
wesm commented on pull request #7708: URL: https://github.com/apache/arrow/pull/7708#issuecomment-657124615 +1

[GitHub] [arrow] wesm commented on pull request #7672: WIP ARROW-9348: [C++] Replace usages of TestBase::MakeRandomArray in testing/gtest_util.h with RandomArrayGenerator

2020-07-11 Thread GitBox
wesm commented on pull request #7672: URL: https://github.com/apache/arrow/pull/7672#issuecomment-657124930 I'm going to close this for now. Please reopen when the PR needs to be reviewed

[GitHub] [arrow] wesm closed pull request #7672: WIP ARROW-9348: [C++] Replace usages of TestBase::MakeRandomArray in testing/gtest_util.h with RandomArrayGenerator

2020-07-11 Thread GitBox
wesm closed pull request #7672: URL: https://github.com/apache/arrow/pull/7672

[GitHub] [arrow] wesm closed pull request #7705: ARROW-9408: [Integration] Fix Windows numpy datagen issues

2020-07-11 Thread GitBox
wesm closed pull request #7705: URL: https://github.com/apache/arrow/pull/7705

[GitHub] [arrow] wesm commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

2020-07-11 Thread GitBox
wesm commented on pull request #7290: URL: https://github.com/apache/arrow/pull/7290#issuecomment-657125577 Rebased since https://github.com/apache/arrow/pull/7685 was merged

[GitHub] [arrow] wesm closed pull request #7685: ARROW-9362: [Java] Support reading/writing V5 MetadataVersion

2020-07-11 Thread GitBox
wesm closed pull request #7685: URL: https://github.com/apache/arrow/pull/7685

[GitHub] [arrow] wesm closed pull request #7714: ARROW-9407: [Python] Recognize more pandas null sentinels in sequence type inference when converting to Arrow

2020-07-11 Thread GitBox
wesm closed pull request #7714: URL: https://github.com/apache/arrow/pull/7714

[GitHub] [arrow] houqp commented on pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

2020-07-11 Thread GitBox
houqp commented on pull request #7666: URL: https://github.com/apache/arrow/pull/7666#issuecomment-657125605 linter error will be addressed by https://github.com/apache/arrow/pull/7710

[GitHub] [arrow] wesm commented on pull request #7635: ARROW-1567: [C++] implement fill null

2020-07-11 Thread GitBox
wesm commented on pull request #7635: URL: https://github.com/apache/arrow/pull/7635#issuecomment-657127070 This is close to what I'm looking for. I'm going to push some changes to this branch in a little while and then will merge this

[GitHub] [arrow] sbinet commented on a change in pull request #7712: ARROW-9416: [Go] Add testcases for some datatypes

2020-07-11 Thread GitBox
sbinet commented on a change in pull request #7712: URL: https://github.com/apache/arrow/pull/7712#discussion_r453235235 ## File path: go/arrow/compare_test.go ## @@ -147,6 +150,37 @@ func TestTypeEqual(t *testing.T) { }, false,

[GitHub] [arrow] wesm commented on pull request #7635: ARROW-1567: [C++] Implement "fill_null" function that replaces null values with a scalar value

2020-07-11 Thread GitBox
wesm commented on pull request #7635: URL: https://github.com/apache/arrow/pull/7635#issuecomment-657133231 +1, will merge once the build passes

[GitHub] [arrow] wesm commented on pull request #7711: ARROW-9415: [C++] Arrow does not compile on Power9

2020-07-11 Thread GitBox
wesm commented on pull request #7711: URL: https://github.com/apache/arrow/pull/7711#issuecomment-657133272 +1

[GitHub] [arrow] wesm commented on a change in pull request #7635: ARROW-1567: [C++] implement fill null

2020-07-11 Thread GitBox
wesm commented on a change in pull request #7635: URL: https://github.com/apache/arrow/pull/7635#discussion_r453236952 ## File path: cpp/src/arrow/compute/kernels/scalar_fill_null.cc ## @@ -32,195 +34,117 @@ namespace internal { namespace { -template +template struct

[GitHub] [arrow] wesm commented on pull request #7635: ARROW-1567: [C++] implement fill null

2020-07-11 Thread GitBox
wesm commented on pull request #7635: URL: https://github.com/apache/arrow/pull/7635#issuecomment-657132910 I changed the implementations to do a single memory allocation and avoid the builder classes, which will be faster, and fixed some other stuff. Additionally, instantiating fewer

[GitHub] [arrow] wesm closed pull request #7711: ARROW-9415: [C++] Arrow does not compile on Power9

2020-07-11 Thread GitBox
wesm closed pull request #7711: URL: https://github.com/apache/arrow/pull/7711

[GitHub] [arrow] bkietz commented on a change in pull request #7704: ARROW-9297: [C++][Parquet] Support chunked row groups in RowGroupRecordBatchReader

2020-07-11 Thread GitBox
bkietz commented on a change in pull request #7704: URL: https://github.com/apache/arrow/pull/7704#discussion_r453237839 ## File path: cpp/src/parquet/arrow/reader.cc ## @@ -301,62 +302,50 @@ class FileReaderImpl : public FileReader { class RowGroupRecordBatchReader :

[GitHub] [arrow] bkietz commented on a change in pull request #7704: ARROW-9297: [C++][Parquet] Support chunked row groups in RowGroupRecordBatchReader

2020-07-11 Thread GitBox
bkietz commented on a change in pull request #7704: URL: https://github.com/apache/arrow/pull/7704#discussion_r453238293 ## File path: cpp/src/parquet/arrow/schema.h ## @@ -163,24 +165,28 @@ struct PARQUET_EXPORT SchemaManifest { return it->second; } - bool

[GitHub] [arrow] bkietz commented on a change in pull request #7704: ARROW-9297: [C++][Parquet] Support chunked row groups in RowGroupRecordBatchReader

2020-07-11 Thread GitBox
bkietz commented on a change in pull request #7704: URL: https://github.com/apache/arrow/pull/7704#discussion_r453239004 ## File path: cpp/src/parquet/arrow/reader.cc ## @@ -301,62 +302,50 @@ class FileReaderImpl : public FileReader { class RowGroupRecordBatchReader :

[GitHub] [arrow] wesm commented on pull request #7605: ARROW-9283: [Python] Expose build info

2020-07-11 Thread GitBox
wesm commented on pull request #7605: URL: https://github.com/apache/arrow/pull/7605#issuecomment-657136410 I implemented some changes: * Passing through `ARROW_VERSION` unmodified to arrow/util/config.h, so it will say 1.0.0-SNAPSHOT now * Add cpp_ prefix to the Python variables

[GitHub] [arrow] wesm edited a comment on pull request #7605: ARROW-9283: [Python] Expose build info

2020-07-11 Thread GitBox
wesm edited a comment on pull request #7605: URL: https://github.com/apache/arrow/pull/7605#issuecomment-657136410 I implemented some changes: * Passing through `ARROW_VERSION` unmodified to arrow/util/config.h, so it will say 1.0.0-SNAPSHOT now * Add cpp_ prefix to the Python

[GitHub] [arrow] wesm commented on pull request #7605: ARROW-9283: [Python] Expose build info

2020-07-11 Thread GitBox
wesm commented on pull request #7605: URL: https://github.com/apache/arrow/pull/7605#issuecomment-657136826 +1. If the setup.py changes create problems for the release we will have some time to fix things

[GitHub] [arrow] wesm commented on pull request #7635: ARROW-1567: [C++] Implement "fill_null" function that replaces null values with a scalar value

2020-07-11 Thread GitBox
wesm commented on pull request #7635: URL: https://github.com/apache/arrow/pull/7635#issuecomment-657136940 @c-jamie thanks for the patch, could you let me know your ASF JIRA username (or create one if you don't have one) so I can assign the issue to you?

[GitHub] [arrow] wesm closed pull request #7635: ARROW-1567: [C++] Implement "fill_null" function that replaces null values with a scalar value

2020-07-11 Thread GitBox
wesm closed pull request #7635: URL: https://github.com/apache/arrow/pull/7635

[GitHub] [arrow] wesm closed pull request #7701: ARROW-9403: [Python] add Array.tolist as alias of .to_pylist

2020-07-11 Thread GitBox
wesm closed pull request #7701: URL: https://github.com/apache/arrow/pull/7701

[GitHub] [arrow] bkietz commented on a change in pull request #7704: ARROW-9297: [C++][Parquet] Support chunked row groups in RowGroupRecordBatchReader

2020-07-11 Thread GitBox
bkietz commented on a change in pull request #7704: URL: https://github.com/apache/arrow/pull/7704#discussion_r453239911 ## File path: cpp/src/parquet/arrow/arrow_schema_test.cc ## @@ -644,6 +649,78 @@ TEST_F(TestConvertParquetSchema, ParquetRepeatedNestedSchema) {