[GitHub] [arrow] emkornfield commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-04-25 Thread GitBox
emkornfield commented on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-619488093 @pitrou I think I addressed your comments. One of them that went stale was the complexity for "AppendWord", I tried to remove parts that did not seem to affect performance

[GitHub] [arrow] emkornfield commented on a change in pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-04-25 Thread GitBox
emkornfield commented on a change in pull request #6985: URL: https://github.com/apache/arrow/pull/6985#discussion_r415221052 ## File path: cpp/cmake_modules/SetupCxxFlags.cmake ## @@ -40,12 +40,13 @@ if(ARROW_CPU_FLAG STREQUAL "x86") set(CXX_SUPPORTS_SSE4_2 TRUE)

[GitHub] [arrow] emkornfield commented on a change in pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-04-25 Thread GitBox
emkornfield commented on a change in pull request #6985: URL: https://github.com/apache/arrow/pull/6985#discussion_r415221009 ## File path: cpp/src/parquet/level_conversion_test.cc ## @@ -0,0 +1,162 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more

[GitHub] [arrow] emkornfield commented on a change in pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-04-25 Thread GitBox
emkornfield commented on a change in pull request #6985: URL: https://github.com/apache/arrow/pull/6985#discussion_r414295562 ## File path: cpp/cmake_modules/SetupCxxFlags.cmake ## @@ -40,12 +40,13 @@ if(ARROW_CPU_FLAG STREQUAL "x86") set(CXX_SUPPORTS_SSE4_2 TRUE)

[GitHub] [arrow] liyafan82 commented on a change in pull request #6323: ARROW-7610: [Java] Finish support for 64 bit int allocations

2020-04-25 Thread GitBox
liyafan82 commented on a change in pull request #6323: URL: https://github.com/apache/arrow/pull/6323#discussion_r415203879 ## File path: java/memory/src/test/java/org/apache/arrow/memory/TestNettyAllocationManager.java ## @@ -0,0 +1,98 @@ +/* + * Licensed to the Apache

[GitHub] [arrow] liyafan82 commented on a change in pull request #6323: ARROW-7610: [Java] Finish support for 64 bit int allocations

2020-04-25 Thread GitBox
liyafan82 commented on a change in pull request #6323: URL: https://github.com/apache/arrow/pull/6323#discussion_r415203851 ## File path: java/memory/src/test/java/org/apache/arrow/memory/TestNettyAllocationManager.java ## @@ -0,0 +1,98 @@ +/* + * Licensed to the Apache

[GitHub] [arrow] liyafan82 commented on a change in pull request #6323: ARROW-7610: [Java] Finish support for 64 bit int allocations

2020-04-25 Thread GitBox
liyafan82 commented on a change in pull request #6323: URL: https://github.com/apache/arrow/pull/6323#discussion_r415203808 ## File path: java/memory/src/main/java/org/apache/arrow/memory/NettyAllocationManager.java ## @@ -17,48 +17,97 @@ package org.apache.arrow.memory;

[GitHub] [arrow] liyafan82 commented on a change in pull request #6323: ARROW-7610: [Java] Finish support for 64 bit int allocations

2020-04-25 Thread GitBox
liyafan82 commented on a change in pull request #6323: URL: https://github.com/apache/arrow/pull/6323#discussion_r415203665 ## File path: java/memory/src/main/java/org/apache/arrow/memory/NettyAllocationManager.java ## @@ -17,48 +17,97 @@ package org.apache.arrow.memory;

[GitHub] [arrow] liyafan82 commented on a change in pull request #6323: ARROW-7610: [Java] Finish support for 64 bit int allocations

2020-04-25 Thread GitBox
liyafan82 commented on a change in pull request #6323: URL: https://github.com/apache/arrow/pull/6323#discussion_r415203349 ## File path: java/memory/src/test/java/org/apache/arrow/memory/TestLargeArrowBuf.java ## @@ -0,0 +1,68 @@ +/* + * Licensed to the Apache Software

[GitHub] [arrow] liyafan82 commented on a change in pull request #6323: ARROW-7610: [Java] Finish support for 64 bit int allocations

2020-04-25 Thread GitBox
liyafan82 commented on a change in pull request #6323: URL: https://github.com/apache/arrow/pull/6323#discussion_r415202903 ## File path: java/memory/src/main/java/org/apache/arrow/memory/NettyAllocationManager.java ## @@ -17,48 +17,97 @@ package org.apache.arrow.memory;

[GitHub] [arrow] liyafan82 commented on a change in pull request #6323: ARROW-7610: [Java] Finish support for 64 bit int allocations

2020-04-25 Thread GitBox
liyafan82 commented on a change in pull request #6323: URL: https://github.com/apache/arrow/pull/6323#discussion_r415202953 ## File path: java/memory/src/main/java/org/apache/arrow/memory/NettyAllocationManager.java ## @@ -17,48 +17,97 @@ package org.apache.arrow.memory;

[GitHub] [arrow] wjones1 commented on pull request #6979: ARROW-7800 [Python] implement iter_batches() method for ParquetFile and ParquetReader

2020-04-25 Thread GitBox
wjones1 commented on pull request #6979: URL: https://github.com/apache/arrow/pull/6979#issuecomment-619463693 I found the cause of the test failure: If the `batch_size` isn't aligned with the `chunk_size`, categorical columns will fail with the error: ```

[GitHub] [arrow] github-actions[bot] commented on pull request #7041: ARROW-8584: [C++] Fix ORC link order

2020-04-25 Thread GitBox
github-actions[bot] commented on pull request #7041: URL: https://github.com/apache/arrow/pull/7041#issuecomment-619457325 https://issues.apache.org/jira/browse/ARROW-8584 This is an automated message from the Apache Git

[GitHub] [arrow] github-actions[bot] commented on pull request #7041: ARROW-8584: [C++] Fix ORC link order

2020-04-25 Thread GitBox
github-actions[bot] commented on pull request #7041: URL: https://github.com/apache/arrow/pull/7041#issuecomment-619456125 Revision: 3e57660bbcb5002a8c53146754146fc7c92b1ead Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] kou commented on pull request #7041: ARROW-8584: [C++] Fix ORC link order

2020-04-25 Thread GitBox
kou commented on pull request #7041: URL: https://github.com/apache/arrow/pull/7041#issuecomment-619456003 @github-actions crossbow submit -g linux This is an automated message from the Apache Git Service. To respond to the

[GitHub] [arrow] mayuropensource commented on pull request #7022: ARROW-8562: [C++] IO: Parameterize I/O Coalescing using S3 metrics

2020-04-25 Thread GitBox
mayuropensource commented on pull request #7022: URL: https://github.com/apache/arrow/pull/7022#issuecomment-619456016 A better calculation for bandwidth (by removing TTFB from total time) is done using the following script: curl --negotiate -u: -o /dev/null -w

[GitHub] [arrow] mayuropensource edited a comment on pull request #7022: ARROW-8562: [C++] IO: Parameterize I/O Coalescing using S3 metrics

2020-04-25 Thread GitBox
mayuropensource edited a comment on pull request #7022: URL: https://github.com/apache/arrow/pull/7022#issuecomment-619456016 A better calculation for bandwidth (by removing TTFB from total time) is done using the following script: `curl --negotiate -u: -o /dev/null -w
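
The comment above is cut off before the full curl command, so the script itself is not reproduced here. As a rough sketch of the arithmetic it describes (bandwidth computed over the transfer phase only, with time to first byte removed from the total), something like the following could be used. The function name `TransferBandwidth` and its parameters are hypothetical and are not part of Arrow or of the PR; the inputs are assumed to come from whatever timing output the script prints.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper, not part of Arrow: effective bandwidth in bytes per
// second, measured only over the time spent receiving the body.
//   bytes_downloaded : size of the response body in bytes
//   total_seconds    : wall-clock time for the whole request
//   ttfb_seconds     : time until the first byte arrived (TTFB)
double TransferBandwidth(double bytes_downloaded, double total_seconds,
                         double ttfb_seconds) {
  // Remove the latency component so that it does not dilute the estimate.
  const double transfer_seconds = total_seconds - ttfb_seconds;
  if (transfer_seconds <= 0.0) {
    throw std::invalid_argument("total time must exceed time to first byte");
  }
  return bytes_downloaded / transfer_seconds;
}
```

For example, 100 bytes received in 1.5 s with a TTFB of 0.5 s gives 100 bytes/s over the transfer phase, whereas dividing by the full 1.5 s would understate it at roughly 67 bytes/s.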

[GitHub] [arrow] kou opened a new pull request #7041: ARROW-8584: [C++] Fix ORC link order

2020-04-25 Thread GitBox
kou opened a new pull request #7041: URL: https://github.com/apache/arrow/pull/7041 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [arrow] wjones1 commented on pull request #6979: ARROW-7800 [Python] implement iter_batches() method for ParquetFile and ParquetReader

2020-04-25 Thread GitBox
wjones1 commented on pull request #6979: URL: https://github.com/apache/arrow/pull/6979#issuecomment-619437378 Two failing checks right now. For the linting one, it seems to be alarmed by some Rust code that I didn't touch. Am I missing something in that output? For the

[GitHub] [arrow] wjones1 commented on a change in pull request #6979: ARROW-7800 [Python] implement iter_batches() method for ParquetFile and ParquetReader

2020-04-25 Thread GitBox
wjones1 commented on a change in pull request #6979: URL: https://github.com/apache/arrow/pull/6979#discussion_r415130068 ## File path: python/pyarrow/tests/test_parquet.py ## @@ -179,6 +179,99 @@ def alltypes_sample(size=1, seed=0, categorical=False):

[GitHub] [arrow] github-actions[bot] commented on pull request #7040: ARROW-8505: [Release][C#] "sourcelink test" is failed by Apache.ArrowAssemblyInfo.cs

2020-04-25 Thread GitBox
github-actions[bot] commented on pull request #7040: URL: https://github.com/apache/arrow/pull/7040#issuecomment-619408251 https://issues.apache.org/jira/browse/ARROW-8505 This is an automated message from the Apache Git

[GitHub] [arrow] eerhardt opened a new pull request #7040: ARROW-8505: [Release][C#] "sourcelink test" is failed by Apache.ArrowAssemblyInfo.cs

2020-04-25 Thread GitBox
eerhardt opened a new pull request #7040: URL: https://github.com/apache/arrow/pull/7040 Workaround https://github.com/dotnet/sourcelink/issues/572 by explicitly embedding the AssemblyAttributes file into the pdb. This is

[GitHub] [arrow] eerhardt commented on pull request #6121: ARROW-6603: [C#] - Nullable Array Support

2020-04-25 Thread GitBox
eerhardt commented on pull request #6121: URL: https://github.com/apache/arrow/pull/6121#issuecomment-619396952 Thank you for this contribution, @abbotware. However, my opinion is that #7032 is more in line with how null support should be designed for the builder APIs. It also more

[GitHub] [arrow] eerhardt commented on pull request #7032: ARROW-6603, ARROW-5708, ARROW-5634: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-25 Thread GitBox
eerhardt commented on pull request #7032: URL: https://github.com/apache/arrow/pull/7032#issuecomment-619396229 > ARROW-5634 by properly setting the readonly value for NullCount, which previously was hardcoded to -1. I don't believe this change addresses the issue correctly. Can we

[GitHub] [arrow] eerhardt commented on a change in pull request #7032: ARROW-6603, ARROW-5708, ARROW-5634: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-25 Thread GitBox
eerhardt commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r415072524 ## File path: csharp/src/Apache.Arrow/Apache.Arrow.csproj ## @@ -4,7 +4,7 @@ netstandard1.3;netcoreapp2.1 true

[GitHub] [arrow] github-actions[bot] commented on pull request #7039: ARROW-8513: [Python] Expose Take with Table input in Python

2020-04-25 Thread GitBox
github-actions[bot] commented on pull request #7039: URL: https://github.com/apache/arrow/pull/7039#issuecomment-619367004 https://issues.apache.org/jira/browse/ARROW-8513 This is an automated message from the Apache Git

[GitHub] [arrow] gramirezespinoza opened a new pull request #7039: ARROW-8513: [Python] Expose Take with Table input in Python

2020-04-25 Thread GitBox
gramirezespinoza opened a new pull request #7039: URL: https://github.com/apache/arrow/pull/7039 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [arrow] rollokb commented on a change in pull request #6979: ARROW-7800 [Python] implement iter_batches() method for ParquetFile and ParquetReader

2020-04-25 Thread GitBox
rollokb commented on a change in pull request #6979: URL: https://github.com/apache/arrow/pull/6979#discussion_r415027375 ## File path: python/pyarrow/tests/test_parquet.py ## @@ -179,6 +179,99 @@ def alltypes_sample(size=1, seed=0, categorical=False):

[GitHub] [arrow] github-actions[bot] commented on pull request #7038: ARROW-8593: [C++][Parquet] Fix build with musl libc

2020-04-25 Thread GitBox
github-actions[bot] commented on pull request #7038: URL: https://github.com/apache/arrow/pull/7038#issuecomment-619346808 https://issues.apache.org/jira/browse/ARROW-8593 This is an automated message from the Apache Git

[GitHub] [arrow] tobim opened a new pull request #7038: ARROW-8593: [C++][Parquet] Fix build with musl libc

2020-04-25 Thread GitBox
tobim opened a new pull request #7038: URL: https://github.com/apache/arrow/pull/7038 Converts local constants in `file_serialize_test.cc` to snake_case. Fixes a conflict with the `PAGE_SIZE` macro declared in the `limits.h` header that is shipped with musl libc.
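
To illustrate the kind of clash this PR description refers to, here is a minimal sketch. It is not the code from `file_serialize_test.cc`: the macro is defined by hand as a stand-in for the one the description attributes to musl's `limits.h`, and the constant `page_size`, its value, and the helper `PagesNeeded` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the macro that, per the PR description, musl's <limits.h>
// declares. Defined by hand (value assumed) so the sketch behaves the same
// on any libc.
#ifndef PAGE_SIZE
#define PAGE_SIZE 4096
#endif

// With the macro visible, a declaration such as
//   const std::size_t PAGE_SIZE = 1024;
// has its name replaced by the macro's value before compilation and is
// rejected by the compiler.

// A snake_case name does not match the macro, so it is left untouched.
constexpr std::size_t page_size = 1024;  // hypothetical test constant

// Hypothetical helper: number of pages needed to hold num_bytes, rounded up.
constexpr std::size_t PagesNeeded(std::size_t num_bytes) {
  return (num_bytes + page_size - 1) / page_size;
}
```

With these definitions, `PagesNeeded(1025)` evaluates to 2 regardless of whether the libc headers define `PAGE_SIZE`, which is the property the rename is after.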