[GitHub] [arrow] andygrove closed pull request #8233: ARROW-10055: [Rust] DoubleEndedIterator implementation for NullableIter

2020-09-21 Thread GitBox
andygrove closed pull request #8233: URL: https://github.com/apache/arrow/pull/8233 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [arrow] kou commented on a change in pull request #8234: ARROW-10035: [C++] Update vendored libraries

2020-09-21 Thread GitBox
kou commented on a change in pull request #8234: URL: https://github.com/apache/arrow/pull/8234#discussion_r492378032 ## File path: LICENSE.txt ## @@ -849,9 +849,9 @@ THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

[GitHub] [arrow] github-actions[bot] commented on pull request #8235: ARROW-10059: [R][Doc] Give more advice on how to set up C++ build

2020-09-21 Thread GitBox
github-actions[bot] commented on pull request #8235: URL: https://github.com/apache/arrow/pull/8235#issuecomment-696431010 https://issues.apache.org/jira/browse/ARROW-10059

[GitHub] [arrow] josiahyan edited a comment on pull request #8214: ARROW-9965: [Java] Improve performance of BaseFixedWidthVector.setSafe by optimizing capacity calculations

2020-09-21 Thread GitBox
josiahyan edited a comment on pull request #8214: URL: https://github.com/apache/arrow/pull/8214#issuecomment-696381588 @jacques-n I haven't done very much investigation on other speedups - I just happened to notice performance irregularities as compared to our other (legacy) codepaths,

[GitHub] [arrow] nealrichardson opened a new pull request #8235: ARROW-10059: [R][Doc] Give more advice on how to set up C++ build

2020-09-21 Thread GitBox
nealrichardson opened a new pull request #8235: URL: https://github.com/apache/arrow/pull/8235

[GitHub] [arrow] josiahyan commented on pull request #8214: ARROW-9965: [Java] Improve performance of BaseFixedWidthVector.setSafe by optimizing capacity calculations

2020-09-21 Thread GitBox
josiahyan commented on pull request #8214: URL: https://github.com/apache/arrow/pull/8214#issuecomment-696456877 *Option 2 being the best case of the append-only builder style interface; something like IntWriter, where direct access to the buffer was not permissible, and so its safe to do

[GitHub] [arrow] josiahyan commented on pull request #8214: ARROW-9965: [Java] Improve performance of BaseFixedWidthVector.setSafe by optimizing capacity calculations

2020-09-21 Thread GitBox
josiahyan commented on pull request #8214: URL: https://github.com/apache/arrow/pull/8214#issuecomment-696466268 > I'm clearly missing something. Why doesn't item 2 when directly in the vector solve the same purpose as 1/3? Sorry, I didn't realize that the ArrowBuf was that

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8236: ARROW-10060: [Rust] [DataFusion] Fixed error on which Err were discarded in MergeExec.

2020-09-21 Thread GitBox
jorgecarleitao commented on a change in pull request #8236: URL: https://github.com/apache/arrow/pull/8236#discussion_r492447876 ## File path: rust/datafusion/src/physical_plan/merge.rs ## @@ -111,9 +111,9 @@ impl ExecutionPlan for MergeExec { let

[GitHub] [arrow] t829702 edited a comment on pull request #2035: ARROW-2116: [JS] implement IPC writers

2020-09-21 Thread GitBox
t829702 edited a comment on pull request #2035: URL: https://github.com/apache/arrow/pull/2035#issuecomment-696480501 > Providing a separate utility in Arrow to parse dates I didn't mean to duplicate JS parsing code, but a way to provide a special parser function to the constructor,

[GitHub] [arrow] xhochy commented on pull request #8218: ARROW-10037: [C++] Workaround to force find AWS SDK to look for shared libraries

2020-09-21 Thread GitBox
xhochy commented on pull request #8218: URL: https://github.com/apache/arrow/pull/8218#issuecomment-696098976 > Uh... was Boost upgraded in the meantime? There are compile errors on AppVeyor: >

[GitHub] [arrow] jorgecarleitao commented on pull request #8172: ARROW-9937: [Rust] [DataFusion] Improved aggregations

2020-09-21 Thread GitBox
jorgecarleitao commented on pull request #8172: URL: https://github.com/apache/arrow/pull/8172#issuecomment-695805702 @andygrove , that is great news! Really good to know that this stands a stronger benchmark. Thanks a lot for taking the time to run it. I rebased against master and

[GitHub] [arrow] emkornfield commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

2020-09-21 Thread GitBox
emkornfield commented on pull request #8052: URL: https://github.com/apache/arrow/pull/8052#issuecomment-696286568

[GitHub] [arrow] wesm closed issue #8217: How to transform the Arrow data column to array of array efficiently?

2020-09-21 Thread GitBox
wesm closed issue #8217: URL: https://github.com/apache/arrow/issues/8217

[GitHub] [arrow] bkietz commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

2020-09-21 Thread GitBox
bkietz commented on pull request #8052: URL: https://github.com/apache/arrow/pull/8052#issuecomment-696303007 Rewinding doesn't strike me as something which needs to be part of the C stream protocol. APIs can still provide rewind and other semantics while using a simple-as-possible stream

[GitHub] [arrow] github-actions[bot] commented on pull request #8228: ARROW-10049: [C++/Python] Sync conda recipe with conda-forge

2020-09-21 Thread GitBox
github-actions[bot] commented on pull request #8228: URL: https://github.com/apache/arrow/pull/8228#issuecomment-695829873

[GitHub] [arrow] liyafan82 commented on pull request #8214: ARROW-9965: [Java] Improve performance of BaseFixedWidthVector.setSafe by optimizing capacity calculations

2020-09-21 Thread GitBox
liyafan82 commented on pull request #8214: URL: https://github.com/apache/arrow/pull/8214#issuecomment-695873809 @josiahyan Thank you for the additional details. I think one of your concern is that, the underlying buffers can be changed unintentionally, which lefts the vector in an

[GitHub] [arrow] emkornfield commented on a change in pull request #8219: ARROW-9603: [C++] Fix parquet write to not assume leaf-array validity bitmaps have the same values as parent structs

2020-09-21 Thread GitBox
emkornfield commented on a change in pull request #8219: URL: https://github.com/apache/arrow/pull/8219#discussion_r492461706 ## File path: cpp/src/parquet/arrow/arrow_reader_writer_test.cc ## @@ -2360,6 +2361,49 @@ TEST(ArrowReadWrite, SingleColumnNullableStruct) { 3);

[GitHub] [arrow] pitrou edited a comment on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

2020-09-21 Thread GitBox
pitrou edited a comment on pull request #8052: URL: https://github.com/apache/arrow/pull/8052#issuecomment-696276591 Would `rewind` go back to the start of stream always?

[GitHub] [arrow] andygrove closed pull request #8102: ARROW-9902: [Rust] [DataFusion] Add array() built-in function

2020-09-21 Thread GitBox
andygrove closed pull request #8102: URL: https://github.com/apache/arrow/pull/8102

[GitHub] [arrow] josiahyan commented on pull request #8214: ARROW-9965: [Java] Improve performance of BaseFixedWidthVector.setSafe by optimizing capacity calculations

2020-09-21 Thread GitBox
josiahyan commented on pull request #8214: URL: https://github.com/apache/arrow/pull/8214#issuecomment-696450195 Here are the results of my testing. I'm not really that familiar with Arrow, and some of the code is sloppy, so please check that what I'm doing matches up with your

[GitHub] [arrow] josiahyan commented on pull request #8214: ARROW-9965: [Java] Improve performance of BaseFixedWidthVector.setSafe by optimizing capacity calculations

2020-09-21 Thread GitBox
josiahyan commented on pull request #8214: URL: https://github.com/apache/arrow/pull/8214#issuecomment-696455205 Sorry, did you mean the specialized append interface (as Option 2), that assumes buffer ownership? I mislabeled the options in the paragraph you quoted (now corrected).

[GitHub] [arrow] jacques-n commented on pull request #8214: ARROW-9965: [Java] Improve performance of BaseFixedWidthVector.setSafe by optimizing capacity calculations

2020-09-21 Thread GitBox
jacques-n commented on pull request #8214: URL: https://github.com/apache/arrow/pull/8214#issuecomment-696459730 > I think there are two opportunities here - simply optimizing setSafe, which can be done by either specializing for the power-of-two size where possible, or by caching sizes

[GitHub] [arrow] jacques-n edited a comment on pull request #8214: ARROW-9965: [Java] Improve performance of BaseFixedWidthVector.setSafe by optimizing capacity calculations

2020-09-21 Thread GitBox
jacques-n edited a comment on pull request #8214: URL: https://github.com/apache/arrow/pull/8214#issuecomment-696459730 > I think there are two opportunities here - simply optimizing setSafe, which can be done by either specializing for the power-of-two size where possible, or by caching

[GitHub] [arrow] wesm commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

2020-09-21 Thread GitBox
wesm commented on pull request #8052: URL: https://github.com/apache/arrow/pull/8052#issuecomment-696357318 Another thing that occurred to me is whether we want to enable batch-level metadata (which would be implementation-defined). This is supported in Flight for example

[GitHub] [arrow] lidavidm commented on pull request #8196: ARROW-10013: [FlightRPC][C++] fix setting generic client options

2020-09-21 Thread GitBox
lidavidm commented on pull request #8196: URL: https://github.com/apache/arrow/pull/8196#issuecomment-696331234 CC @pitrou, this will finally let AppVeyor pass again :)

[GitHub] [arrow] zeroshade commented on pull request #8175: ARROW-8601: [Go][Flight] Implementations Flight RPC server and client

2020-09-21 Thread GitBox
zeroshade commented on pull request #8175: URL: https://github.com/apache/arrow/pull/8175#issuecomment-695894267

[GitHub] [arrow] kszucs commented on a change in pull request #8088: ARROW-9992: [C++][Python] Refactor python to arrow conversions based on a reusable conversion API

2020-09-21 Thread GitBox
kszucs commented on a change in pull request #8088: URL: https://github.com/apache/arrow/pull/8088#discussion_r491890396 ## File path: cpp/src/arrow/util/converter.h ## @@ -0,0 +1,348 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

[GitHub] [arrow] kszucs commented on pull request #8218: ARROW-10037: [C++] Workaround to force find AWS SDK to look for shared libraries

2020-09-21 Thread GitBox
kszucs commented on pull request #8218: URL: https://github.com/apache/arrow/pull/8218#issuecomment-696098116 Seems so, but that's a different issue.

[GitHub] [arrow] pitrou closed pull request #8218: ARROW-10037: [C++] Workaround to force find AWS SDK to look for shared libraries

2020-09-21 Thread GitBox
pitrou closed pull request #8218: URL: https://github.com/apache/arrow/pull/8218

[GitHub] [arrow] jacques-n commented on pull request #8214: ARROW-9965: [Java] Improve performance of BaseFixedWidthVector.setSafe by optimizing capacity calculations

2020-09-21 Thread GitBox
jacques-n commented on pull request #8214: URL: https://github.com/apache/arrow/pull/8214#issuecomment-696453930 > @lidavidm @liyafan82 @jacques-n > Interpreting the results: > This patch could be improved (performance wise) by more aggressive caching (option 3), at the potential

[GitHub] [arrow] lidavidm commented on pull request #8214: ARROW-9965: [Java] Improve performance of BaseFixedWidthVector.setSafe by optimizing capacity calculations

2020-09-21 Thread GitBox
lidavidm commented on pull request #8214: URL: https://github.com/apache/arrow/pull/8214#issuecomment-696458726 I think there are two opportunities here - simply optimizing setSafe, which can be done by either specializing for the power-of-two size where possible, or by caching sizes

[GitHub] [arrow] jacques-n commented on pull request #8214: ARROW-9965: [Java] Improve performance of BaseFixedWidthVector.setSafe by optimizing capacity calculations

2020-09-21 Thread GitBox
jacques-n commented on pull request #8214: URL: https://github.com/apache/arrow/pull/8214#issuecomment-696458884 Why couldn't option 2 be done inside the vector (as opposed to in a wrapper class). ArrowBuf doesn't support reallocation (addr is final). It does allow downsizing but I'm not

[GitHub] [arrow] lidavidm commented on pull request #8214: ARROW-9965: [Java] Improve performance of BaseFixedWidthVector.setSafe by optimizing capacity calculations

2020-09-21 Thread GitBox
lidavidm commented on pull request #8214: URL: https://github.com/apache/arrow/pull/8214#issuecomment-696463757 Sorry - I got too hung up on the idea of a builder, and was thinking ArrowBuf could be reallocated-in-place, which it can't - option 2 and 3 are the same, they just cache values

[GitHub] [arrow] jorgecarleitao commented on issue #8217: How to transform the Arrow data column to array of array efficiently?

2020-09-21 Thread GitBox
jorgecarleitao commented on issue #8217: URL: https://github.com/apache/arrow/issues/8217#issuecomment-695783084 Hi @Zarca, 1. In any particular language? 2. Arrow is a columnar format. Thus, it is already formatted like you wrote. If you mean is the transpose (i.e. `array[i]`

[GitHub] [arrow] pitrou commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

2020-09-21 Thread GitBox
pitrou commented on pull request #8052: URL: https://github.com/apache/arrow/pull/8052#issuecomment-696183147

[GitHub] [arrow] jorgecarleitao commented on pull request #8226: ARROW-10048: [Rust] Fixed error in computing min/max with null entries.

2020-09-21 Thread GitBox
jorgecarleitao commented on pull request #8226: URL: https://github.com/apache/arrow/pull/8226#issuecomment-695813581 fyi @andygrove : I pushed this to #8215, but I did not rebase #8172 against #8215, and thus the error remained. I found this as I was rebasing PRs against master.

[GitHub] [arrow] pitrou commented on pull request #8177: ARROW-8494: [C++][Parquet] Full support for reading mixed list and structs

2020-09-21 Thread GitBox
pitrou commented on pull request #8177: URL: https://github.com/apache/arrow/pull/8177#issuecomment-696148961

[GitHub] [arrow] github-actions[bot] commented on pull request #8230: ARROW-10050: [C++][Gandiva] Implement concat() in Gandiva for up to 10 arguments

2020-09-21 Thread GitBox
github-actions[bot] commented on pull request #8230: URL: https://github.com/apache/arrow/pull/8230#issuecomment-695901406 https://issues.apache.org/jira/browse/ARROW-10050

[GitHub] [arrow] t829702 edited a comment on pull request #2035: ARROW-2116: [JS] implement IPC writers

2020-09-21 Thread GitBox
t829702 edited a comment on pull request #2035: URL: https://github.com/apache/arrow/pull/2035#issuecomment-696480501

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #8188: ARROW-9924: [C++][Dataset] Enable per-column parallelism for single ParquetFileFragment scans

2020-09-21 Thread GitBox
jorisvandenbossche commented on a change in pull request #8188: URL: https://github.com/apache/arrow/pull/8188#discussion_r492271433 ## File path: python/pyarrow/_dataset.pyx ## @@ -1013,27 +1013,38 @@ cdef class ParquetReadOptions(_Weakrefable): dictionary_columns : list

[GitHub] [arrow] pitrou commented on pull request #8218: ARROW-10037: [C++] Workaround to force find AWS SDK to look for shared libraries

2020-09-21 Thread GitBox
pitrou commented on pull request #8218: URL: https://github.com/apache/arrow/pull/8218#issuecomment-696096983

[GitHub] [arrow] github-actions[bot] commented on pull request #8232: ARROW-10051: [C++][Compute] Make aggregate kernel state mutable

2020-09-21 Thread GitBox
github-actions[bot] commented on pull request #8232: URL: https://github.com/apache/arrow/pull/8232#issuecomment-695918114 https://issues.apache.org/jira/browse/ARROW-10051

[GitHub] [arrow] cyb70289 commented on pull request #8232: ARROW-10051: [C++][Compute] Make aggregate kernel state mutable

2020-09-21 Thread GitBox
cyb70289 commented on pull request #8232: URL: https://github.com/apache/arrow/pull/8232#issuecomment-695913785

[GitHub] [arrow] github-actions[bot] commented on pull request #8234: ARROW-10035: [C++] Update vendored libraries

2020-09-21 Thread GitBox
github-actions[bot] commented on pull request #8234: URL: https://github.com/apache/arrow/pull/8234#issuecomment-696300434 https://issues.apache.org/jira/browse/ARROW-10035

[GitHub] [arrow] xhochy removed a comment on pull request #8228: ARROW-10049: [C++/Python] Sync conda recipe with conda-forge

2020-09-21 Thread GitBox
xhochy removed a comment on pull request #8228: URL: https://github.com/apache/arrow/pull/8228#issuecomment-695904892 @github-actions crossbow submit conda-linux-gcc-py36-cpu

[GitHub] [arrow] t829702 commented on pull request #2035: ARROW-2116: [JS] implement IPC writers

2020-09-21 Thread GitBox
t829702 commented on pull request #2035: URL: https://github.com/apache/arrow/pull/2035#issuecomment-696480501 > Providing a separate utility in Arrow to parse dates I didn't mean to duplicate JS parsing code, but a way to provide a special parser function to the constructor,

[GitHub] [arrow] josiahyan commented on pull request #8214: ARROW-9965: [Java] Improve performance of BaseFixedWidthVector.setSafe by optimizing capacity calculations

2020-09-21 Thread GitBox
josiahyan commented on pull request #8214: URL: https://github.com/apache/arrow/pull/8214#issuecomment-696065926

[GitHub] [arrow] nealrichardson closed pull request #8227: ARROW-9946: [R] Check `sink` argument class in `ParquetFileWriter`

2020-09-21 Thread GitBox
nealrichardson closed pull request #8227: URL: https://github.com/apache/arrow/pull/8227

[GitHub] [arrow] pitrou closed pull request #8205: ARROW-9775: [C++] Automatic S3 region selection

2020-09-21 Thread GitBox
pitrou closed pull request #8205: URL: https://github.com/apache/arrow/pull/8205

[GitHub] [arrow] jhorstmann commented on pull request #8223: ARROW-10040: [Rust] Add slice that realigns Buffer

2020-09-21 Thread GitBox
jhorstmann commented on pull request #8223: URL: https://github.com/apache/arrow/pull/8223#issuecomment-695831200

[GitHub] [arrow] wesm commented on a change in pull request #8219: ARROW-9603: [C++] Fix parquet write

2020-09-21 Thread GitBox
wesm commented on a change in pull request #8219: URL: https://github.com/apache/arrow/pull/8219#discussion_r492407465 ## File path: cpp/src/parquet/arrow/arrow_reader_writer_test.cc ## @@ -2360,6 +2361,49 @@ TEST(ArrowReadWrite, SingleColumnNullableStruct) { 3); }

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #8088: ARROW-9992: [C++][Python] Refactor python to arrow conversions based on a reusable conversion API

2020-09-21 Thread GitBox
jorisvandenbossche commented on a change in pull request #8088: URL: https://github.com/apache/arrow/pull/8088#discussion_r488714572 ## File path: python/pyarrow/tests/test_types.py ## @@ -280,6 +284,13 @@ def test_tzinfo_to_string_errors():

[GitHub] [arrow] pitrou closed pull request #8178: ARROW-9969: [C++] Fix RecordBatchBuilder with dictionary types

2020-09-21 Thread GitBox
pitrou closed pull request #8178: URL: https://github.com/apache/arrow/pull/8178

[GitHub] [arrow] vertexclique commented on pull request #8233: ARROW-10055: [Rust] DoubleEndedIterator implementation for NullableIter

2020-09-21 Thread GitBox
vertexclique commented on pull request #8233: URL: https://github.com/apache/arrow/pull/8233#issuecomment-696146358 Hi! Would be nice if I can put this into upstream, there is a dependent implementation I am currently working on. Is it possible to review? @paddyhoran @andygrove

[GitHub] [arrow] pitrou commented on pull request #8229: ARROW-9579: [C++] Provide the plugin API to support customized compression codec for parquet

2020-09-21 Thread GitBox
pitrou commented on pull request #8229: URL: https://github.com/apache/arrow/pull/8229#issuecomment-696176912 Hmm, reading the mailing-list discussion again, I don't think we had agreed on a design. The first question for me is what the end-user API should be. * should the user calling

[GitHub] [arrow] kszucs commented on a change in pull request #8218: ARROW-10037: [C++] Workaround to force find AWS SDK to look for shared libraries

2020-09-21 Thread GitBox
kszucs commented on a change in pull request #8218: URL: https://github.com/apache/arrow/pull/8218#discussion_r491975704 ## File path: cpp/cmake_modules/ThirdpartyToolchain.cmake ## @@ -2697,6 +2703,10 @@ if(ARROW_S3) sts) endif() +

[GitHub] [arrow] hannesmuehleisen commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

2020-09-21 Thread GitBox
hannesmuehleisen commented on pull request #8052: URL: https://github.com/apache/arrow/pull/8052#issuecomment-696276127

[GitHub] [arrow] nealrichardson commented on a change in pull request #8227: ARROW-9946: [R] Check `sink` argument class in `ParquetFileWriter`

2020-09-21 Thread GitBox
nealrichardson commented on a change in pull request #8227: URL: https://github.com/apache/arrow/pull/8227#discussion_r492109660 ## File path: r/R/parquet.R ## @@ -373,6 +380,9 @@ ParquetFileWriter$create <- function(schema, sink,

[GitHub] [arrow] trxcllnt commented on a change in pull request #8216: ARROW-8394: [JS] Upgrade to TypeScript 4.0.2, fix typings for TS 3.9+

2020-09-21 Thread GitBox
trxcllnt commented on a change in pull request #8216: URL: https://github.com/apache/arrow/pull/8216#discussion_r492260498 ## File path: .env ## @@ -30,7 +30,7 @@ LLVM=10 CLANG_TOOLS=8 RUST=nightly-2020-04-22 GO=1.12 -NODE=11 +NODE=14 Review comment: No we still

[GitHub] [arrow] github-actions[bot] commented on pull request #8226: ARROW-10048: [Rust] Fixed error in computing min/max with null entries.

2020-09-21 Thread GitBox
github-actions[bot] commented on pull request #8226: URL: https://github.com/apache/arrow/pull/8226#issuecomment-695813753 https://issues.apache.org/jira/browse/ARROW-10048

[GitHub] [arrow] xhochy commented on pull request #8228: ARROW-10049: [C++/Python] Sync conda recipe with conda-forge

2020-09-21 Thread GitBox
xhochy commented on pull request #8228: URL: https://github.com/apache/arrow/pull/8228#issuecomment-695829727

[GitHub] [arrow] liyafan82 commented on pull request #8194: ARROW-10017: [Java] Fix LargeMemoryUtil long conversion

2020-09-21 Thread GitBox
liyafan82 commented on pull request #8194: URL: https://github.com/apache/arrow/pull/8194#issuecomment-695884515 Merging. Thanks for the PR @pwoody

[GitHub] [arrow] andygrove commented on a change in pull request #8172: ARROW-9937: [Rust] [DataFusion] Improved aggregations

2020-09-21 Thread GitBox
andygrove commented on a change in pull request #8172: URL: https://github.com/apache/arrow/pull/8172#discussion_r491702697 ## File path: rust/datafusion/src/sql/planner.rs ## @@ -343,7 +343,7 @@ impl<'a, S: SchemaProvider> SqlToRel<'a, S> { match *limit {

[GitHub] [arrow] jorgecarleitao closed pull request #8215: ARROW-9977: [Rust] Added min/max of [Large]StringArray

2020-09-21 Thread GitBox
jorgecarleitao closed pull request #8215: URL: https://github.com/apache/arrow/pull/8215

[GitHub] [arrow] github-actions[bot] commented on pull request #8236: ARROW-10060: [Rust] [DataFusion] Fixed error on which Err were discarded in MergeExec.

2020-09-21 Thread GitBox
github-actions[bot] commented on pull request #8236: URL: https://github.com/apache/arrow/pull/8236#issuecomment-696484939 https://issues.apache.org/jira/browse/ARROW-10060

[GitHub] [arrow] andygrove closed pull request #8118: ARROW-9922: [Rust] Add StructArray::TryFrom (+40%)

2020-09-21 Thread GitBox
andygrove closed pull request #8118: URL: https://github.com/apache/arrow/pull/8118

[GitHub] [arrow] pitrou commented on pull request #8205: ARROW-9775: [C++] Automatic S3 region selection

2020-09-21 Thread GitBox
pitrou commented on pull request #8205: URL: https://github.com/apache/arrow/pull/8205#issuecomment-696085494

[GitHub] [arrow] emkornfield commented on a change in pull request #8219: ARROW-9603: [C++] Fix parquet write to not assume leaf-array validity bitmaps have the same values as parent structs

2020-09-21 Thread GitBox
emkornfield commented on a change in pull request #8219: URL: https://github.com/apache/arrow/pull/8219#discussion_r492465434 ## File path: cpp/src/parquet/arrow/path_internal.cc ## @@ -871,6 +877,8 @@ class MultipathLevelBuilderImpl : public MultipathLevelBuilder {

[GitHub] [arrow] emkornfield commented on pull request #8219: ARROW-9603: [C++] Fix parquet write to not assume leaf-array validity bitmaps have the same values as parent structs

2020-09-21 Thread GitBox
emkornfield commented on pull request #8219: URL: https://github.com/apache/arrow/pull/8219#issuecomment-696503073 @xhochy did you want to review?

[GitHub] [arrow] github-actions[bot] commented on pull request #8233: ARROW-10055: [Rust] DoubleEndedIterator implementation for NullableIter

2020-09-21 Thread GitBox
github-actions[bot] commented on pull request #8233: URL: https://github.com/apache/arrow/pull/8233#issuecomment-696149597 https://issues.apache.org/jira/browse/ARROW-10055

[GitHub] [arrow] josiahyan edited a comment on pull request #8214: ARROW-9965: [Java] Improve performance of BaseFixedWidthVector.setSafe by optimizing capacity calculations

2020-09-21 Thread GitBox
josiahyan edited a comment on pull request #8214: URL: https://github.com/apache/arrow/pull/8214#issuecomment-696381588

[GitHub] [arrow] andygrove commented on a change in pull request #8118: ARROW-9922: [Rust] Add StructArray::TryFrom (+40%)

2020-09-21 Thread GitBox
andygrove commented on a change in pull request #8118: URL: https://github.com/apache/arrow/pull/8118#discussion_r492065611 ## File path: rust/arrow/src/array/array.rs ## @@ -834,7 +840,7 @@ impl From>> for BooleanArray { fn from(data: Vec>) -> Self { let

[GitHub] [arrow] andygrove closed pull request #8226: ARROW-10048: [Rust] Fixed error in computing min/max with null entries.

2020-09-21 Thread GitBox
andygrove closed pull request #8226: URL: https://github.com/apache/arrow/pull/8226 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [arrow] jhorstmann commented on a change in pull request #8222: ARROW-10043: [Rust][DataFusion] Implement COUNT(DISTINCT col)

2020-09-21 Thread GitBox
jhorstmann commented on a change in pull request #8222: URL: https://github.com/apache/arrow/pull/8222#discussion_r491869035 ## File path: rust/datafusion/src/physical_plan/distinct_expressions.rs ## @@ -0,0 +1,303 @@ +// Licensed to the Apache Software Foundation (ASF) under

[GitHub] [arrow] wesm commented on a change in pull request #8219: ARROW-9603: [C++] Fix parquet write

2020-09-21 Thread GitBox
wesm commented on a change in pull request #8219: URL: https://github.com/apache/arrow/pull/8219#discussion_r492407465 ## File path: cpp/src/parquet/arrow/arrow_reader_writer_test.cc ## @@ -2360,6 +2361,49 @@ TEST(ArrowReadWrite, SingleColumnNullableStruct) { 3); }

[GitHub] [arrow] jorgecarleitao opened a new pull request #8236: ARROW-10060: [Rust] [DataFusion] Fixed error on which Err were discarded in MergeExec.

2020-09-21 Thread GitBox
jorgecarleitao opened a new pull request #8236: URL: https://github.com/apache/arrow/pull/8236 Just found this sneaky error while working on UDAFs... This is an automated message from the Apache Git Service. To respond to

[GitHub] [arrow] t829702 edited a comment on pull request #2035: ARROW-2116: [JS] implement IPC writers

2020-09-21 Thread GitBox
t829702 edited a comment on pull request #2035: URL: https://github.com/apache/arrow/pull/2035#issuecomment-696480501 > Providing a separate utility in Arrow to parse dates I didn't mean to duplicate JS parsing code, but a way to provide a special parser function to the constructor,

[GitHub] [arrow] xhochy commented on a change in pull request #8218: ARROW-10037: [C++] Workaround to force find AWS SDK to look for shared libraries

2020-09-21 Thread GitBox
xhochy commented on a change in pull request #8218: URL: https://github.com/apache/arrow/pull/8218#discussion_r492031242 ## File path: ci/conda_env_cpp.yml ## @@ -17,7 +17,7 @@ aws-sdk-cpp benchmark=1.4.1 -boost-cpp>=1.68.0 Review comment: The lower-limit is no

[GitHub] [arrow] github-actions[bot] commented on pull request #8229: ARROW-9579: [C++] Provide the plugin API to support customized compression codec for parquet

2020-09-21 Thread GitBox
github-actions[bot] commented on pull request #8229: URL: https://github.com/apache/arrow/pull/8229#issuecomment-695862390 https://issues.apache.org/jira/browse/ARROW-9579 This is an automated message from the Apache Git

[GitHub] [arrow] emkornfield commented on a change in pull request #8177: ARROW-8494: [C++][Parquet] Full support for reading mixed list and structs

2020-09-21 Thread GitBox
emkornfield commented on a change in pull request #8177: URL: https://github.com/apache/arrow/pull/8177#discussion_r492160325 ## File path: cpp/src/parquet/CMakeLists.txt ## @@ -202,6 +203,19 @@ set(PARQUET_SRCS stream_writer.cc types.cc) +if(CXX_SUPPORTS_AVX2) +

[GitHub] [arrow] pitrou commented on a change in pull request #8177: ARROW-8494: [C++][Parquet] Full support for reading mixed list and structs

2020-09-21 Thread GitBox
pitrou commented on a change in pull request #8177: URL: https://github.com/apache/arrow/pull/8177#discussion_r492084011 ## File path: cpp/src/parquet/CMakeLists.txt ## @@ -202,6 +203,19 @@ set(PARQUET_SRCS stream_writer.cc types.cc) +if(CXX_SUPPORTS_AVX2) + #

[GitHub] [arrow] andygrove closed pull request #8221: ARROW-9338: [Rust] Add clippy instructions

2020-09-21 Thread GitBox
andygrove closed pull request #8221: URL: https://github.com/apache/arrow/pull/8221 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [arrow] t829702 commented on pull request #2035: ARROW-2116: [JS] implement IPC writers

2020-09-21 Thread GitBox
t829702 commented on pull request #2035: URL: https://github.com/apache/arrow/pull/2035#issuecomment-696480501 > Providing a separate utility in Arrow to parse dates I didn't mean to duplicate JS parsing code, but a way to provide a special parser function to the constructor,

[GitHub] [arrow] wesm commented on pull request #8219: ARROW-9603: [C++] Fix parquet write

2020-09-21 Thread GitBox
wesm commented on pull request #8219: URL: https://github.com/apache/arrow/pull/8219#issuecomment-696248994 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [arrow] andygrove closed pull request #8172: ARROW-9937: [Rust] [DataFusion] Improved aggregations

2020-09-21 Thread GitBox
andygrove closed pull request #8172: URL: https://github.com/apache/arrow/pull/8172 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [arrow] andygrove commented on a change in pull request #8224: ARROW-10044: [Rust] Improved Arrow's README.

2020-09-21 Thread GitBox
andygrove commented on a change in pull request #8224: URL: https://github.com/apache/arrow/pull/8224#discussion_r492069938 ## File path: rust/arrow/README.md ## @@ -21,10 +21,62 @@ [![Coverage

[GitHub] [arrow] liyafan82 closed pull request #8194: ARROW-10017: [Java] Fix LargeMemoryUtil long conversion

2020-09-21 Thread GitBox
liyafan82 closed pull request #8194: URL: https://github.com/apache/arrow/pull/8194 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [arrow] vertexclique edited a comment on pull request #8233: ARROW-10055: [Rust] DoubleEndedIterator implementation for NullableIter

2020-09-21 Thread GitBox
vertexclique edited a comment on pull request #8233: URL: https://github.com/apache/arrow/pull/8233#issuecomment-696146358 Hi! Would be nice if I can merge this into upstream, there is a dependent implementation I am currently working on. Is it possible to review it? @paddyhoran

[GitHub] [arrow] emkornfield commented on a change in pull request #8219: ARROW-9603: [C++] Fix parquet write

2020-09-21 Thread GitBox
emkornfield commented on a change in pull request #8219: URL: https://github.com/apache/arrow/pull/8219#discussion_r492259759 ## File path: cpp/src/parquet/arrow/path_internal.cc ## @@ -838,10 +841,13 @@ class PathBuilder { #undef NOT_IMPLEMENTED_VISIT std::vector& paths()

[GitHub] [arrow] emkornfield commented on pull request #8229: ARROW-9579: [C++] Provide the plugin API to support customized compression codec for parquet

2020-09-21 Thread GitBox
emkornfield commented on pull request #8229: URL: https://github.com/apache/arrow/pull/8229#issuecomment-696146448 Thank you for the PR this will likely need a great deal of review from both code and design perspective. Before it is reviewed it should have thorough unit tests. And since

[GitHub] [arrow] github-actions[bot] commented on pull request #8227: ARROW-9946: [R] Check `sink` argument class in `ParquetFileWriter`

2020-09-21 Thread GitBox
github-actions[bot] commented on pull request #8227: URL: https://github.com/apache/arrow/pull/8227#issuecomment-695821544 https://issues.apache.org/jira/browse/ARROW-9946 This is an automated message from the Apache Git

[GitHub] [arrow] winningsix commented on pull request #8229: ARROW-9579: [C++] Provide the plugin API to support customized compression codec for parquet

2020-09-21 Thread GitBox
winningsix commented on pull request #8229: URL: https://github.com/apache/arrow/pull/8229#issuecomment-696193668 @pitrou @emkornfield FYI. This is Java side PR. https://github.com/apache/parquet-mr/pull/803/files This

[GitHub] [arrow] andygrove commented on pull request #8222: ARROW-10043: [Rust][DataFusion] Implement COUNT(DISTINCT col)

2020-09-21 Thread GitBox
andygrove commented on pull request #8222: URL: https://github.com/apache/arrow/pull/8222#issuecomment-695804292 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and
