[GitHub] [arrow] jianxind commented on pull request #8049: ARROW-9851: [C++] Disable AVX512 runtime paths with Valgrind

2020-08-25 Thread GitBox
jianxind commented on pull request #8049: URL: https://github.com/apache/arrow/pull/8049#issuecomment-680371722 > @jianxind What is your opinion on the approach here? It's fine to introduce an ARROW_RUNTIME_SIMD_LEVEL flag, which gives users more options. Another potential approach is the

[GitHub] [arrow] andygrove closed pull request #8034: ARROW-9464: [Rust] [DataFusion] Physical plan optimization rule to insert MergeExec when needed

2020-08-25 Thread GitBox
andygrove closed pull request #8034: URL: https://github.com/apache/arrow/pull/8034 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [arrow] jianxind commented on pull request #8049: ARROW-9851: [C++] Disable AVX512 runtime paths with Valgrind

2020-08-25 Thread GitBox
jianxind commented on pull request #8049: URL: https://github.com/apache/arrow/pull/8049#issuecomment-680426063 > OK. Then I think that the environment variable approach is better because we already have `ARROW_USER_SIMD_LEVEL` support (I didn't know!). The following patch will work: >

[GitHub] [arrow] emkornfield commented on pull request #7326: ARROW-9010: [Java] Framework and interface changes for RecordBatch IPC buffer compression

2020-08-25 Thread GitBox
emkornfield commented on pull request #7326: URL: https://github.com/apache/arrow/pull/7326#issuecomment-680574395 Thanks @liyafan82. I added a comment: I think the change puts back more places than necessary to check for null? I might be missing something though.

[GitHub] [arrow] emkornfield commented on a change in pull request #7326: ARROW-9010: [Java] Framework and interface changes for RecordBatch IPC buffer compression

2020-08-25 Thread GitBox
emkornfield commented on a change in pull request #7326: URL: https://github.com/apache/arrow/pull/7326#discussion_r477024511 ## File path: java/vector/src/main/java/org/apache/arrow/vector/ipc/message/ArrowRecordBatch.java ## @@ -194,12 +190,17 @@ public int

[GitHub] [arrow] emkornfield commented on pull request #7973: ARROW-8493: [C++][Parquet] Start populating repeated ancestor defintion

2020-08-25 Thread GitBox
emkornfield commented on pull request #7973: URL: https://github.com/apache/arrow/pull/7973#issuecomment-680594450 @pitrou does the updated PR look OK?

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8032: ARROW-9836: [Rust][DataFusion] Improve API for usage of UDFs

2020-08-25 Thread GitBox
jorgecarleitao commented on a change in pull request #8032: URL: https://github.com/apache/arrow/pull/8032#discussion_r477045153 ## File path: rust/datafusion/src/logicalplan.rs ## @@ -1042,6 +998,35 @@ pub fn can_coerce_from(type_into: , type_from: ) -> bool { } }

[GitHub] [arrow] hannesmuehleisen commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

2020-08-25 Thread GitBox
hannesmuehleisen commented on pull request #8052: URL: https://github.com/apache/arrow/pull/8052#issuecomment-680673004 I have one suggestion: The proposal indicates that the stream is finished when the array resulting from get_next is released. This seems a bit odd, how about just

[GitHub] [arrow] kou commented on pull request #8049: ARROW-9851: [C++] Disable AVX512 runtime paths with Valgrind

2020-08-25 Thread GitBox
kou commented on pull request #8049: URL: https://github.com/apache/arrow/pull/8049#issuecomment-680501869 > ARROW_USER_SIMD_LEVEL is a ENV variants, will this change pass AVX2 to ARROW_USER_SIMD_LEVEL env? Yes. But the patch was wrong. We need to use lower case for

[GitHub] [arrow] emkornfield commented on a change in pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

2020-08-25 Thread GitBox
emkornfield commented on a change in pull request #8052: URL: https://github.com/apache/arrow/pull/8052#discussion_r477022712 ## File path: cpp/src/arrow/c/abi.h ## @@ -60,6 +60,31 @@ struct ArrowArray { void* private_data; }; +// EXPERIMENTAL +struct ArrowArrayStream {

[GitHub] [arrow] emkornfield commented on a change in pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

2020-08-25 Thread GitBox
emkornfield commented on a change in pull request #8052: URL: https://github.com/apache/arrow/pull/8052#discussion_r477023156 ## File path: cpp/src/arrow/c/abi.h ## @@ -60,6 +60,31 @@ struct ArrowArray { void* private_data; }; +// EXPERIMENTAL +struct ArrowArrayStream {

[GitHub] [arrow] emkornfield commented on a change in pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

2020-08-25 Thread GitBox
emkornfield commented on a change in pull request #8052: URL: https://github.com/apache/arrow/pull/8052#discussion_r477022487 ## File path: cpp/src/arrow/c/abi.h ## @@ -60,6 +60,31 @@ struct ArrowArray { void* private_data; }; +// EXPERIMENTAL +struct ArrowArrayStream {

[GitHub] [arrow] arw2019 commented on a change in pull request #8044: ARROW-7663: [Python] Raise better error message when passing mixed-type (int/string) Pandas dataframe to pyarrow Table

2020-08-25 Thread GitBox
arw2019 commented on a change in pull request #8044: URL: https://github.com/apache/arrow/pull/8044#discussion_r476648698 ## File path: python/pyarrow/tests/test_convert_builtin.py ## @@ -382,11 +382,8 @@ def test_sequence_custom_integers(seq):

[GitHub] [arrow] arw2019 opened a new pull request #8055: ARROW-7226: [Python][Doc] Add note re: JSON format support

2020-08-25 Thread GitBox
arw2019 opened a new pull request #8055: URL: https://github.com/apache/arrow/pull/8055

[GitHub] [arrow] github-actions[bot] commented on pull request #8055: ARROW-7226: [Python][Doc] Add note re: JSON format support

2020-08-25 Thread GitBox
github-actions[bot] commented on pull request #8055: URL: https://github.com/apache/arrow/pull/8055#issuecomment-680611318 https://issues.apache.org/jira/browse/ARROW-7226

[GitHub] [arrow] emkornfield commented on a change in pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

2020-08-25 Thread GitBox
emkornfield commented on a change in pull request #8052: URL: https://github.com/apache/arrow/pull/8052#discussion_r477023405 ## File path: cpp/src/arrow/c/abi.h ## @@ -60,6 +60,31 @@ struct ArrowArray { void* private_data; }; +// EXPERIMENTAL +struct ArrowArrayStream {

[GitHub] [arrow] emkornfield commented on a change in pull request #8023: ARROW-9318: [C++] Parquet encryption key management

2020-08-25 Thread GitBox
emkornfield commented on a change in pull request #8023: URL: https://github.com/apache/arrow/pull/8023#discussion_r477025570 ## File path: cpp/src/parquet/encryption.h ## @@ -47,15 +47,15 @@ using ColumnPathToEncryptionPropertiesMap = class PARQUET_EXPORT

[GitHub] [arrow] emkornfield commented on pull request #8015: PARQUET-1899: [C++] Deprecated ReadValuesSpaced

2020-08-25 Thread GitBox
emkornfield commented on pull request #8015: URL: https://github.com/apache/arrow/pull/8015#issuecomment-680668457 Thanks will give that a try (likely not for a little bit since working on nested functionality is going to be my main focus).

[GitHub] [arrow] jorgecarleitao commented on pull request #8045: Simplified argument types of ScalarFunctions.

2020-08-25 Thread GitBox
jorgecarleitao commented on pull request #8045: URL: https://github.com/apache/arrow/pull/8045#issuecomment-679842380 FYI @andygrove and @alamb

[GitHub] [arrow] jorgecarleitao opened a new pull request #8045: Simplified argument types of ScalarFunctions.

2020-08-25 Thread GitBox
jorgecarleitao opened a new pull request #8045: URL: https://github.com/apache/arrow/pull/8045 Deprecates "Field" as argument to the UDF declaration, since we are only using its type. This is a spin-off of #8032 with a much smaller scope, as the other one is getting too large to

[GitHub] [arrow] fredgan opened a new pull request #8046: ARROW-9850:[Go] Defer should not be used inside a loop

2020-08-25 Thread GitBox
fredgan opened a new pull request #8046: URL: https://github.com/apache/arrow/pull/8046 As is described in the second section in https://blog.learngoprogramming.com/gotchas-of-defer-in-go-1-8d070894cb01 defer inside the loop may cause unforeseen problems.

[GitHub] [arrow] github-actions[bot] commented on pull request #8046: ARROW-9850:[Go] Defer should not be used inside a loop

2020-08-25 Thread GitBox
github-actions[bot] commented on pull request #8046: URL: https://github.com/apache/arrow/pull/8046#issuecomment-679892653 https://issues.apache.org/jira/browse/ARROW-9850

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8034: ARROW-9464: [Rust] [DataFusion] Physical plan optimization rule to insert MergeExec when needed

2020-08-25 Thread GitBox
jorgecarleitao commented on a change in pull request #8034: URL: https://github.com/apache/arrow/pull/8034#discussion_r476553596 ## File path: rust/datafusion/src/execution/physical_plan/planner.rs ## @@ -61,6 +61,55 @@ impl PhysicalPlanner for DefaultPhysicalPlanner {

[GitHub] [arrow] wqc200 commented on a change in pull request #8033: ARROW-9837: [Rust][DataFusion] Add provider for variable

2020-08-25 Thread GitBox
wqc200 commented on a change in pull request #8033: URL: https://github.com/apache/arrow/pull/8033#discussion_r476572025 ## File path: rust/datafusion/src/variable/system.rs ## @@ -0,0 +1,18 @@ +use crate::logicalplan::ScalarValue; +use crate::error::Result; +use

[GitHub] [arrow] jorgecarleitao commented on pull request #8045: ARROW-9849: [Rust] [DataFusion] Simplified argument types of ScalarFunctions.

2020-08-25 Thread GitBox
jorgecarleitao commented on pull request #8045: URL: https://github.com/apache/arrow/pull/8045#issuecomment-679988326 > Could you please add the JIRA number and the category to the title? wops. Thanks @kiszk , forgot about it.

[GitHub] [arrow] github-actions[bot] commented on pull request #8053: ARROW-9855: [R] Fix bad merge/Rcpp conflict

2020-08-25 Thread GitBox
github-actions[bot] commented on pull request #8053: URL: https://github.com/apache/arrow/pull/8053#issuecomment-680193503 https://issues.apache.org/jira/browse/ARROW-9855

[GitHub] [arrow] nealrichardson opened a new pull request #8053: ARROW-9855: [R] Fix bad merge/Rcpp conflict

2020-08-25 Thread GitBox
nealrichardson opened a new pull request #8053: URL: https://github.com/apache/arrow/pull/8053 Also adds a trailing slash to a URL, the lack of which technically causes a redirect and that led CRAN to pull the latest `arrow` submission for manual inspection 

[GitHub] [arrow] pitrou commented on pull request #8050: ARROW-9852: [C++] Validate dictionaries fully on IPC read

2020-08-25 Thread GitBox
pitrou commented on pull request #8050: URL: https://github.com/apache/arrow/pull/8050#issuecomment-680108214 Another option: only validate dictionaries when deltas actually appear, and explicitly disallow delta dictionaries on the GPU (in any case, I don't think array concatenation

[GitHub] [arrow] emkornfield commented on pull request #8040: ARROW-9824: [C++] Export file_offset in RowGroupMetaData

2020-08-25 Thread GitBox
emkornfield commented on pull request #8040: URL: https://github.com/apache/arrow/pull/8040#issuecomment-680108252 Since this is simply a proxy object around the underlying metadata I think it is OK to expose but should have some documentation around the value of this field if the offset

[GitHub] [arrow] jhorstmann opened a new pull request #8051: ARROW-9853: [RUST] Implement take kernel for dictionary arrays

2020-08-25 Thread GitBox
jhorstmann opened a new pull request #8051: URL: https://github.com/apache/arrow/pull/8051

[GitHub] [arrow] jhorstmann commented on pull request #8051: ARROW-9853: [RUST] Implement take kernel for dictionary arrays

2020-08-25 Thread GitBox
jhorstmann commented on pull request #8051: URL: https://github.com/apache/arrow/pull/8051#issuecomment-680176701 @andygrove @jorgecarleitao can I ask for your review?

[GitHub] [arrow] nealrichardson closed pull request #8053: ARROW-9855: [R] Fix bad merge/Rcpp conflict

2020-08-25 Thread GitBox
nealrichardson closed pull request #8053: URL: https://github.com/apache/arrow/pull/8053

[GitHub] [arrow] github-actions[bot] commented on pull request #8051: ARROW-9853: [RUST] Implement take kernel for dictionary arrays

2020-08-25 Thread GitBox
github-actions[bot] commented on pull request #8051: URL: https://github.com/apache/arrow/pull/8051#issuecomment-680153274 https://issues.apache.org/jira/browse/ARROW-9853

[GitHub] [arrow] pitrou opened a new pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

2020-08-25 Thread GitBox
pitrou opened a new pull request #8052: URL: https://github.com/apache/arrow/pull/8052 The goal is to have a standardized ABI to communicate streams of homogeneous arrays or record batches (for example for database result sets). The trickiest part is error reporting. This proposal

[GitHub] [arrow] pitrou commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

2020-08-25 Thread GitBox
pitrou commented on pull request #8052: URL: https://github.com/apache/arrow/pull/8052#issuecomment-680167277 cc @wesm

[GitHub] [arrow] github-actions[bot] commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

2020-08-25 Thread GitBox
github-actions[bot] commented on pull request #8052: URL: https://github.com/apache/arrow/pull/8052#issuecomment-680168437 https://issues.apache.org/jira/browse/ARROW-9761

[GitHub] [arrow] bkietz commented on a change in pull request #8041: ARROW-8001: [R][Dataset] Bindings for dataset writing

2020-08-25 Thread GitBox
bkietz commented on a change in pull request #8041: URL: https://github.com/apache/arrow/pull/8041#discussion_r476555171 ## File path: r/vignettes/dataset.Rmd ## @@ -281,3 +284,79 @@ this would mean you could point to an S3 bucked of Parquet data and a directory of CSVs on

[GitHub] [arrow] emkornfield commented on a change in pull request #8011: ARROW-9803: [Go] Add initial support for s390x

2020-08-25 Thread GitBox
emkornfield commented on a change in pull request #8011: URL: https://github.com/apache/arrow/pull/8011#discussion_r476565438 ## File path: go/arrow/type_traits_decimal128.go ## @@ -39,8 +40,13 @@ func (decimal128Traits) BytesRequired(n int) int { return Decimal128SizeBytes *

[GitHub] [arrow] nealrichardson closed pull request #8041: ARROW-8001: [R][Dataset] Bindings for dataset writing

2020-08-25 Thread GitBox
nealrichardson closed pull request #8041: URL: https://github.com/apache/arrow/pull/8041

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8051: ARROW-9853: [RUST] Implement take kernel for dictionary arrays

2020-08-25 Thread GitBox
jorgecarleitao commented on a change in pull request #8051: URL: https://github.com/apache/arrow/pull/8051#discussion_r476637153 ## File path: rust/arrow/src/compute/kernels/take.rs ## @@ -657,4 +694,71 @@ mod tests { vec![None], ); } + +#[test]

[GitHub] [arrow] pitrou commented on a change in pull request #8008: ARROW-9369: [Python] Support conversion from python sequence to dictionary type

2020-08-25 Thread GitBox
pitrou commented on a change in pull request #8008: URL: https://github.com/apache/arrow/pull/8008#discussion_r476391155 ## File path: cpp/src/arrow/python/python_to_arrow.cc ## @@ -903,6 +897,75 @@ class FixedSizeListConverter : public BaseListConverter +class

[GitHub] [arrow] kiszk commented on a change in pull request #8035: ARROW-9641: [C++][Gandiva] Implement round() for floating point and double floating point numbers

2020-08-25 Thread GitBox
kiszk commented on a change in pull request #8035: URL: https://github.com/apache/arrow/pull/8035#discussion_r476395110 ## File path: cpp/src/gandiva/precompiled/extended_math_ops_test.cc ## @@ -87,6 +87,20 @@ TEST(TestExtendedMathOps, TestLogWithBase) {

[GitHub] [arrow] pitrou commented on pull request #8049: ARROW-9851: [C++] Disable AVX512 runtime paths with Valgrind

2020-08-25 Thread GitBox
pitrou commented on pull request #8049: URL: https://github.com/apache/arrow/pull/8049#issuecomment-680022432 @github-actions crossbow submit conda-cpp-valgrind

[GitHub] [arrow] pitrou opened a new pull request #8049: ARROW-9851: [C++] Disable AVX512 runtime paths with Valgrind

2020-08-25 Thread GitBox
pitrou opened a new pull request #8049: URL: https://github.com/apache/arrow/pull/8049 Valgrind is still lacking support for AVX512 instructions: https://bugs.kde.org/show_bug.cgi?id=383010

[GitHub] [arrow] pitrou commented on pull request #8049: ARROW-9851: [C++] Disable AVX512 runtime paths with Valgrind

2020-08-25 Thread GitBox
pitrou commented on pull request #8049: URL: https://github.com/apache/arrow/pull/8049#issuecomment-680022894 @ursabot crossbow submit conda-cpp-valgrind

[GitHub] [arrow] pitrou commented on a change in pull request #8036: ARROW-9811: [C++] Unchecked floating point division by 0 should succeed

2020-08-25 Thread GitBox
pitrou commented on a change in pull request #8036: URL: https://github.com/apache/arrow/pull/8036#discussion_r476467763 ## File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc ## @@ -165,12 +166,17 @@ class TestBinaryArithmetic : public TestBase { void

[GitHub] [arrow] pitrou commented on a change in pull request #7507: ARROW-8797: [C++] Read RecordBatch in a different endian

2020-08-25 Thread GitBox
pitrou commented on a change in pull request #7507: URL: https://github.com/apache/arrow/pull/7507#discussion_r476481191 ## File path: cpp/src/arrow/type.h ## @@ -1582,13 +1583,26 @@ class ARROW_EXPORT FieldRef { //

[GitHub] [arrow] pitrou commented on pull request #8050: ARROW-9852: [C++] Validate dictionaries fully on IPC read

2020-08-25 Thread GitBox
pitrou commented on pull request #8050: URL: https://github.com/apache/arrow/pull/8050#issuecomment-680077956 @wesm Need your opinion here.

[GitHub] [arrow] pitrou opened a new pull request #8050: ARROW-9852: [C++] Validate dictionaries fully on IPC read

2020-08-25 Thread GitBox
pitrou opened a new pull request #8050: URL: https://github.com/apache/arrow/pull/8050 Finally, there's no way around it. We need O(N) validation of dictionaries received over IPC, because concatenating deltas may involve arbitrary reads (for example, nested dictionaries are compared for
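
To make the "O(N) validation" idea concrete, here is a rough sketch of a linear-time check over the offsets buffer of a variable-length (string-like) array. It is an assumption-laden illustration only: the function name `validateOffsets` and the exact checks are hypothetical and are not taken from the Arrow C++ code in this pull request. The point is that every offset must be inspected once before the data can be read safely.

```go
package main

import (
	"errors"
	"fmt"
)

// validateOffsets checks the offsets buffer of a variable-length array
// against the length of its data buffer. It visits each offset once, so
// the cost is linear in the number of elements.
// Assumption for this sketch: an array of N values carries N+1 offsets,
// so an empty offsets buffer is treated as invalid.
func validateOffsets(offsets []int32, dataLen int) error {
	if len(offsets) == 0 {
		return errors.New("offsets buffer is empty")
	}
	if offsets[0] < 0 {
		return errors.New("first offset is negative")
	}
	for i := 1; i < len(offsets); i++ {
		if offsets[i] < offsets[i-1] {
			return fmt.Errorf("offset %d decreases: %d < %d", i, offsets[i], offsets[i-1])
		}
	}
	if last := offsets[len(offsets)-1]; int(last) > dataLen {
		return fmt.Errorf("last offset %d exceeds data length %d", last, dataLen)
	}
	return nil
}

func main() {
	data := []byte("foobarbaz")
	fmt.Println(validateOffsets([]int32{0, 3, 6, 9}, len(data)))  // valid input
	fmt.Println(validateOffsets([]int32{0, 3, 2, 9}, len(data)))  // decreasing offset
	fmt.Println(validateOffsets([]int32{0, 3, 6, 12}, len(data))) // offset past the data
}
```

Without a check of this kind, a reader that slices the data buffer by untrusted offsets could touch memory outside the buffer.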

[GitHub] [arrow] pitrou commented on pull request #8049: ARROW-9851: [C++] Disable AVX512 runtime paths with Valgrind

2020-08-25 Thread GitBox
pitrou commented on pull request #8049: URL: https://github.com/apache/arrow/pull/8049#issuecomment-680089365 @ursabot crossbow submit test-conda-cpp-valgrind

[GitHub] [arrow] pitrou commented on pull request #8040: ARROW-9824: [C++] Export file_offset in RowGroupMetaData

2020-08-25 Thread GitBox
pitrou commented on pull request #8040: URL: https://github.com/apache/arrow/pull/8040#issuecomment-679991489 @emkornfield Do you think this is ok to add?

[GitHub] [arrow] ursabot commented on pull request #8049: ARROW-9851: [C++] Disable AVX512 runtime paths with Valgrind

2020-08-25 Thread GitBox
ursabot commented on pull request #8049: URL: https://github.com/apache/arrow/pull/8049#issuecomment-680024980 [AMD64 Conda Crossbow Submit (#123788)](https://ci.ursalabs.org/#builders/98/builds/648) builder has been succeeded. Revision: c2c7526152ef73b7b54949ecd266594181085454

[GitHub] [arrow] kiszk commented on a change in pull request #7507: ARROW-8797: [C++] Read RecordBatch in a different endian

2020-08-25 Thread GitBox
kiszk commented on a change in pull request #7507: URL: https://github.com/apache/arrow/pull/7507#discussion_r476447442 ## File path: cpp/src/arrow/array/util.cc ## @@ -74,6 +74,177 @@ class ArrayDataWrapper { std::shared_ptr* out_; }; +class ArrayDataEndianSwapper { +

[GitHub] [arrow] pitrou commented on pull request #8049: ARROW-9851: [C++] Disable AVX512 runtime paths with Valgrind

2020-08-25 Thread GitBox
pitrou commented on pull request #8049: URL: https://github.com/apache/arrow/pull/8049#issuecomment-680024383 @ursabot crossbow submit test-conda-cpp-valgrind

[GitHub] [arrow] vivkong commented on a change in pull request #8011: ARROW-9803: [Go] Add initial support for s390x

2020-08-25 Thread GitBox
vivkong commented on a change in pull request #8011: URL: https://github.com/apache/arrow/pull/8011#discussion_r476454888 ## File path: go/arrow/type_traits_decimal128.go ## @@ -39,8 +40,13 @@ func (decimal128Traits) BytesRequired(n int) int { return Decimal128SizeBytes *

[GitHub] [arrow] alamb commented on a change in pull request #8020: ARROW-9821: [Rust][DataFusion] Prototype design for UserDefined Logical Plan Nodes NOT FOR MERGING

2020-08-25 Thread GitBox
alamb commented on a change in pull request #8020: URL: https://github.com/apache/arrow/pull/8020#discussion_r476455934 ## File path: rust/datafusion/src/lp_limit.rs ## @@ -0,0 +1,99 @@ +//! Example of how a "User Defined logical plan node would work. Use +//! the

[GitHub] [arrow] github-actions[bot] commented on pull request #8050: ARROW-9852: [C++] Validate dictionaries fully on IPC read

2020-08-25 Thread GitBox
github-actions[bot] commented on pull request #8050: URL: https://github.com/apache/arrow/pull/8050#issuecomment-680084692 https://issues.apache.org/jira/browse/ARROW-9852

[GitHub] [arrow] ursabot commented on pull request #8049: ARROW-9851: [C++] Disable AVX512 runtime paths with Valgrind

2020-08-25 Thread GitBox
ursabot commented on pull request #8049: URL: https://github.com/apache/arrow/pull/8049#issuecomment-680090251 [AMD64 Conda Crossbow Submit (#123831)](https://ci.ursalabs.org/#builders/98/builds/651) builder has been succeeded. Revision: 5461e01225db7eaca9484213c0169f8030a3e3ba

[GitHub] [arrow] pitrou commented on a change in pull request #7963: ARROW-9699: [C++][Compute] Optimize mode kernel for small integer types

2020-08-25 Thread GitBox
pitrou commented on a change in pull request #7963: URL: https://github.com/apache/arrow/pull/7963#discussion_r476395265 ## File path: cpp/src/arrow/compute/kernels/aggregate_test.cc ## @@ -685,5 +687,10 @@ TYPED_TEST(TestFloatingModeKernel, Floats) {

[GitHub] [arrow] pitrou closed pull request #7940: ARROW-9702: [C++] Register bpacking SIMD to runtime path.

2020-08-25 Thread GitBox
pitrou closed pull request #7940: URL: https://github.com/apache/arrow/pull/7940

[GitHub] [arrow] alamb commented on pull request #8032: ARROW-9836: [Rust][DataFusion] Improve API for usage of UDFs

2020-08-25 Thread GitBox
alamb commented on pull request #8032: URL: https://github.com/apache/arrow/pull/8032#issuecomment-680037941 For anyone else reading along, the associated document I think is https://docs.google.com/document/d/1Kzz642ScizeKXmVE1bBlbLvR663BKQaGqVIyy9cAscY/edit?usp=sharing

[GitHub] [arrow] alamb commented on a change in pull request #8032: ARROW-9836: [Rust][DataFusion] Improve API for usage of UDFs

2020-08-25 Thread GitBox
alamb commented on a change in pull request #8032: URL: https://github.com/apache/arrow/pull/8032#discussion_r476466084 ## File path: rust/datafusion/src/execution/context.rs ## @@ -193,6 +191,17 @@ impl ExecutionContext { state.scalar_functions.insert(f.name.clone(),

[GitHub] [arrow] pitrou commented on pull request #8049: ARROW-9851: [C++] Disable AVX512 runtime paths with Valgrind

2020-08-25 Thread GitBox
pitrou commented on pull request #8049: URL: https://github.com/apache/arrow/pull/8049#issuecomment-680044371 @ursabot crossbow submit test-conda-cpp-valgrind

[GitHub] [arrow] andygrove commented on a change in pull request #8034: ARROW-9464: [Rust] [DataFusion] Physical plan optimization rule to insert MergeExec when needed

2020-08-25 Thread GitBox
andygrove commented on a change in pull request #8034: URL: https://github.com/apache/arrow/pull/8034#discussion_r476484428 ## File path: rust/datafusion/src/execution/physical_plan/planner.rs ## @@ -153,33 +202,28 @@ impl PhysicalPlanner for DefaultPhysicalPlanner {

[GitHub] [arrow] nealrichardson commented on pull request #7997: ARROW-9266: [Python][Packaging] enable C++ S3FS in macOS wheels

2020-08-25 Thread GitBox
nealrichardson commented on pull request #7997: URL: https://github.com/apache/arrow/pull/7997#issuecomment-680090405 If you're going to try to get the bundled `build_awssdk` macro to work, a couple of notes: 1. The features enabled in the bundled ep

[GitHub] [arrow] pitrou opened a new pull request #8048: ARROW-9813: [C++] Disable semantic interposition

2020-08-25 Thread GitBox
pitrou opened a new pull request #8048: URL: https://github.com/apache/arrow/pull/8048 By default, gcc enables "semantic interposition" which allows overriding a symbol using LD_PRELOAD tricks (for example). Disabling it allows faster calling conventions when calling global functions

[GitHub] [arrow] wesm commented on pull request #7997: ARROW-9266: [Python][Packaging] enable C++ S3FS in macOS wheels

2020-08-25 Thread GitBox
wesm commented on pull request #7997: URL: https://github.com/apache/arrow/pull/7997#issuecomment-680040081 We experimented a bit with conda-press a while back but it yielded poor results for us (wheels that were _much_ larger than our current ones). I expect we are going to be fighting

[GitHub] [arrow] wesm commented on pull request #7997: ARROW-9266: [Python][Packaging] enable C++ S3FS in macOS wheels

2020-08-25 Thread GitBox
wesm commented on pull request #7997: URL: https://github.com/apache/arrow/pull/7997#issuecomment-680040870 @github-actions crossbow submit wheel-osx-*

[GitHub] [arrow] pitrou commented on a change in pull request #7507: ARROW-8797: [C++] Read RecordBatch in a different endian

2020-08-25 Thread GitBox
pitrou commented on a change in pull request #7507: URL: https://github.com/apache/arrow/pull/7507#discussion_r476484988 ## File path: cpp/src/arrow/array/util.cc ## @@ -74,6 +74,177 @@ class ArrayDataWrapper { std::shared_ptr* out_; }; +class ArrayDataEndianSwapper { +

[GitHub] [arrow] github-actions[bot] commented on pull request #7997: ARROW-9266: [Python][Packaging] enable C++ S3FS in macOS wheels

2020-08-25 Thread GitBox
github-actions[bot] commented on pull request #7997: URL: https://github.com/apache/arrow/pull/7997#issuecomment-680053835 Revision: e7948bed71cc81ee2f25f8a01ebf3c383e1fac9c Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] pitrou commented on pull request #8049: ARROW-9851: [C++] Disable AVX512 runtime paths with Valgrind

2020-08-25 Thread GitBox
pitrou commented on pull request #8049: URL: https://github.com/apache/arrow/pull/8049#issuecomment-680091518 @jianxind What is your opinion on the approach here?

[GitHub] [arrow] pitrou commented on pull request #8015: PARQUET-1899: [C++] Deprecated ReadValuesSpaced

2020-08-25 Thread GitBox
pitrou commented on pull request #8015: URL: https://github.com/apache/arrow/pull/8015#issuecomment-679985704 Here is an example: https://www.fluentcpp.com/2019/08/30/how-to-disable-a-warning-in-cpp/ (need to adapt it for the deprecation warning flags / numbers on gcc / clang / MSVC)

[GitHub] [arrow] kiszk commented on pull request #8045: Simplified argument types of ScalarFunctions.

2020-08-25 Thread GitBox
kiszk commented on pull request #8045: URL: https://github.com/apache/arrow/pull/8045#issuecomment-679984751 Could you please add the JIRA number and the category to the title?

[GitHub] [arrow] vivkong opened a new pull request #8047: ARROW-9844: [CI] Add Go build job on s390x

2020-08-25 Thread GitBox
vivkong opened a new pull request #8047: URL: https://github.com/apache/arrow/pull/8047 As suggested by @kou in https://github.com/apache/arrow/pull/8011, this will add a Travis CI job for Go on s390x.

[GitHub] [arrow] github-actions[bot] commented on pull request #8047: ARROW-9844: [CI] Add Go build job on s390x

2020-08-25 Thread GitBox
github-actions[bot] commented on pull request #8047: URL: https://github.com/apache/arrow/pull/8047#issuecomment-679995772 https://issues.apache.org/jira/browse/ARROW-9844

[GitHub] [arrow] alamb commented on pull request #8020: ARROW-9821: [Rust][DataFusion] Prototype design for UserDefined Logical Plan Nodes NOT FOR MERGING

2020-08-25 Thread GitBox
alamb commented on pull request #8020: URL: https://github.com/apache/arrow/pull/8020#issuecomment-680032087 Closing this PR as its goal is feedback. I plan on starting to implement the actual changes either later this week or early next week. Thanks for the feedback @andygrove

[GitHub] [arrow] wesm commented on pull request #7992: ARROW-9660: [C++] Revamp dictionary association in IPC

2020-08-25 Thread GitBox
wesm commented on pull request #7992: URL: https://github.com/apache/arrow/pull/7992#issuecomment-680038748 I'm sorry about the delay on this, I will try to complete this code review today

[GitHub] [arrow] andygrove commented on a change in pull request #8034: ARROW-9464: [Rust] [DataFusion] Physical plan optimization rule to insert MergeExec when needed

2020-08-25 Thread GitBox
andygrove commented on a change in pull request #8034: URL: https://github.com/apache/arrow/pull/8034#discussion_r476475961 ## File path: rust/datafusion/src/execution/physical_plan/planner.rs ## @@ -61,6 +61,55 @@ impl PhysicalPlanner for DefaultPhysicalPlanner { ,

[GitHub] [arrow] pitrou commented on pull request #8050: ARROW-9852: [C++] Validate dictionaries fully on IPC read

2020-08-25 Thread GitBox
pitrou commented on pull request #8050: URL: https://github.com/apache/arrow/pull/8050#issuecomment-680096981 Hmm... we must also avoid validating GPU buffers.

[GitHub] [arrow] github-actions[bot] commented on pull request #8048: ARROW-9813: [C++] Disable semantic interposition

2020-08-25 Thread GitBox
github-actions[bot] commented on pull request #8048: URL: https://github.com/apache/arrow/pull/8048#issuecomment-680003370 https://issues.apache.org/jira/browse/ARROW-9813

[GitHub] [arrow] pitrou commented on pull request #8049: ARROW-9851: [C++] Disable AVX512 runtime paths with Valgrind

2020-08-25 Thread GitBox
pitrou commented on pull request #8049: URL: https://github.com/apache/arrow/pull/8049#issuecomment-680035192 @ursabot crossbow submit test-conda-cpp-valgrind

[GitHub] [arrow] ursabot commented on pull request #8049: ARROW-9851: [C++] Disable AVX512 runtime paths with Valgrind

2020-08-25 Thread GitBox
ursabot commented on pull request #8049: URL: https://github.com/apache/arrow/pull/8049#issuecomment-680039551 [AMD64 Conda Crossbow Submit (#123805)](https://ci.ursalabs.org/#builders/98/builds/649) builder has been succeeded. Revision: 8cc8c2b10fe963c296673e489cdb9f9080b6aa8b

[GitHub] [arrow] ursabot commented on pull request #8049: ARROW-9851: [C++] Disable AVX512 runtime paths with Valgrind

2020-08-25 Thread GitBox
ursabot commented on pull request #8049: URL: https://github.com/apache/arrow/pull/8049#issuecomment-680045478 [AMD64 Conda Crossbow Submit (#123814)](https://ci.ursalabs.org/#builders/98/builds/650) builder has been succeeded. Revision: 18b0ce28b86b6444efcce5096612c1674137604f

[GitHub] [arrow] pitrou commented on pull request #7871: ARROW-9605: [C++] Speed up aggregate min/max compute kernels on integer types

2020-08-25 Thread GitBox
pitrou commented on pull request #7871: URL: https://github.com/apache/arrow/pull/7871#issuecomment-679993093 @jianxind Sorry for the delay. Could you please rebase this PR? It looks like there are some conflicts now.

[GitHub] [arrow] pitrou commented on a change in pull request #7898: ARROW-9642: [C++] Let MakeBuilder refer DictionaryType's index_type for deciding the starting bit width of the indices

2020-08-25 Thread GitBox
pitrou commented on a change in pull request #7898: URL: https://github.com/apache/arrow/pull/7898#discussion_r476407343 ## File path: cpp/src/arrow/array/array_dict_test.cc ## @@ -904,6 +904,67 @@ TEST(TestDecimalDictionaryBuilder, DoubleTableSize) {

[GitHub] [arrow] pitrou removed a comment on pull request #8049: ARROW-9851: [C++] Disable AVX512 runtime paths with Valgrind

2020-08-25 Thread GitBox
pitrou removed a comment on pull request #8049: URL: https://github.com/apache/arrow/pull/8049#issuecomment-680022671 @crossbow submit conda-cpp-valgrind

[GitHub] [arrow] pitrou commented on pull request #8049: ARROW-9851: [C++] Disable AVX512 runtime paths with Valgrind

2020-08-25 Thread GitBox
pitrou commented on pull request #8049: URL: https://github.com/apache/arrow/pull/8049#issuecomment-680022671 @crossbow submit conda-cpp-valgrind

[GitHub] [arrow] ursabot commented on pull request #8049: ARROW-9851: [C++] Disable AVX512 runtime paths with Valgrind

2020-08-25 Thread GitBox
ursabot commented on pull request #8049: URL: https://github.com/apache/arrow/pull/8049#issuecomment-680023886 [AMD64 Conda Crossbow Submit (#123787)](https://ci.ursalabs.org/#builders/98/builds/647) builder failed. Revision: c2c7526152ef73b7b54949ecd266594181085454 Crossbow:

[GitHub] [arrow] alamb commented on a change in pull request #8034: ARROW-9464: [Rust] [DataFusion] Physical plan optimization rule to insert MergeExec when needed

2020-08-25 Thread GitBox
alamb commented on a change in pull request #8034: URL: https://github.com/apache/arrow/pull/8034#discussion_r476449126 ## File path: rust/datafusion/src/execution/physical_plan/planner.rs ## @@ -61,6 +61,55 @@ impl PhysicalPlanner for DefaultPhysicalPlanner { ,

[GitHub] [arrow] github-actions[bot] commented on pull request #8049: ARROW-9851: [C++] Disable AVX512 runtime paths with Valgrind

2020-08-25 Thread GitBox
github-actions[bot] commented on pull request #8049: URL: https://github.com/apache/arrow/pull/8049#issuecomment-680030260 https://issues.apache.org/jira/browse/ARROW-9851

[GitHub] [arrow] kiszk commented on a change in pull request #7507: ARROW-8797: [C++] Read RecordBatch in a different endian

2020-08-25 Thread GitBox
kiszk commented on a change in pull request #7507: URL: https://github.com/apache/arrow/pull/7507#discussion_r476686943 ## File path: cpp/src/arrow/array/util.cc ## @@ -74,6 +74,177 @@ class ArrayDataWrapper { std::shared_ptr* out_; }; +class ArrayDataEndianSwapper { +

[GitHub] [arrow] wesm commented on a change in pull request #8050: ARROW-9852: [C++] Validate dictionaries fully on IPC read

2020-08-25 Thread GitBox
wesm commented on a change in pull request #8050: URL: https://github.com/apache/arrow/pull/8050#discussion_r476727957 ## File path: cpp/src/arrow/ipc/reader.cc ## @@ -717,8 +717,10 @@ Status ReadDictionary(const Buffer& metadata, DictionaryMemo* dictionary_memo, return

[GitHub] [arrow] jhorstmann commented on a change in pull request #8051: ARROW-9853: [RUST] Implement take kernel for dictionary arrays

2020-08-25 Thread GitBox
jhorstmann commented on a change in pull request #8051: URL: https://github.com/apache/arrow/pull/8051#discussion_r476688318 ## File path: rust/arrow/src/compute/kernels/take.rs ## @@ -657,4 +694,71 @@ mod tests { vec![None], ); } + +#[test] +

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8032: ARROW-9836: [Rust][DataFusion] Improve API for usage of UDFs

2020-08-25 Thread GitBox
jorgecarleitao commented on a change in pull request #8032: URL: https://github.com/apache/arrow/pull/8032#discussion_r476704464 ## File path: rust/datafusion/src/execution/physical_plan/functions.rs ## @@ -0,0 +1,208 @@ +// Licensed to the Apache Software Foundation (ASF)

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8032: ARROW-9836: [Rust][DataFusion] Improve API for usage of UDFs

2020-08-25 Thread GitBox
jorgecarleitao commented on a change in pull request #8032: URL: https://github.com/apache/arrow/pull/8032#discussion_r476704869 ## File path: rust/datafusion/src/execution/physical_plan/functions.rs ## @@ -0,0 +1,208 @@ +// Licensed to the Apache Software Foundation (ASF)

[GitHub] [arrow] Alex-Monahan opened a new issue #8054: Arrow Flight from Python to JavaScript

2020-08-25 Thread GitBox
Alex-Monahan opened a new issue #8054: URL: https://github.com/apache/arrow/issues/8054 I made a JIRA Wish/Issue here, but I wasn't sure if I used the correct format: https://issues.apache.org/jira/browse/ARROW-9860 Essentially, I am looking for the fastest way to send large data

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8032: ARROW-9836: [Rust][DataFusion] Improve API for usage of UDFs

2020-08-25 Thread GitBox
jorgecarleitao commented on a change in pull request #8032: URL: https://github.com/apache/arrow/pull/8032#discussion_r476703332 ## File path: rust/datafusion/src/execution/dataframe_impl.rs ## @@ -232,6 +238,23 @@ mod tests { Ok(()) } +#[test] +fn

[GitHub] [arrow] github-actions[bot] commented on pull request #8049: ARROW-9851: [C++] Disable AVX512 runtime paths with Valgrind

2020-08-25 Thread GitBox
github-actions[bot] commented on pull request #8049: URL: https://github.com/apache/arrow/pull/8049#issuecomment-680380195 Revision: 5461e01225db7eaca9484213c0169f8030a3e3ba Submitted crossbow builds: [ursa-labs/crossbow @
