[GitHub] [arrow] mrkn commented on a change in pull request #6302: ARROW-7633: [C++][CI] Create fuzz targets for tensors and sparse tensors

2021-01-04 Thread GitBox
mrkn commented on a change in pull request #6302: URL: https://github.com/apache/arrow/pull/6302#discussion_r551765015 ## File path: cpp/src/arrow/ipc/generate_tensor_fuzz_corpus.cc ## @@ -0,0 +1,132 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more

[GitHub] [arrow] mrkn commented on a change in pull request #6302: ARROW-7633: [C++][CI] Create fuzz targets for tensors and sparse tensors

2021-01-04 Thread GitBox
mrkn commented on a change in pull request #6302: URL: https://github.com/apache/arrow/pull/6302#discussion_r551764667 ## File path: cpp/src/arrow/ipc/test_common.h ## @@ -161,6 +162,11 @@ Status MakeUuid(std::shared_ptr* out); ARROW_TESTING_EXPORT Status MakeDictExtension(st

[GitHub] [arrow] nevi-me commented on pull request #9093: ARROW-11125: [Rust] Logical equality for list arrays

2021-01-04 Thread GitBox
nevi-me commented on pull request #9093: URL: https://github.com/apache/arrow/pull/9093#issuecomment-754464034 Thanks for the review @jorgecarleitao. I've addressed your queries and comments, and cleaned up the TODOs.

[GitHub] [arrow] nevi-me commented on a change in pull request #9093: ARROW-11125: [Rust] Logical equality for list arrays

2021-01-04 Thread GitBox
nevi-me commented on a change in pull request #9093: URL: https://github.com/apache/arrow/pull/9093#discussion_r551762894 ## File path: rust/arrow/src/array/equal/utils.rs ## @@ -76,3 +80,185 @@ pub(super) fn equal_len( ) -> bool { lhs_values[lhs_start..(lhs_start + len)]

[GitHub] [arrow] nevi-me commented on a change in pull request #9093: ARROW-11125: [Rust] Logical equality for list arrays

2021-01-04 Thread GitBox
nevi-me commented on a change in pull request #9093: URL: https://github.com/apache/arrow/pull/9093#discussion_r551762193 ## File path: rust/arrow/src/array/equal/structure.rs ## @@ -37,39 +37,20 @@ fn equal_values( rhs_start: usize, len: usize, ) -> bool { -let

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #9093: ARROW-11125: [Rust] Logical equality for list arrays

2021-01-04 Thread GitBox
jorgecarleitao commented on a change in pull request #9093: URL: https://github.com/apache/arrow/pull/9093#discussion_r551746160 ## File path: rust/arrow/src/array/equal/mod.rs ## @@ -146,118 +146,103 @@ fn equal_values( rhs_start: usize, len: usize, ) -> bool { -

[GitHub] [arrow] nevi-me commented on a change in pull request #9093: ARROW-11125: [Rust] Logical equality for list arrays

2021-01-04 Thread GitBox
nevi-me commented on a change in pull request #9093: URL: https://github.com/apache/arrow/pull/9093#discussion_r551745156 ## File path: rust/arrow/src/array/equal/mod.rs ## @@ -146,118 +146,103 @@ fn equal_values( rhs_start: usize, len: usize, ) -> bool { -// com

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #9093: ARROW-11125: [Rust] Logical equality for list arrays

2021-01-04 Thread GitBox
jorgecarleitao commented on a change in pull request #9093: URL: https://github.com/apache/arrow/pull/9093#discussion_r551739520 ## File path: rust/arrow/src/array/data.rs ## @@ -136,6 +137,84 @@ impl ArrayData { &self.null_bitmap } +/// Computes the logical

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #9093: ARROW-11125: [Rust] Logical equality for list arrays

2021-01-04 Thread GitBox
jorgecarleitao commented on a change in pull request #9093: URL: https://github.com/apache/arrow/pull/9093#discussion_r551738974 ## File path: rust/arrow/src/array/equal/structure.rs ## @@ -55,21 +56,24 @@ fn equal_values( temp_lhs.as_ref()

[GitHub] [arrow] mqy commented on pull request #9025: ARROW-10259: [Rust] Add custom metadata to Field

2021-01-04 Thread GitBox
mqy commented on pull request #9025: URL: https://github.com/apache/arrow/pull/9025#issuecomment-754421764 > Hey @mqy, I'm back in the city, and have access to my desktop; so I'll be able to review this PR and help you enable integration tests during the week. Welcome back!

[GitHub] [arrow] jorgecarleitao commented on pull request #9099: ARROW-11129: [Rust][DataFusion] Use tokio for writing parquet

2021-01-04 Thread GitBox
jorgecarleitao commented on pull request #9099: URL: https://github.com/apache/arrow/pull/9099#issuecomment-754386570 Two CI jobs are hanging indefinitely, I canceled them. This is an automated message from the Apache Git Ser

[GitHub] [arrow] arw2019 commented on pull request #8955: ARROW-9948: [C++] in Decimal128::FromString raise when scale is out of bounds

2021-01-04 Thread GitBox
arw2019 commented on pull request #8955: URL: https://github.com/apache/arrow/pull/8955#issuecomment-754383958 Thanks @pitrou @kiszk for reviewing! Planning to push an update in the next day or so.

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #9086: [Rust] [DataFusion] [Experiment] Blocking threads filter

2021-01-04 Thread GitBox
jorgecarleitao commented on a change in pull request #9086: URL: https://github.com/apache/arrow/pull/9086#discussion_r551702345 ## File path: rust/datafusion/src/physical_plan/filter.rs ## @@ -103,25 +103,23 @@ impl ExecutionPlan for FilterExec { } async fn execute

[GitHub] [arrow] zhztheplayer commented on a change in pull request #7030: ARROW-7808: [Java][Dataset] Implement Dataset Java API by JNI to C++

2021-01-04 Thread GitBox
zhztheplayer commented on a change in pull request #7030: URL: https://github.com/apache/arrow/pull/7030#discussion_r551694880 ## File path: cpp/src/arrow/memory_pool.h ## @@ -149,6 +149,43 @@ class ARROW_EXPORT ProxyMemoryPool : public MemoryPool { std::unique_ptr impl_; }

[GitHub] [arrow] zhztheplayer commented on a change in pull request #7030: ARROW-7808: [Java][Dataset] Implement Dataset Java API by JNI to C++

2021-01-04 Thread GitBox
zhztheplayer commented on a change in pull request #7030: URL: https://github.com/apache/arrow/pull/7030#discussion_r551694096 ## File path: cpp/src/arrow/memory_pool.cc ## @@ -534,4 +535,139 @@ int64_t ProxyMemoryPool::max_memory() const { return impl_->max_memory(); } std

[GitHub] [arrow] zhztheplayer commented on a change in pull request #7030: ARROW-7808: [Java][Dataset] Implement Dataset Java API by JNI to C++

2021-01-04 Thread GitBox
zhztheplayer commented on a change in pull request #7030: URL: https://github.com/apache/arrow/pull/7030#discussion_r551693971 ## File path: cpp/src/arrow/memory_pool.cc ## @@ -534,4 +535,139 @@ int64_t ProxyMemoryPool::max_memory() const { return impl_->max_memory(); } std

[GitHub] [arrow] zhztheplayer commented on a change in pull request #7030: ARROW-7808: [Java][Dataset] Implement Dataset Java API by JNI to C++

2021-01-04 Thread GitBox
zhztheplayer commented on a change in pull request #7030: URL: https://github.com/apache/arrow/pull/7030#discussion_r551693725 ## File path: cpp/src/arrow/memory_pool.h ## @@ -149,6 +149,43 @@ class ARROW_EXPORT ProxyMemoryPool : public MemoryPool { std::unique_ptr impl_; }

[GitHub] [arrow] kou closed pull request #9098: ARROW-11127: [C++] ifdef unused cpu_info on non-x86 platforms

2021-01-04 Thread GitBox
kou closed pull request #9098: URL: https://github.com/apache/arrow/pull/9098

[GitHub] [arrow] sunchao commented on a change in pull request #9047: ARROW-11072: [Rust] [Parquet] Support reading decimal from physical int types

2021-01-04 Thread GitBox
sunchao commented on a change in pull request #9047: URL: https://github.com/apache/arrow/pull/9047#discussion_r551681563 ## File path: rust/parquet/src/arrow/schema.rs ## @@ -591,6 +591,7 @@ impl ParquetTypeConverter<'_> { LogicalType::INT_32 => Ok(DataType::Int32

[GitHub] [arrow] sunchao commented on a change in pull request #9064: ARROW-11074: [Rust][DataFusion] Implement predicate push-down for parquet tables

2021-01-04 Thread GitBox
sunchao commented on a change in pull request #9064: URL: https://github.com/apache/arrow/pull/9064#discussion_r551680435 ## File path: rust/parquet/src/file/serialized_reader.rs ## @@ -137,6 +137,22 @@ impl SerializedFileReader { metadata, }) } + +

[GitHub] [arrow] terencehonles commented on pull request #8915: ARROW-10904: [Python] Add support for Python 3.9 macOS wheels

2021-01-04 Thread GitBox
terencehonles commented on pull request #8915: URL: https://github.com/apache/arrow/pull/8915#issuecomment-754339737 @kszucs I noticed there was a list of allowed actions. Could the allow list just be updated or is that enforced at the apache (org) level rather than the repo level?

[GitHub] [arrow] kszucs commented on pull request #8915: ARROW-10904: [Python] Add support for Python 3.9 macOS wheels

2021-01-04 Thread GitBox
kszucs commented on pull request #8915: URL: https://github.com/apache/arrow/pull/8915#issuecomment-754300248 Yes, the comment bot seems to be failing https://github.com/apache/arrow/actions/runs/462228610 This is most likely caused by the recent security changes, so we need to update t

[GitHub] [arrow] nealrichardson commented on pull request #9097: ARROW-10881: [C++] Fix EXC_BAD_ACCESS in PutSpaced

2021-01-04 Thread GitBox
nealrichardson commented on pull request #9097: URL: https://github.com/apache/arrow/pull/9097#issuecomment-754297915 @xhochy autotune isn't working at the moment--INFRA has blocked some of the Actions we use, including on that workflow. See https://issues.apache.org/jira/browse/INFRA-2123

[GitHub] [arrow] nealrichardson closed pull request #9092: ARROW-10624: [R] Proactively remove "problems" attributes

2021-01-04 Thread GitBox
nealrichardson closed pull request #9092: URL: https://github.com/apache/arrow/pull/9092

[GitHub] [arrow] yordan-pavlov commented on a change in pull request #9064: ARROW-11074: [Rust][DataFusion] Implement predicate push-down for parquet tables

2021-01-04 Thread GitBox
yordan-pavlov commented on a change in pull request #9064: URL: https://github.com/apache/arrow/pull/9064#discussion_r551621749 ## File path: rust/parquet/src/file/serialized_reader.rs ## @@ -137,6 +137,22 @@ impl SerializedFileReader { metadata, }) }

[GitHub] [arrow] emkornfield commented on a change in pull request #9024: ARROW-11044: [C++] Add "replace" kernel

2021-01-04 Thread GitBox
emkornfield commented on a change in pull request #9024: URL: https://github.com/apache/arrow/pull/9024#discussion_r551619381 ## File path: cpp/src/arrow/compute/kernels/scalar_replace.cc ## @@ -0,0 +1,309 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or

[GitHub] [arrow] terencehonles commented on pull request #8915: ARROW-10904: [Python] Add support for Python 3.9 macOS wheels

2021-01-04 Thread GitBox
terencehonles commented on pull request #8915: URL: https://github.com/apache/arrow/pull/8915#issuecomment-754272662 @kou / @kszucs you may already know but it looks like the crossbow submitter / github actions bot isn't working(?). It doesn't look like multibuild has tasks for GitHub Acti

[GitHub] [arrow] emkornfield commented on pull request #7030: ARROW-7808: [Java][Dataset] Implement Dataset Java API by JNI to C++

2021-01-04 Thread GitBox
emkornfield commented on pull request #7030: URL: https://github.com/apache/arrow/pull/7030#issuecomment-754270017 @zhztheplayer I need to take a closer look at the JNI changes since this was last approved, and will take another look at the memory stuff once you've added some docs.

[GitHub] [arrow] alamb closed pull request #9094: ARROW-11126: [Rust] Document and test ARROW-10656

2021-01-04 Thread GitBox
alamb closed pull request #9094: URL: https://github.com/apache/arrow/pull/9094

[GitHub] [arrow] github-actions[bot] commented on pull request #9099: ARROW-11129: [Rust][DataFusion] Use tokio for loading parquet [WIP]

2021-01-04 Thread GitBox
github-actions[bot] commented on pull request #9099: URL: https://github.com/apache/arrow/pull/9099#issuecomment-754266445 https://issues.apache.org/jira/browse/ARROW-11129

[GitHub] [arrow] Dandandan opened a new pull request #9099: ARROW-11129: [Rust][DataFusion] Use tokio for loading parquet [WIP]

2021-01-04 Thread GitBox
Dandandan opened a new pull request #9099: URL: https://github.com/apache/arrow/pull/9099 Inspired by PR https://github.com/apache/arrow/pull/9086 from @jorgecarleitao. It seems better to use `task::spawn_blocking`, so that tokio sets an upper limit on the number of threads (512 by default).

[GitHub] [arrow] emkornfield commented on a change in pull request #7030: ARROW-7808: [Java][Dataset] Implement Dataset Java API by JNI to C++

2021-01-04 Thread GitBox
emkornfield commented on a change in pull request #7030: URL: https://github.com/apache/arrow/pull/7030#discussion_r551610971 ## File path: cpp/src/arrow/memory_pool.h ## @@ -149,6 +149,43 @@ class ARROW_EXPORT ProxyMemoryPool : public MemoryPool { std::unique_ptr impl_; };

[GitHub] [arrow] emkornfield commented on a change in pull request #7030: ARROW-7808: [Java][Dataset] Implement Dataset Java API by JNI to C++

2021-01-04 Thread GitBox
emkornfield commented on a change in pull request #7030: URL: https://github.com/apache/arrow/pull/7030#discussion_r551610553 ## File path: cpp/src/arrow/memory_pool.cc ## @@ -534,4 +535,139 @@ int64_t ProxyMemoryPool::max_memory() const { return impl_->max_memory(); } std:

[GitHub] [arrow] emkornfield commented on a change in pull request #7030: ARROW-7808: [Java][Dataset] Implement Dataset Java API by JNI to C++

2021-01-04 Thread GitBox
emkornfield commented on a change in pull request #7030: URL: https://github.com/apache/arrow/pull/7030#discussion_r551610223 ## File path: cpp/src/arrow/memory_pool.cc ## @@ -534,4 +535,139 @@ int64_t ProxyMemoryPool::max_memory() const { return impl_->max_memory(); } std:

[GitHub] [arrow] emkornfield commented on a change in pull request #7030: ARROW-7808: [Java][Dataset] Implement Dataset Java API by JNI to C++

2021-01-04 Thread GitBox
emkornfield commented on a change in pull request #7030: URL: https://github.com/apache/arrow/pull/7030#discussion_r551610076 ## File path: cpp/src/arrow/memory_pool.cc ## @@ -534,4 +535,139 @@ int64_t ProxyMemoryPool::max_memory() const { return impl_->max_memory(); } std:

[GitHub] [arrow] emkornfield commented on a change in pull request #7030: ARROW-7808: [Java][Dataset] Implement Dataset Java API by JNI to C++

2021-01-04 Thread GitBox
emkornfield commented on a change in pull request #7030: URL: https://github.com/apache/arrow/pull/7030#discussion_r551609465 ## File path: cpp/src/arrow/memory_pool.h ## @@ -149,6 +149,43 @@ class ARROW_EXPORT ProxyMemoryPool : public MemoryPool { std::unique_ptr impl_; };

[GitHub] [arrow] emkornfield commented on a change in pull request #7030: ARROW-7808: [Java][Dataset] Implement Dataset Java API by JNI to C++

2021-01-04 Thread GitBox
emkornfield commented on a change in pull request #7030: URL: https://github.com/apache/arrow/pull/7030#discussion_r551609281 ## File path: cpp/src/arrow/memory_pool.h ## @@ -149,6 +149,43 @@ class ARROW_EXPORT ProxyMemoryPool : public MemoryPool { std::unique_ptr impl_; };

[GitHub] [arrow] emkornfield commented on a change in pull request #7030: ARROW-7808: [Java][Dataset] Implement Dataset Java API by JNI to C++

2021-01-04 Thread GitBox
emkornfield commented on a change in pull request #7030: URL: https://github.com/apache/arrow/pull/7030#discussion_r551609196 ## File path: cpp/src/arrow/memory_pool.h ## @@ -149,6 +149,43 @@ class ARROW_EXPORT ProxyMemoryPool : public MemoryPool { std::unique_ptr impl_; };

[GitHub] [arrow] bkietz commented on a change in pull request #8894: ARROW-10322: [C++][Dataset] Minimize Expression

2021-01-04 Thread GitBox
bkietz commented on a change in pull request #8894: URL: https://github.com/apache/arrow/pull/8894#discussion_r551608598 ## File path: cpp/src/arrow/dataset/expression_internal.h ## @@ -0,0 +1,465 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more con

[GitHub] [arrow] emkornfield closed pull request #8597: ARROW-10492: [Java][JDBC] Allow users to config the mapping between SQL types and Arrow types

2021-01-04 Thread GitBox
emkornfield closed pull request #8597: URL: https://github.com/apache/arrow/pull/8597

[GitHub] [arrow] emkornfield commented on issue #8987: Are the Date Logical Type and Date Converted Type implemented?

2021-01-04 Thread GitBox
emkornfield commented on issue #8987: URL: https://github.com/apache/arrow/issues/8987#issuecomment-754262638 I believe this was addressed on the mailing list.

[GitHub] [arrow] emkornfield closed issue #8987: Are the Date Logical Type and Date Converted Type implemented?

2021-01-04 Thread GitBox
emkornfield closed issue #8987: URL: https://github.com/apache/arrow/issues/8987

[GitHub] [arrow] emkornfield commented on pull request #8597: ARROW-10492: [Java][JDBC] Allow users to config the mapping between SQL types and Arrow types

2021-01-04 Thread GitBox
emkornfield commented on pull request #8597: URL: https://github.com/apache/arrow/pull/8597#issuecomment-754261654 +1 thanks.

[GitHub] [arrow] emkornfield commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-01-04 Thread GitBox
emkornfield commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-754259372 Is it possible to add a test to confirm that this can be read/written from the C++ implementation?

[GitHub] [arrow] emkornfield commented on a change in pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-01-04 Thread GitBox
emkornfield commented on a change in pull request #8949: URL: https://github.com/apache/arrow/pull/8949#discussion_r551603469 ## File path: java/vector/src/main/java/org/apache/arrow/vector/compression/Lz4CompressionCodec.java ## @@ -0,0 +1,119 @@ +/* + * Licensed to the Apach

[GitHub] [arrow] emkornfield commented on a change in pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-01-04 Thread GitBox
emkornfield commented on a change in pull request #8949: URL: https://github.com/apache/arrow/pull/8949#discussion_r551603359 ## File path: java/vector/src/main/java/org/apache/arrow/vector/compression/Lz4CompressionCodec.java ## @@ -0,0 +1,119 @@ +/* + * Licensed to the Apach

[GitHub] [arrow] emkornfield commented on a change in pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-01-04 Thread GitBox
emkornfield commented on a change in pull request #8949: URL: https://github.com/apache/arrow/pull/8949#discussion_r551603225 ## File path: java/vector/src/main/java/org/apache/arrow/vector/compression/Lz4CompressionCodec.java ## @@ -0,0 +1,119 @@ +/* + * Licensed to the Apach

[GitHub] [arrow] emkornfield commented on a change in pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-01-04 Thread GitBox
emkornfield commented on a change in pull request #8949: URL: https://github.com/apache/arrow/pull/8949#discussion_r551603003 ## File path: java/vector/src/main/java/org/apache/arrow/vector/compression/Lz4CompressionCodec.java ## @@ -0,0 +1,119 @@ +/* + * Licensed to the Apach

[GitHub] [arrow] yordan-pavlov commented on a change in pull request #9064: ARROW-11074: [Rust][DataFusion] Implement predicate push-down for parquet tables

2021-01-04 Thread GitBox
yordan-pavlov commented on a change in pull request #9064: URL: https://github.com/apache/arrow/pull/9064#discussion_r551597243 ## File path: rust/datafusion/src/physical_plan/parquet.rs ## @@ -209,6 +251,477 @@ impl ParquetExec { } } +#[derive(Debug, Clone)] +/// Predi

[GitHub] [arrow] github-actions[bot] commented on pull request #9098: ARROW-11127: [C++] ifdef unused cpu_info on non-x86 platforms

2021-01-04 Thread GitBox
github-actions[bot] commented on pull request #9098: URL: https://github.com/apache/arrow/pull/9098#issuecomment-754256502 https://issues.apache.org/jira/browse/ARROW-11127

[GitHub] [arrow] xhochy commented on pull request #9097: ARROW-10881: [C++] Fix EXC_BAD_ACCESS in PutSpaced

2021-01-04 Thread GitBox
xhochy commented on pull request #9097: URL: https://github.com/apache/arrow/pull/9097#issuecomment-754256374 @github-actions autotune

[GitHub] [arrow] emkornfield commented on pull request #9053: ARROW-11081: [Java] Make IPC option immutable

2021-01-04 Thread GitBox
emkornfield commented on pull request #9053: URL: https://github.com/apache/arrow/pull/9053#issuecomment-754256316 @liyafan82 does this actually make a difference in benchmarks? I agree it is easier to reason about, but is there any way to avoid backward incompatibility?

[GitHub] [arrow] xhochy opened a new pull request #9098: ARROW-11127: [C++] ifdef unused cpu_info on non-x86 platforms

2021-01-04 Thread GitBox
xhochy opened a new pull request #9098: URL: https://github.com/apache/arrow/pull/9098 Otherwise we get an error in debug builds because of `-Werror,Wunused`

[GitHub] [arrow] emkornfield commented on a change in pull request #9053: ARROW-11081: [Java] Make IPC option immutable

2021-01-04 Thread GitBox
emkornfield commented on a change in pull request #9053: URL: https://github.com/apache/arrow/pull/9053#discussion_r551601013 ## File path: java/flight/flight-core/src/test/java/org/apache/arrow/flight/TestMetadataVersion.java ## @@ -56,9 +56,8 @@ public static void setUpClass

[GitHub] [arrow] emkornfield commented on a change in pull request #9053: ARROW-11081: [Java] Make IPC option immutable

2021-01-04 Thread GitBox
emkornfield commented on a change in pull request #9053: URL: https://github.com/apache/arrow/pull/9053#discussion_r551600824 ## File path: java/flight/flight-core/src/main/java/org/apache/arrow/flight/ArrowMessage.java ## @@ -194,10 +194,9 @@ public ArrowMessage(FlightDescrip

[GitHub] [arrow] emkornfield commented on pull request #9097: ARROW-10881: [C++] Fix EXC_BAD_ACCESS in PutSpaced

2021-01-04 Thread GitBox
emkornfield commented on pull request #9097: URL: https://github.com/apache/arrow/pull/9097#issuecomment-754254123 Is it possible to add a unit test?

[GitHub] [arrow] kou commented on pull request #9096: [Python][Packaging] Refactor manylinux and windows wheel building [WIP]

2021-01-04 Thread GitBox
kou commented on pull request #9096: URL: https://github.com/apache/arrow/pull/9096#issuecomment-754252075 Related works: * https://github.com/apache/arrow/pull/8386#issuecomment-728321361 * #8881

[GitHub] [arrow] nealrichardson commented on pull request #8549: ARROW-10386 [R]: List column class attributes not preserved in roundtrip

2021-01-04 Thread GitBox
nealrichardson commented on pull request #8549: URL: https://github.com/apache/arrow/pull/8549#issuecomment-754251796 > This increases again the weight of the metadata, which now has to include attributes for each element of a list column How much does it blow up the metadata? Is thi

[GitHub] [arrow] jonkeane commented on pull request #9092: ARROW-10624: [R] Proactively remove "problems" attributes

2021-01-04 Thread GitBox
jonkeane commented on pull request #9092: URL: https://github.com/apache/arrow/pull/9092#issuecomment-754243679 Updated, passing actions on my fork: https://github.com/jonkeane/arrow/actions/runs/461948379 https://github.com/jonkeane/arrow/actions/runs/461948380

[GitHub] [arrow] yordan-pavlov commented on a change in pull request #9064: ARROW-11074: [Rust][DataFusion] Implement predicate push-down for parquet tables

2021-01-04 Thread GitBox
yordan-pavlov commented on a change in pull request #9064: URL: https://github.com/apache/arrow/pull/9064#discussion_r551588681 ## File path: rust/datafusion/src/physical_plan/parquet.rs ## @@ -209,6 +251,477 @@ impl ParquetExec { } } +#[derive(Debug, Clone)] +/// Predi

[GitHub] [arrow] yordan-pavlov commented on a change in pull request #9064: ARROW-11074: [Rust][DataFusion] Implement predicate push-down for parquet tables

2021-01-04 Thread GitBox
yordan-pavlov commented on a change in pull request #9064: URL: https://github.com/apache/arrow/pull/9064#discussion_r551587504 ## File path: rust/datafusion/src/physical_plan/parquet.rs ## @@ -209,6 +251,477 @@ impl ParquetExec { } } +#[derive(Debug, Clone)] +/// Predi

[GitHub] [arrow] nevi-me commented on a change in pull request #9089: ARROW-11122: [Rust] Added FFI support for date and time.

2021-01-04 Thread GitBox
nevi-me commented on a change in pull request #9089: URL: https://github.com/apache/arrow/pull/9089#discussion_r551586245 ## File path: rust/arrow-pyarrow-integration-testing/src/lib.rs ## @@ -153,10 +153,25 @@ fn substring(array: PyObject, start: i64, py: Python) -> PyResult

[GitHub] [arrow] nevi-me closed pull request #9065: ARROW-11096: [Rust] C data interface for [Large]binary

2021-01-04 Thread GitBox
nevi-me closed pull request #9065: URL: https://github.com/apache/arrow/pull/9065

[GitHub] [arrow] yordan-pavlov commented on a change in pull request #9064: ARROW-11074: [Rust][DataFusion] Implement predicate push-down for parquet tables

2021-01-04 Thread GitBox
yordan-pavlov commented on a change in pull request #9064: URL: https://github.com/apache/arrow/pull/9064#discussion_r551583777 ## File path: rust/parquet/src/file/serialized_reader.rs ## @@ -137,6 +137,22 @@ impl SerializedFileReader { metadata, }) }

[GitHub] [arrow] jonkeane commented on a change in pull request #8549: ARROW-10386 [R]: List column class attributes not preserved in roundtrip

2021-01-04 Thread GitBox
jonkeane commented on a change in pull request #8549: URL: https://github.com/apache/arrow/pull/8549#discussion_r551552588 ## File path: r/R/record-batch.R ## @@ -286,6 +286,20 @@ as.data.frame.RecordBatch <- function(x, row.names = NULL, optional = FALSE, ... apply_arrow_r

[GitHub] [arrow] bkietz commented on a change in pull request #8894: ARROW-10322: [C++][Dataset] Minimize Expression

2021-01-04 Thread GitBox
bkietz commented on a change in pull request #8894: URL: https://github.com/apache/arrow/pull/8894#discussion_r551578256 ## File path: cpp/src/arrow/compute/cast.cc ## @@ -118,8 +118,86 @@ class CastMetaFunction : public MetaFunction { } // namespace +const FunctionDoc st

[GitHub] [arrow] codecov-io edited a comment on pull request #9038: ARROW-10356: [Rust][DataFusion] Add support for is_in (WIP)

2021-01-04 Thread GitBox
codecov-io edited a comment on pull request #9038: URL: https://github.com/apache/arrow/pull/9038#issuecomment-753241531 # [Codecov](https://codecov.io/gh/apache/arrow/pull/9038?src=pr&el=h1) Report > Merging [#9038](https://codecov.io/gh/apache/arrow/pull/9038?src=pr&el=desc) (7a44a99)

[GitHub] [arrow] sweb commented on a change in pull request #9047: ARROW-11072: [Rust] [Parquet] Support reading decimal from physical int types

2021-01-04 Thread GitBox
sweb commented on a change in pull request #9047: URL: https://github.com/apache/arrow/pull/9047#discussion_r551575727 ## File path: rust/parquet/src/arrow/schema.rs ## @@ -591,6 +591,7 @@ impl ParquetTypeConverter<'_> { LogicalType::INT_32 => Ok(DataType::Int32),

[GitHub] [arrow] bkietz commented on a change in pull request #8894: ARROW-10322: [C++][Dataset] Minimize Expression

2021-01-04 Thread GitBox
bkietz commented on a change in pull request #8894: URL: https://github.com/apache/arrow/pull/8894#discussion_r551574884 ## File path: cpp/src/arrow/dataset/expression_internal.h ## @@ -0,0 +1,465 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more con

[GitHub] [arrow] yordan-pavlov commented on a change in pull request #9064: ARROW-11074: [Rust][DataFusion] Implement predicate push-down for parquet tables

2021-01-04 Thread GitBox
yordan-pavlov commented on a change in pull request #9064: URL: https://github.com/apache/arrow/pull/9064#discussion_r551574592 ## File path: rust/datafusion/src/physical_plan/parquet.rs ## @@ -209,6 +251,479 @@ impl ParquetExec { } } +#[derive(Debug, Clone)] +/// Predi

[GitHub] [arrow] seddonm1 commented on pull request #9038: ARROW-10356: [Rust][DataFusion] Add support for is_in (WIP)

2021-01-04 Thread GitBox
seddonm1 commented on pull request #9038: URL: https://github.com/apache/arrow/pull/9038#issuecomment-754223732 Ok, I have done a major refactor against a rebased master. I believe this now meets the ANSI behavior with regard to `NULL` handling but it does not yet support syntax wher

[GitHub] [arrow] github-actions[bot] commented on pull request #9097: ARROW-10881: [C++] Fix EXC_BAD_ACCESS in PutSpaced

2021-01-04 Thread GitBox
github-actions[bot] commented on pull request #9097: URL: https://github.com/apache/arrow/pull/9097#issuecomment-754222849 https://issues.apache.org/jira/browse/ARROW-10881

[GitHub] [arrow] xhochy opened a new pull request #9097: ARROW-10881: [C++] Fix EXC_BAD_ACCESS in PutSpaced

2021-01-04 Thread GitBox
xhochy opened a new pull request #9097: URL: https://github.com/apache/arrow/pull/9097

[GitHub] [arrow] terencehonles commented on pull request #8915: ARROW-10904: [Python] Add support for Python 3.9 macOS wheels

2021-01-04 Thread GitBox
terencehonles commented on pull request #8915: URL: https://github.com/apache/arrow/pull/8915#issuecomment-754216137 @github-actions crossbow submit wheel-osx-*-cp39

[GitHub] [arrow] johncassil commented on pull request #8365: ARROW-6582: [R] Arrow to R fails with embedded nuls in strings

2021-01-04 Thread GitBox
johncassil commented on pull request #8365: URL: https://github.com/apache/arrow/pull/8365#issuecomment-754215072 @jimhester, @nealrichardson, @bkietz @dianaclarke @romainfrancois Just wanted to say thanks for working on this. I reported it a long time ago and have just been period

[GitHub] [arrow] alamb closed pull request #9047: ARROW-11072: [Rust] [Parquet] Support reading decimal from physical int types

2021-01-04 Thread GitBox
alamb closed pull request #9047: URL: https://github.com/apache/arrow/pull/9047

[GitHub] [arrow] nevi-me commented on pull request #8927: ARROW-10766: [Rust] [Parquet] Nested List IO

2021-01-04 Thread GitBox
nevi-me commented on pull request #8927: URL: https://github.com/apache/arrow/pull/8927#issuecomment-754211680 I got stalled with #9093. I think it's the last blocker before I can complete this :(

[GitHub] [arrow] jonkeane commented on a change in pull request #9092: ARROW-10624: [R] Proactively remove "problems" attributes

2021-01-04 Thread GitBox
jonkeane commented on a change in pull request #9092: URL: https://github.com/apache/arrow/pull/9092#discussion_r551560380 ## File path: r/R/record-batch.R ## @@ -274,6 +274,12 @@ as.data.frame.RecordBatch <- function(x, row.names = NULL, optional = FALSE, ... } .serialize

[GitHub] [arrow] nealrichardson closed pull request #8365: ARROW-6582: [R] Arrow to R fails with embedded nuls in strings

2021-01-04 Thread GitBox
nealrichardson closed pull request #8365: URL: https://github.com/apache/arrow/pull/8365

[GitHub] [arrow] sunchao commented on a change in pull request #9064: ARROW-11074: [Rust][DataFusion] Implement predicate push-down for parquet tables

2021-01-04 Thread GitBox
sunchao commented on a change in pull request #9064: URL: https://github.com/apache/arrow/pull/9064#discussion_r551558307 ## File path: rust/datafusion/src/datasource/parquet.rs ## @@ -62,17 +64,37 @@ impl TableProvider for ParquetTable { self.schema.clone() } +

[GitHub] [arrow] nevi-me commented on pull request #9090: ARROW-11123: [Rust] Use cast kernel to simplify csv parser

2021-01-04 Thread GitBox
nevi-me commented on pull request #9090: URL: https://github.com/apache/arrow/pull/9090#issuecomment-754208396 > * Implement cast that returns an error on parsing errors instead of null See https://issues.apache.org/jira/browse/ARROW-7364

[GitHub] [arrow] nealrichardson commented on a change in pull request #9092: ARROW-10624: [R] Proactively remove "problems" attributes

2021-01-04 Thread GitBox
nealrichardson commented on a change in pull request #9092: URL: https://github.com/apache/arrow/pull/9092#discussion_r551556251 ## File path: r/R/record-batch.R ## @@ -274,6 +274,12 @@ as.data.frame.RecordBatch <- function(x, row.names = NULL, optional = FALSE, ... } .ser

[GitHub] [arrow] nealrichardson commented on a change in pull request #8365: ARROW-6582: [R] Arrow to R fails with embedded nuls in strings

2021-01-04 Thread GitBox
nealrichardson commented on a change in pull request #8365: URL: https://github.com/apache/arrow/pull/8365#discussion_r551551461 ## File path: r/src/array_to_vector.cpp ## @@ -288,36 +290,104 @@ struct Converter_String : public Converter { } StringArrayType* string_

[GitHub] [arrow] jonkeane commented on a change in pull request #9092: ARROW-10624: [R] Proactively remove "problems" attributes

2021-01-04 Thread GitBox
jonkeane commented on a change in pull request #9092: URL: https://github.com/apache/arrow/pull/9092#discussion_r551549190 ## File path: r/R/record-batch.R ## @@ -274,6 +274,12 @@ as.data.frame.RecordBatch <- function(x, row.names = NULL, optional = FALSE, ... } .serialize

[GitHub] [arrow] bkietz commented on a change in pull request #8365: ARROW-6582: [R] Arrow to R fails with embedded nuls in strings

2021-01-04 Thread GitBox
bkietz commented on a change in pull request #8365: URL: https://github.com/apache/arrow/pull/8365#discussion_r551542391 ## File path: r/src/array_to_vector.cpp ## @@ -288,36 +290,104 @@ struct Converter_String : public Converter { } StringArrayType* string_array =

[GitHub] [arrow] bkietz commented on a change in pull request #8365: ARROW-6582: [R] Arrow to R fails with embedded nuls in strings

2021-01-04 Thread GitBox
bkietz commented on a change in pull request #8365: URL: https://github.com/apache/arrow/pull/8365#discussion_r551542022 ## File path: r/src/array_to_vector.cpp ## @@ -288,36 +290,104 @@ struct Converter_String : public Converter { } StringArrayType* string_array =

[GitHub] [arrow] github-actions[bot] removed a comment on pull request #9095: ARROW-10183: [C++] Apply composable futures to CSV

2021-01-04 Thread GitBox
github-actions[bot] removed a comment on pull request #9095: URL: https://github.com/apache/arrow/pull/9095#issuecomment-754137607 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW

[GitHub] [arrow] sunchao commented on a change in pull request #9047: ARROW-11072: [Rust] [Parquet] Support reading decimal from physical int types

2021-01-04 Thread GitBox
sunchao commented on a change in pull request #9047: URL: https://github.com/apache/arrow/pull/9047#discussion_r551540021 ## File path: rust/parquet/src/arrow/schema.rs ## @@ -591,6 +591,7 @@ impl ParquetTypeConverter<'_> { LogicalType::INT_32 => Ok(DataType::Int32

[GitHub] [arrow] nevi-me commented on pull request #9093: ARROW-11125: [Rust] Logical equality for list arrays

2021-01-04 Thread GitBox
nevi-me commented on pull request #9093: URL: https://github.com/apache/arrow/pull/9093#issuecomment-754186287 I saw the clippy warning, I'll fix it

[GitHub] [arrow] nevi-me commented on a change in pull request #9094: ARROW-11126: [Rust] Document and test ARROW-10656

2021-01-04 Thread GitBox
nevi-me commented on a change in pull request #9094: URL: https://github.com/apache/arrow/pull/9094#discussion_r551536755 ## File path: rust/arrow/src/record_batch.rs ## @@ -80,6 +80,10 @@ impl RecordBatch { Ok(RecordBatch { schema, columns }) } +/// Creates

[GitHub] [arrow] nealrichardson commented on a change in pull request #8365: ARROW-6582: [R] Arrow to R fails with embedded nuls in strings

2021-01-04 Thread GitBox
nealrichardson commented on a change in pull request #8365: URL: https://github.com/apache/arrow/pull/8365#discussion_r551519433 ## File path: r/src/array_to_vector.cpp ## @@ -288,36 +290,104 @@ struct Converter_String : public Converter { } StringArrayType* string_

[GitHub] [arrow] nealrichardson commented on a change in pull request #9092: ARROW-10624: [R] Proactively remove "problems" attributes

2021-01-04 Thread GitBox
nealrichardson commented on a change in pull request #9092: URL: https://github.com/apache/arrow/pull/9092#discussion_r551534161 ## File path: r/R/record-batch.R ## @@ -274,6 +274,12 @@ as.data.frame.RecordBatch <- function(x, row.names = NULL, optional = FALSE, ... } .ser

[GitHub] [arrow] Dandandan edited a comment on pull request #9090: ARROW-11123: [Rust] Use cast kernel to simplify csv parser

2021-01-04 Thread GitBox
Dandandan edited a comment on pull request #9090: URL: https://github.com/apache/arrow/pull/9090#issuecomment-754102662 @jorgecarleitao note that the `csv` `StringRecord` also verifies whether strings are utf8. It adds a bit of overhead, but the utf8 checking itself is not much for now, it

[GitHub] [arrow] alamb commented on pull request #9090: ARROW-11123: [Rust] Use cast kernel to simplify csv parser

2021-01-04 Thread GitBox
alamb commented on pull request #9090: URL: https://github.com/apache/arrow/pull/9090#issuecomment-754181158 I also like the general idea of using the `cast` kernel for consistency. Nice idea @Dandandan

[GitHub] [arrow] alamb closed pull request #9016: ARROW-11037: [Rust] Optimized creation of string array from iterator.

2021-01-04 Thread GitBox
alamb closed pull request #9016: URL: https://github.com/apache/arrow/pull/9016

[GitHub] [arrow] alamb commented on a change in pull request #9094: ARROW-11126: [Rust] Document and test ARROW-10656

2021-01-04 Thread GitBox
alamb commented on a change in pull request #9094: URL: https://github.com/apache/arrow/pull/9094#discussion_r551527136 ## File path: rust/arrow/src/record_batch.rs ## @@ -80,6 +80,10 @@ impl RecordBatch { Ok(RecordBatch { schema, columns }) } +/// Creates a
