[GitHub] [arrow] ovr commented on a change in pull request #9232: ARROW-10818: [Rust] Implement DecimalType

2021-03-03 Thread GitBox
ovr commented on a change in pull request #9232: URL: https://github.com/apache/arrow/pull/9232#discussion_r586722375 ## File path: rust/datafusion/src/physical_plan/group_scalar.rs ## @@ -22,10 +22,12 @@ use std::convert::{From, TryFrom}; use

[GitHub] [arrow] pitrou commented on pull request #9528: ARROW-8732: [C++] Add basic cancellation API

2021-03-03 Thread GitBox
pitrou commented on pull request #9528: URL: https://github.com/apache/arrow/pull/9528#issuecomment-789995923 @ursabot crossbow submit -g python This is an automated message from the Apache Git Service. To respond to the

[GitHub] [arrow] alamb commented on a change in pull request #9595: ARROW-11806: [Rust][DataFusion] Optimize join / inner join creation of indices

2021-03-03 Thread GitBox
alamb commented on a change in pull request #9595: URL: https://github.com/apache/arrow/pull/9595#discussion_r586715897 ## File path: rust/datafusion/src/physical_plan/hash_join.rs ## @@ -311,6 +319,7 @@ fn update_hash( hash: JoinHashMap, offset: usize,

[GitHub] [arrow] pitrou commented on pull request #9528: ARROW-8732: [C++] Add basic cancellation API

2021-03-03 Thread GitBox
pitrou commented on pull request #9528: URL: https://github.com/apache/arrow/pull/9528#issuecomment-789996441 @github-actions crossbow submit -g python

[GitHub] [arrow] Dandandan commented on pull request #9595: ARROW-11806: [Rust][DataFusion] Optimize join / inner join creation of indices

2021-03-03 Thread GitBox
Dandandan commented on pull request #9595: URL: https://github.com/apache/arrow/pull/9595#issuecomment-789968391 Thanks @alamb, resolved the inconsistent naming.

[GitHub] [arrow] Dandandan commented on a change in pull request #9595: ARROW-11806: [Rust][DataFusion] Optimize join / inner join creation of indices

2021-03-03 Thread GitBox
Dandandan commented on a change in pull request #9595: URL: https://github.com/apache/arrow/pull/9595#discussion_r586686916 ## File path: rust/datafusion/src/physical_plan/hash_join.rs ## @@ -311,6 +319,7 @@ fn update_hash( hash: JoinHashMap, offset: usize,

[GitHub] [arrow] pitrou commented on pull request #7179: ARROW-8732: [C++] Add basic cancellation API

2021-03-03 Thread GitBox
pitrou commented on pull request #7179: URL: https://github.com/apache/arrow/pull/7179#issuecomment-789990634 Closed, superseded by #9528.

[GitHub] [arrow] pitrou closed pull request #7179: ARROW-8732: [C++] Add basic cancellation API

2021-03-03 Thread GitBox
pitrou closed pull request #7179: URL: https://github.com/apache/arrow/pull/7179

[GitHub] [arrow] jonkeane commented on pull request #9561: ARROW-11649: [R] Add support for null_fallback to R

2021-03-03 Thread GitBox
jonkeane commented on pull request #9561: URL: https://github.com/apache/arrow/pull/9561#issuecomment-790126766 Ok, I've accepted, re-documented, and pushed. This is good to go once CI passes

[GitHub] [arrow] kou commented on a change in pull request #9622: ARROW-11836: [C++] Avoid requiring arrow_bundled_dependencies when it doesn't exist for arrow_static.

2021-03-03 Thread GitBox
kou commented on a change in pull request #9622: URL: https://github.com/apache/arrow/pull/9622#discussion_r586790211 ## File path: cpp/src/arrow/ArrowConfig.cmake.in ## @@ -71,19 +71,21 @@ if(NOT (TARGET arrow_shared OR TARGET arrow_static)) get_property(arrow_static_loc

[GitHub] [arrow] nealrichardson commented on a change in pull request #9610: ARROW-11735: [R] Allow Parquet and Arrow Dataset to be optional components

2021-03-03 Thread GitBox
nealrichardson commented on a change in pull request #9610: URL: https://github.com/apache/arrow/pull/9610#discussion_r586818018 ## File path: r/R/dataset-partition.R ## @@ -76,7 +76,9 @@ HivePartitioning$create <- dataset___HivePartitioning #' calling `hive_partition()` with

[GitHub] [arrow] seddonm1 opened a new pull request #9625: ARROW-11653: [Rust][DataFusion] Postgres String Functions: ascii, chr, initcap, repeat, reverse, to_hex

2021-03-03 Thread GitBox
seddonm1 opened a new pull request #9625: URL: https://github.com/apache/arrow/pull/9625 @alamb This is the second last of the current string functions but I think there may be one after that with new code. This implements some of the miscellaneous string functions `ascii`, `chr`,

[GitHub] [arrow] nealrichardson commented on a change in pull request #9610: ARROW-11735: [R] Allow Parquet and Arrow Dataset to be optional components

2021-03-03 Thread GitBox
nealrichardson commented on a change in pull request #9610: URL: https://github.com/apache/arrow/pull/9610#discussion_r586818947 ## File path: r/configure ## @@ -26,13 +26,13 @@ # R CMD INSTALL --configure-vars='INCLUDE_DIR=/.../include LIB_DIR=/.../lib' # Library settings

[GitHub] [arrow] nealrichardson commented on a change in pull request #9610: ARROW-11735: [R] Allow Parquet and Arrow Dataset to be optional components

2021-03-03 Thread GitBox
nealrichardson commented on a change in pull request #9610: URL: https://github.com/apache/arrow/pull/9610#discussion_r586819763 ## File path: r/tools/autobrew ## @@ -48,7 +48,8 @@ fi # Hardcode this for my custom autobrew build rm -f $BREWDIR/lib/*.dylib

[GitHub] [arrow] github-actions[bot] commented on pull request #9625: ARROW-11653: [Rust][DataFusion] Postgres String Functions: ascii, chr, initcap, repeat, reverse, to_hex

2021-03-03 Thread GitBox
github-actions[bot] commented on pull request #9625: URL: https://github.com/apache/arrow/pull/9625#issuecomment-790096651 https://issues.apache.org/jira/browse/ARROW-11653

[GitHub] [arrow] lidavidm commented on pull request #9616: ARROW-11837: [C++][Dataset] expose originating Fragment on ScanTask

2021-03-03 Thread GitBox
lidavidm commented on pull request #9616: URL: https://github.com/apache/arrow/pull/9616#issuecomment-790099377 Thank you for the review and for patching up my very poor bindings :sweat_smile:

[GitHub] [arrow] ianmcook commented on a change in pull request #9610: ARROW-11735: [R] Allow Parquet and Arrow Dataset to be optional components

2021-03-03 Thread GitBox
ianmcook commented on a change in pull request #9610: URL: https://github.com/apache/arrow/pull/9610#discussion_r586830050 ## File path: r/R/dataset-partition.R ## @@ -76,7 +76,9 @@ HivePartitioning$create <- dataset___HivePartitioning #' calling `hive_partition()` with no

[GitHub] [arrow] alamb commented on pull request #9624: ARROW-11845: [Rust] Implement to_isize() for ArrowNativeTypes

2021-03-03 Thread GitBox
alamb commented on pull request #9624: URL: https://github.com/apache/arrow/pull/9624#issuecomment-790144154 Thanks for this @ericwburden -- I hope to find time to review this tomorrow

[GitHub] [arrow] seddonm1 commented on pull request #9625: ARROW-11653: [Rust][DataFusion] Postgres String Functions: ascii, chr, initcap, repeat, reverse, to_hex

2021-03-03 Thread GitBox
seddonm1 commented on pull request #9625: URL: https://github.com/apache/arrow/pull/9625#issuecomment-790153156 @andygrove no problem. I will create a macro for that.

[GitHub] [arrow] rok commented on pull request #9606: ARROW-10405: [C++] IsIn kernel should be able to lookup dictionary in string

2021-03-03 Thread GitBox
rok commented on pull request #9606: URL: https://github.com/apache/arrow/pull/9606#issuecomment-790159420 > Thanks for doing this. Can you add tests on the C++ side? I expect tests for `IsIn` and `IndexIn`, with non-trivial dictionary arrays, also in cases where not all dictionary values

[GitHub] [arrow] kou commented on pull request #9616: ARROW-11837: [C++][Dataset] expose originating Fragment on ScanTask

2021-03-03 Thread GitBox
kou commented on pull request #9616: URL: https://github.com/apache/arrow/pull/9616#issuecomment-790161292 CI is green. I'll merge this. Thanks for updating GLib and Ruby parts too!

[GitHub] [arrow] nealrichardson commented on a change in pull request #9610: ARROW-11735: [R] Allow Parquet and Arrow Dataset to be optional components

2021-03-03 Thread GitBox
nealrichardson commented on a change in pull request #9610: URL: https://github.com/apache/arrow/pull/9610#discussion_r586859667 ## File path: r/R/dataset-partition.R ## @@ -76,7 +76,9 @@ HivePartitioning$create <- dataset___HivePartitioning #' calling `hive_partition()` with

[GitHub] [arrow] alamb commented on pull request #9595: ARROW-11806: [Rust][DataFusion] Optimize join / inner join creation of indices

2021-03-03 Thread GitBox
alamb commented on pull request #9595: URL: https://github.com/apache/arrow/pull/9595#issuecomment-790142862 @Dandandan oh no! It now seems to have failed `rust fmt` linting

[GitHub] [arrow] alamb commented on pull request #9625: ARROW-11653: [Rust][DataFusion] Postgres String Functions: ascii, chr, initcap, repeat, reverse, to_hex

2021-03-03 Thread GitBox
alamb commented on pull request #9625: URL: https://github.com/apache/arrow/pull/9625#issuecomment-790143072 Thanks @seddonm1 -- I plan to review this tomorrow

[GitHub] [arrow] nealrichardson commented on a change in pull request #9579: ARROW-11774: [R] macos one line install

2021-03-03 Thread GitBox
nealrichardson commented on a change in pull request #9579: URL: https://github.com/apache/arrow/pull/9579#discussion_r586867436 ## File path: r/tools/nixlibs.R ## @@ -396,6 +401,7 @@ cmake_version <- function(cmd = "cmake") { with_s3_support <- function(env_vars) {

[GitHub] [arrow] rok commented on a change in pull request #9606: ARROW-10405: [C++] IsIn kernel should be able to lookup dictionary in string

2021-03-03 Thread GitBox
rok commented on a change in pull request #9606: URL: https://github.com/apache/arrow/pull/9606#discussion_r586896681 ## File path: r/tests/testthat/test-Array.R ## @@ -727,6 +727,15 @@ test_that("[ accepts Arrays and otherwise handles bad input", { ) }) +test_that("[

[GitHub] [arrow] kou closed pull request #9616: ARROW-11837: [C++][Dataset] expose originating Fragment on ScanTask

2021-03-03 Thread GitBox
kou closed pull request #9616: URL: https://github.com/apache/arrow/pull/9616

[GitHub] [arrow] nealrichardson commented on a change in pull request #9610: ARROW-11735: [R] Allow Parquet and Arrow Dataset to be optional components

2021-03-03 Thread GitBox
nealrichardson commented on a change in pull request #9610: URL: https://github.com/apache/arrow/pull/9610#discussion_r586743366 ## File path: r/configure ## @@ -182,15 +187,33 @@ if [ $? -eq 0 ] || [ "$UNAME" = "Darwin" ]; then # Always build with arrow on macOS

[GitHub] [arrow] ianmcook commented on a change in pull request #9610: ARROW-11735: [R] Allow Parquet and Arrow Dataset to be optional components

2021-03-03 Thread GitBox
ianmcook commented on a change in pull request #9610: URL: https://github.com/apache/arrow/pull/9610#discussion_r586750914 ## File path: r/configure ## @@ -182,15 +187,33 @@ if [ $? -eq 0 ] || [ "$UNAME" = "Darwin" ]; then # Always build with arrow on macOS

[GitHub] [arrow] ianmcook commented on a change in pull request #9610: ARROW-11735: [R] Allow Parquet and Arrow Dataset to be optional components

2021-03-03 Thread GitBox
ianmcook commented on a change in pull request #9610: URL: https://github.com/apache/arrow/pull/9610#discussion_r586763543 ## File path: r/configure ## @@ -26,13 +26,14 @@ # R CMD INSTALL --configure-vars='INCLUDE_DIR=/.../include LIB_DIR=/.../lib' # Library settings

[GitHub] [arrow] ianmcook commented on a change in pull request #9610: ARROW-11735: [R] Allow Parquet and Arrow Dataset to be optional components

2021-03-03 Thread GitBox
ianmcook commented on a change in pull request #9610: URL: https://github.com/apache/arrow/pull/9610#discussion_r586783964 ## File path: r/configure ## @@ -182,15 +187,33 @@ if [ $? -eq 0 ] || [ "$UNAME" = "Darwin" ]; then # Always build with arrow on macOS

[GitHub] [arrow] ianmcook commented on a change in pull request #9610: ARROW-11735: [R] Allow Parquet and Arrow Dataset to be optional components

2021-03-03 Thread GitBox
ianmcook commented on a change in pull request #9610: URL: https://github.com/apache/arrow/pull/9610#discussion_r586783690 ## File path: r/configure ## @@ -182,15 +187,33 @@ if [ $? -eq 0 ] || [ "$UNAME" = "Darwin" ]; then # Always build with arrow on macOS

[GitHub] [arrow] ianmcook commented on a change in pull request #9610: ARROW-11735: [R] Allow Parquet and Arrow Dataset to be optional components

2021-03-03 Thread GitBox
ianmcook commented on a change in pull request #9610: URL: https://github.com/apache/arrow/pull/9610#discussion_r586783164 ## File path: r/configure ## @@ -26,13 +26,14 @@ # R CMD INSTALL --configure-vars='INCLUDE_DIR=/.../include LIB_DIR=/.../lib' # Library settings

[GitHub] [arrow] github-actions[bot] commented on pull request #9528: ARROW-8732: [C++] Add basic cancellation API

2021-03-03 Thread GitBox
github-actions[bot] commented on pull request #9528: URL: https://github.com/apache/arrow/pull/9528#issuecomment-790018580 Revision: 848e020cfaf234411d3a0167b58dc39c030823a4 Submitted crossbow builds: [ursacomputing/crossbow @

[GitHub] [arrow] codecov-io commented on pull request #9595: ARROW-11806: [Rust][DataFusion] Optimize join / inner join creation of indices

2021-03-03 Thread GitBox
codecov-io commented on pull request #9595: URL: https://github.com/apache/arrow/pull/9595#issuecomment-790021228 # [Codecov](https://codecov.io/gh/apache/arrow/pull/9595?src=pr=h1) Report > Merging [#9595](https://codecov.io/gh/apache/arrow/pull/9595?src=pr=desc) (2869040) into

[GitHub] [arrow] nealrichardson commented on a change in pull request #9610: ARROW-11735: [R] Allow Parquet and Arrow Dataset to be optional components

2021-03-03 Thread GitBox
nealrichardson commented on a change in pull request #9610: URL: https://github.com/apache/arrow/pull/9610#discussion_r586741670 ## File path: r/configure ## @@ -145,7 +150,7 @@ else # TODO: what about non-bundled deps? BUNDLED_LIBS=`cd $LIB_DIR && ls *.a`

[GitHub] [arrow] lidavidm commented on pull request #9616: ARROW-11837: [C++][Dataset] expose originating Fragment on ScanTask

2021-03-03 Thread GitBox
lidavidm commented on pull request #9616: URL: https://github.com/apache/arrow/pull/9616#issuecomment-790032276 Looks like things finally pass after fixing the GLib and Ruby bindings. I opted not to bind Fragment::Scan since the API there is being reworked at the moment anyways.

[GitHub] [arrow] ianmcook commented on a change in pull request #9610: ARROW-11735: [R] Allow Parquet and Arrow Dataset to be optional components

2021-03-03 Thread GitBox
ianmcook commented on a change in pull request #9610: URL: https://github.com/apache/arrow/pull/9610#discussion_r586785240 ## File path: r/configure ## @@ -145,7 +150,7 @@ else # TODO: what about non-bundled deps? BUNDLED_LIBS=`cd $LIB_DIR && ls *.a`

[GitHub] [arrow] nealrichardson commented on a change in pull request #9606: ARROW-10405: [C++] IsIn kernel should be able to lookup dictionary in string

2021-03-03 Thread GitBox
nealrichardson commented on a change in pull request #9606: URL: https://github.com/apache/arrow/pull/9606#discussion_r586922616 ## File path: r/tests/testthat/test-Array.R ## @@ -723,6 +723,17 @@ test_that("[ accepts Arrays and otherwise handles bad input", { ) })

[GitHub] [arrow] nealrichardson closed pull request #9561: ARROW-11649: [R] Add support for null_fallback to R

2021-03-03 Thread GitBox
nealrichardson closed pull request #9561: URL: https://github.com/apache/arrow/pull/9561

[GitHub] [arrow] sundy-li commented on a change in pull request #9602: ARROW-11630: [Rust] Introduce limit option for sort kernel

2021-03-03 Thread GitBox
sundy-li commented on a change in pull request #9602: URL: https://github.com/apache/arrow/pull/9602#discussion_r586975774 ## File path: rust/arrow/src/compute/kernels/sort.rs ## @@ -278,12 +347,27 @@ fn sort_boolean( let valids_len = valids.len(); let nulls_len =

[GitHub] [arrow] westonpace opened a new pull request #9626: ARROW-11855: Memory leak in to_pandas when converting chunked struct array

2021-03-03 Thread GitBox
westonpace opened a new pull request #9626: URL: https://github.com/apache/arrow/pull/9626 When converting a struct with chunks to python the ownership of the arrow arrays was not being properly tracked and the deletion of the resulting pandas dataframe would leave some buffers behind. I

[GitHub] [arrow] mathyingzhou edited a comment on pull request #8648: ARROW-7906: [C++] [Python] Add ORC write support

2021-03-03 Thread GitBox
mathyingzhou edited a comment on pull request #8648: URL: https://github.com/apache/arrow/pull/8648#issuecomment-790297527 @pitrou Yes now it is ready for another review. I have fixed all the issues you mentioned, found and fixed a previously hidden bug and shortened the tests to about

[GitHub] [arrow] sighingnow commented on a change in pull request #9622: ARROW-11836: [C++] Avoid requiring arrow_bundled_dependencies when it doesn't exist for arrow_static.

2021-03-03 Thread GitBox
sighingnow commented on a change in pull request #9622: URL: https://github.com/apache/arrow/pull/9622#discussion_r586992064 ## File path: cpp/src/arrow/ArrowConfig.cmake.in ## @@ -71,19 +71,21 @@ if(NOT (TARGET arrow_shared OR TARGET arrow_static))

[GitHub] [arrow] seddonm1 commented on pull request #9625: ARROW-11653: [Rust][DataFusion] Postgres String Functions: ascii, chr, initcap, repeat, reverse, to_hex

2021-03-03 Thread GitBox
seddonm1 commented on pull request #9625: URL: https://github.com/apache/arrow/pull/9625#issuecomment-790222729 Thanks @andygrove . I have a few more PRs to do to finish this first phase of work. Then I think it's time to tackle type coercion.

[GitHub] [arrow] github-actions[bot] commented on pull request #9626: ARROW-11855: Memory leak in to_pandas when converting chunked struct array

2021-03-03 Thread GitBox
github-actions[bot] commented on pull request #9626: URL: https://github.com/apache/arrow/pull/9626#issuecomment-790226986 https://issues.apache.org/jira/browse/ARROW-11855

[GitHub] [arrow] sighingnow commented on a change in pull request #9622: ARROW-11836: [C++] Avoid requiring arrow_bundled_dependencies when it doesn't exist for arrow_static.

2021-03-03 Thread GitBox
sighingnow commented on a change in pull request #9622: URL: https://github.com/apache/arrow/pull/9622#discussion_r586993756 ## File path: cpp/src/arrow/ArrowConfig.cmake.in ## @@ -71,19 +71,21 @@ if(NOT (TARGET arrow_shared OR TARGET arrow_static))

[GitHub] [arrow] mathyingzhou edited a comment on pull request #8648: ARROW-7906: [C++] [Python] Add ORC write support

2021-03-03 Thread GitBox
mathyingzhou edited a comment on pull request #8648: URL: https://github.com/apache/arrow/pull/8648#issuecomment-790297527 @pitrou Yes now it is ready for another review. I have fixed all the issues you mentioned and shortened the tests to about 650 lines (with more tests!) It should be

[GitHub] [arrow] dmyersturnbull opened a new issue #9628: write_feather incorrectly deletes files

2021-03-03 Thread GitBox
dmyersturnbull opened a new issue #9628: URL: https://github.com/apache/arrow/issues/9628 I'm happy to fill this out on Jira and/or to submit a pull request. I just can't log in to the Jira right now -- as soon as I log in it "forgets". Two related issues about

[GitHub] [arrow] mathyingzhou commented on pull request #8648: ARROW-7906: [C++] [Python] Add ORC write support

2021-03-03 Thread GitBox
mathyingzhou commented on pull request #8648: URL: https://github.com/apache/arrow/pull/8648#issuecomment-790297527 @pitrou Yes now it is ready for another review.

[GitHub] [arrow] kou closed pull request #9622: ARROW-11836: [C++] Avoid requiring arrow_bundled_dependencies when it doesn't exist for arrow_static.

2021-03-03 Thread GitBox
kou closed pull request #9622: URL: https://github.com/apache/arrow/pull/9622

[GitHub] [arrow] westonpace opened a new pull request #9627: ARROW-11856: [C++] Remove unused reference to RecordBatchStreamWriter

2021-03-03 Thread GitBox
westonpace opened a new pull request #9627: URL: https://github.com/apache/arrow/pull/9627 The type RecordBatchStreamWriter was in a type_fwd but never implemented anywhere. The proper type would be RecordBatchWriter

[GitHub] [arrow] github-actions[bot] commented on pull request #9627: ARROW-11856: [C++] Remove unused reference to RecordBatchStreamWriter

2021-03-03 Thread GitBox
github-actions[bot] commented on pull request #9627: URL: https://github.com/apache/arrow/pull/9627#issuecomment-790353990 https://issues.apache.org/jira/browse/ARROW-11856

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-03-03 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-790239103 > Congratulations @liyafan82 ! Do you have an idea how hard it will be to add zstd support? @pitrou Support for zstd should be much easier, as you can see, most of the

[GitHub] [arrow] alamb commented on pull request #9213: ARROW-11266: [Rust][DataFusion] Implement vectorized hashing for hash aggregate [WIP]

2021-03-03 Thread GitBox
alamb commented on pull request #9213: URL: https://github.com/apache/arrow/pull/9213#issuecomment-789646626 @Dandandan I am closing this PR for the time being to clean up the Rust/Arrow PR backlog. Please let me know if this is a mistake

[GitHub] [arrow] alamb closed pull request #9086: [Rust] [DataFusion] [Experiment] Blocking threads filter

2021-03-03 Thread GitBox
alamb closed pull request #9086: URL: https://github.com/apache/arrow/pull/9086

[GitHub] [arrow] alamb closed pull request #9213: ARROW-11266: [Rust][DataFusion] Implement vectorized hashing for hash aggregate [WIP]

2021-03-03 Thread GitBox
alamb closed pull request #9213: URL: https://github.com/apache/arrow/pull/9213

[GitHub] [arrow] alamb commented on pull request #9086: [Rust] [DataFusion] [Experiment] Blocking threads filter

2021-03-03 Thread GitBox
alamb commented on pull request #9086: URL: https://github.com/apache/arrow/pull/9086#issuecomment-789646295 @jorgecarleitao I am closing this PR for the time being to clean up the Rust/Arrow PR backlog. Please let me know if this is a mistake

[GitHub] [arrow] alamb commented on pull request #9111: ARROW-11140: [Rust] [CI] Experimenting with Buildkite

2021-03-03 Thread GitBox
alamb commented on pull request #9111: URL: https://github.com/apache/arrow/pull/9111#issuecomment-789646411 @jorgecarleitao I am closing this PR for the time being to clean up the Rust/Arrow PR backlog. Please let me know if this is a mistake

[GitHub] [arrow] alamb closed pull request #9111: ARROW-11140: [Rust] [CI] Experimenting with Buildkite

2021-03-03 Thread GitBox
alamb closed pull request #9111: URL: https://github.com/apache/arrow/pull/9111

[GitHub] [arrow] alamb commented on pull request #9605: ARROW-11802: [Rust][DataFusion] Remove use of crossbeam channels to avoid potential deadlocks

2021-03-03 Thread GitBox
alamb commented on pull request #9605: URL: https://github.com/apache/arrow/pull/9605#issuecomment-789662262 I checked this code out locally, merged from apache/master, and reran the tests -- everything still passes, so merging in.

[GitHub] [arrow] alamb closed pull request #9605: ARROW-11802: [Rust][DataFusion] Remove use of crossbeam channels to avoid potential deadlocks

2021-03-03 Thread GitBox
alamb closed pull request #9605: URL: https://github.com/apache/arrow/pull/9605

[GitHub] [arrow] Dandandan commented on pull request #9605: ARROW-11802: [Rust][DataFusion] Remove use of crossbeam channels to avoid potential deadlocks

2021-03-03 Thread GitBox
Dandandan commented on pull request #9605: URL: https://github.com/apache/arrow/pull/9605#issuecomment-789670134 I also tested partitioning parquet & reading parquet from datafusion - all worked OK :+1:

[GitHub] [arrow] github-actions[bot] commented on pull request #9624: ARROW-11845: [Rust] Implement to_isize() for ArrowNativeTypes

2021-03-03 Thread GitBox
github-actions[bot] commented on pull request #9624: URL: https://github.com/apache/arrow/pull/9624#issuecomment-789710271 https://issues.apache.org/jira/browse/ARROW-11845

[GitHub] [arrow] paddyhoran commented on a change in pull request #9602: ARROW-11630: [Rust] Introduce limit option for sort kernel

2021-03-03 Thread GitBox
paddyhoran commented on a change in pull request #9602: URL: https://github.com/apache/arrow/pull/9602#discussion_r586324951 ## File path: rust/arrow/src/compute/kernels/sort.rs ## @@ -278,12 +347,27 @@ fn sort_boolean( let valids_len = valids.len(); let nulls_len =

[GitHub] [arrow] alamb commented on pull request #9232: ARROW-10818: [Rust] Implement DecimalType

2021-03-03 Thread GitBox
alamb commented on pull request #9232: URL: https://github.com/apache/arrow/pull/9232#issuecomment-789647138 @ovr what do you think we should do with this PR? Is it worth keeping open as a Draft or shall we close it for now?

[GitHub] [arrow] alamb commented on pull request #9313: [Rust] [Experiment] Trigonometry kernels

2021-03-03 Thread GitBox
alamb commented on pull request #9313: URL: https://github.com/apache/arrow/pull/9313#issuecomment-789647360 @nevi-me I am closing this PR for the time being to clean up the Rust/Arrow PR backlog. Please let me know if this is a mistake

[GitHub] [arrow] alamb closed pull request #9313: [Rust] [Experiment] Trigonometry kernels

2021-03-03 Thread GitBox
alamb closed pull request #9313: URL: https://github.com/apache/arrow/pull/9313

[GitHub] [arrow] alamb commented on pull request #9603: ARROW-11687: [Rust][DataFusion] RepartitionExec Hanging Test

2021-03-03 Thread GitBox
alamb commented on pull request #9603: URL: https://github.com/apache/arrow/pull/9603#issuecomment-789650069 Incorporated in https://github.com/apache/arrow/pull/9605 so closing this PR

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-03-03 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-789649806 > I've restarted the integration CI job, it seemed stuck downloading the docker image. @pitrou Thanks for your help.

[GitHub] [arrow] alamb closed pull request #9603: ARROW-11687: [Rust][DataFusion] RepartitionExec Hanging Test

2021-03-03 Thread GitBox
alamb closed pull request #9603: URL: https://github.com/apache/arrow/pull/9603

[GitHub] [arrow] lidavidm commented on pull request #9616: ARROW-11837: [C++][Dataset] expose originating Fragment on ScanTask

2021-03-03 Thread GitBox
lidavidm commented on pull request #9616: URL: https://github.com/apache/arrow/pull/9616#issuecomment-789698947 Ah, this is broken until I fully define GADInMemoryFragment...let me fix that up

[GitHub] [arrow] alamb edited a comment on pull request #9598: ARROW-11804: [Developer] Offer to create JIRA issue

2021-03-03 Thread GitBox
alamb edited a comment on pull request #9598: URL: https://github.com/apache/arrow/pull/9598#issuecomment-789623293 > (I meant to start that with: thanks for taking the initiative to make our processes better! ❤️) Thank you! > As an alternative approach, what if we had a

[GitHub] [arrow] alamb commented on pull request #9598: ARROW-11804: [Developer] Offer to create JIRA issue

2021-03-03 Thread GitBox
alamb commented on pull request #9598: URL: https://github.com/apache/arrow/pull/9598#issuecomment-789623293 > (I meant to start that with: thanks for taking the initiative to make our processes better! ❤️) Thank you! > As an alternative approach, what if we had a GitHub

[GitHub] [arrow] paddyhoran commented on a change in pull request #9602: ARROW-11630: [Rust] Introduce limit option for sort kernel

2021-03-03 Thread GitBox
paddyhoran commented on a change in pull request #9602: URL: https://github.com/apache/arrow/pull/9602#discussion_r586321776 ## File path: rust/arrow/src/compute/kernels/sort.rs ## @@ -517,20 +650,32 @@ where }, ); -if !options.descending { -

[GitHub] [arrow] alamb commented on pull request #9592: ARROW-11803: [Rust] [Parquet] Support v2 LogicalType

2021-03-03 Thread GitBox
alamb commented on pull request #9592: URL: https://github.com/apache/arrow/pull/9592#issuecomment-789652617 The failing CI check, https://github.com/apache/arrow/pull/9592/checks?check_run_id=2010574916 has the same pattern as was fixed in https://github.com/apache/arrow/pull/9593

[GitHub] [arrow] alamb closed pull request #9565: ARROW-11655: [Rust][DataFusion] Postgres String Functions: left, lpad, right, rpad

2021-03-03 Thread GitBox
alamb closed pull request #9565: URL: https://github.com/apache/arrow/pull/9565

[GitHub] [arrow] ericwburden opened a new pull request #9624: Implement to_isize() for ArrowNativeTypes

2021-03-03 Thread GitBox
ericwburden opened a new pull request #9624: URL: https://github.com/apache/arrow/pull/9624 Corrects an issue with the Debug implementation for ArrowNativeTypes (like Date32Array) that panics for negative values due to a .to_usize() call. This may impact the handling of Time32/Time64
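
The failure mode described in this message can be illustrated with a minimal, standard-library-only sketch. The helper functions `to_usize` and `to_isize` below are hypothetical stand-ins written for illustration; they are not the arrow crate's trait methods, and the actual fix in the pull request may look different. The sketch assumes the underlying value is an `i32`, as it would be for a Date32 value.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for an unsigned conversion (not the arrow crate's API).
/// Returns None when the value cannot be represented as usize.
fn to_usize(v: i32) -> Option<usize> {
    usize::try_from(v).ok()
}

/// Hypothetical stand-in for a signed conversion (not the arrow crate's API).
fn to_isize(v: i32) -> Option<isize> {
    isize::try_from(v).ok()
}

fn main() {
    // Assumed example value: a date one day before the epoch.
    let days: i32 = -1;

    // The unsigned conversion has no representation for a negative value,
    // so calling `.unwrap()` on this result would panic.
    assert_eq!(to_usize(days), None);

    // The signed conversion keeps the negative value intact.
    assert_eq!(to_isize(days), Some(-1));
}
```

Under these assumptions, any formatting code that unwraps the unsigned conversion would panic on pre-epoch values, while a signed conversion would not.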

[GitHub] [arrow] Dandandan commented on pull request #9213: ARROW-11266: [Rust][DataFusion] Implement vectorized hashing for hash aggregate [WIP]

2021-03-03 Thread GitBox
Dandandan commented on pull request #9213: URL: https://github.com/apache/arrow/pull/9213#issuecomment-789665919 @alamb thanks, probably will open this or a new one if/when I continue with it

[GitHub] [arrow] pitrou commented on a change in pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-03-03 Thread GitBox
pitrou commented on a change in pull request #8949: URL: https://github.com/apache/arrow/pull/8949#discussion_r586333980 ## File path: java/compression/src/main/java/org/apache/arrow/compression/Lz4CompressionCodec.java ## @@ -0,0 +1,159 @@ +/* + * Licensed to the Apache

[GitHub] [arrow] alamb closed pull request #9330: [Rust] [Experiment] [WIP]: Use SmallVec in ArrayData to optimize the common usecase of single-buffer arrays

2021-03-03 Thread GitBox
alamb closed pull request #9330: URL: https://github.com/apache/arrow/pull/9330

[GitHub] [arrow] alamb commented on pull request #9330: [Rust] [Experiment] [WIP]: Use SmallVec in ArrayData to optimize the common usecase of single-buffer arrays

2021-03-03 Thread GitBox
alamb commented on pull request #9330: URL: https://github.com/apache/arrow/pull/9330#issuecomment-789647553 @jhorstmann I am closing this PR for the time being to clean up the Rust/Arrow PR backlog. Please let me know if this is a mistake

[GitHub] [arrow] alamb commented on pull request #9494: ARROW-11626: [Rust][DataFusion] Move [DataFusion] examples to own project

2021-03-03 Thread GitBox
alamb commented on pull request #9494: URL: https://github.com/apache/arrow/pull/9494#issuecomment-789649499 @Dandandan I think this idea is worth pursuing -- would you be willing to resurrect this PR?

[GitHub] [arrow] alamb commented on a change in pull request #9565: ARROW-11655: [Rust][DataFusion] Postgres String Functions: left, lpad, right, rpad

2021-03-03 Thread GitBox
alamb commented on a change in pull request #9565: URL: https://github.com/apache/arrow/pull/9565#discussion_r586351860 ## File path: rust/datafusion/tests/sql.rs ## @@ -2058,17 +2058,53 @@ async fn test_string_expressions() -> Result<()> {

[GitHub] [arrow] Dandandan commented on pull request #9493: ARROW-11624: [Rust] Move Arrow benchmarks to its own crate

2021-03-03 Thread GitBox
Dandandan commented on pull request #9493: URL: https://github.com/apache/arrow/pull/9493#issuecomment-789668717 @alamb let's park the discussion for now and see if there are other areas of improvement. I think it still could be a good idea to decrease compile / dev times, but the

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-03-03 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-789684969 The integration tests have passed. Please take another look when you have time, dear reviewers. (maybe just review the last three commits) Thanks a lot.

[GitHub] [arrow] github-actions[bot] commented on pull request #9624: ARROW-11845: [Rust] Implement to_isize() for ArrowNativeTypes

2021-03-03 Thread GitBox
github-actions[bot] commented on pull request #9624: URL: https://github.com/apache/arrow/pull/9624#issuecomment-789694328 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then

[GitHub] [arrow] ovr commented on pull request #9232: ARROW-10818: [Rust] Implement DecimalType

2021-03-03 Thread GitBox
ovr commented on pull request #9232: URL: https://github.com/apache/arrow/pull/9232#issuecomment-789732012 > @ovr what do you think we should do with this PR? Is it worth keeping open as a Draft or shall we close it for now? It's time to finish it. So, I've rebased PR and drop

[GitHub] [arrow] alamb commented on pull request #9493: ARROW-11624: [Rust] Move Arrow benchmarks to its own crate

2021-03-03 Thread GitBox
alamb commented on pull request #9493: URL: https://github.com/apache/arrow/pull/9493#issuecomment-789648942 @Dandandan what do you think we should do with this PR? Is it something we should work on getting in? Or should we park the discussion for now? I doubt this is going to

[GitHub] [arrow] alamb edited a comment on pull request #9565: ARROW-11655: [Rust][DataFusion] Postgres String Functions: left, lpad, right, rpad

2021-03-03 Thread GitBox
alamb edited a comment on pull request #9565: URL: https://github.com/apache/arrow/pull/9565#issuecomment-789658697 Thanks again @seddonm1 -- I think this PR is ready to go and has been hanging out for a long time -- we can improve things in subsequent PRs. I am going to merge it in. FYI

[GitHub] [arrow] alamb commented on pull request #9565: ARROW-11655: [Rust][DataFusion] Postgres String Functions: left, lpad, right, rpad

2021-03-03 Thread GitBox
alamb commented on pull request #9565: URL: https://github.com/apache/arrow/pull/9565#issuecomment-789658697 Thanks again @seddonm1 -- I think this PR is ready to go and has been hanging out for a long time -- we can improve things in subsequent PRs. I am going to merge it in.

[GitHub] [arrow] alamb commented on pull request #9595: ARROW-11806: [Rust][DataFusion] Optimize join / inner join creation of indices

2021-03-03 Thread GitBox
alamb commented on pull request #9595: URL: https://github.com/apache/arrow/pull/9595#issuecomment-789714615 The integration test failure in https://github.com/apache/arrow/pull/9595/checks?check_run_id=1998235390 seems to be the same as was fixed in

[GitHub] [arrow] paddyhoran commented on a change in pull request #9602: ARROW-11630: [Rust] Introduce limit option for sort kernel

2021-03-03 Thread GitBox
paddyhoran commented on a change in pull request #9602: URL: https://github.com/apache/arrow/pull/9602#discussion_r586323373 ## File path: rust/arrow/src/compute/kernels/sort.rs ## @@ -278,12 +347,27 @@ fn sort_boolean( let valids_len = valids.len(); let nulls_len =

[GitHub] [arrow] pitrou commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-03-03 Thread GitBox
pitrou commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-789645531 I've restarted the integration CI job, it seemed stuck downloading the docker image.

[GitHub] [arrow] alamb closed pull request #9567: ARROW-11775: [Rust][DataFusion] Feature Flags for Dependencies

2021-03-03 Thread GitBox
alamb closed pull request #9567: URL: https://github.com/apache/arrow/pull/9567

[GitHub] [arrow] alamb commented on a change in pull request #9565: ARROW-11655: [Rust][DataFusion] Postgres String Functions: left, lpad, right, rpad

2021-03-03 Thread GitBox
alamb commented on a change in pull request #9565: URL: https://github.com/apache/arrow/pull/9565#discussion_r586347165 ## File path: rust/datafusion/tests/sql.rs ## @@ -2058,17 +2058,53 @@ async fn test_string_expressions() -> Result<()> {
