[GitHub] [arrow] edponce commented on a change in pull request #10274: ARROW-12685: [C++][Compute] Add unary absolute value kernel

2021-05-11 Thread GitBox
edponce commented on a change in pull request #10274: URL: https://github.com/apache/arrow/pull/10274#discussion_r630707124 ## File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc ## @@ -1057,5 +1058,109 @@ TYPED_TEST(TestUnaryArithmeticFloating, Negate) { }

[GitHub] [arrow-rs] jorgecarleitao commented on issue #286: Unable to load Feather v2 files created by pyarrow and pandas.

2021-05-11 Thread GitBox
jorgecarleitao commented on issue #286: URL: https://github.com/apache/arrow-rs/issues/286#issuecomment-839446499 I did not know this: is `feather` compatible with IPC? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #320: Implement hash partitioned aggregation

2021-05-11 Thread GitBox
Dandandan commented on a change in pull request #320: URL: https://github.com/apache/arrow-datafusion/pull/320#discussion_r630738770 ## File path: datafusion/src/physical_plan/planner.rs ## @@ -184,19 +184,54 @@ impl DefaultPhysicalPlanner { let final_group:

[GitHub] [arrow] cyb70289 commented on a change in pull request #10274: ARROW-12685: [C++][compute] add unary absolute value kernel

2021-05-11 Thread GitBox
cyb70289 commented on a change in pull request #10274: URL: https://github.com/apache/arrow/pull/10274#discussion_r630683291 ## File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc ## @@ -66,6 +66,50 @@ constexpr Unsigned to_unsigned(T signed_) { return

[GitHub] [arrow-datafusion] codecov-commenter commented on pull request #326: Remove references to Ballista Docker images published to ballistacompute Docker Hub repo

2021-05-11 Thread GitBox
codecov-commenter commented on pull request #326: URL: https://github.com/apache/arrow-datafusion/pull/326#issuecomment-839392396 #

[GitHub] [arrow-datafusion] codecov-commenter edited a comment on pull request #281: add integration test to compare datafusion-cli against psql

2021-05-11 Thread GitBox
codecov-commenter edited a comment on pull request #281: URL: https://github.com/apache/arrow-datafusion/pull/281#issuecomment-833973730 #

[GitHub] [arrow-rs] ghuls commented on issue #286: Unable to load Feather v2 files created by pyarrow and pandas.

2021-05-11 Thread GitBox
ghuls commented on issue #286: URL: https://github.com/apache/arrow-rs/issues/286#issuecomment-839462080 It should be IPC on disk with optional compression with lz4 or zstd: https://arrow.apache.org/docs/python/feather.html https://ursalabs.org/blog/2020-feather-v2/

[GitHub] [arrow-datafusion] Jimexist opened a new pull request #323: add print options and allow timing info to be turned on/off

2021-05-11 Thread GitBox
Jimexist opened a new pull request #323: URL: https://github.com/apache/arrow-datafusion/pull/323 # Which issue does this PR close? add print options and allow timing info to be turned on/off Closes #. # Rationale for this change so that we can do better

[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #320: Implement hash partitioned aggregation

2021-05-11 Thread GitBox
Dandandan commented on a change in pull request #320: URL: https://github.com/apache/arrow-datafusion/pull/320#discussion_r630723213 ## File path: ballista/rust/scheduler/src/test_utils.rs ## @@ -32,10 +32,8 @@ pub const TPCH_TABLES: &[] = &[ pub fn

[GitHub] [arrow] anthonylouisbsb opened a new pull request #10300: ARROW-12699: [Java] Generate a jar compatible with Linux and MacOS for all Arrow components

2021-05-11 Thread GitBox
anthonylouisbsb opened a new pull request #10300: URL: https://github.com/apache/arrow/pull/10300 Change the build to generate Arrow's library jar files containing the C++ shared libs for both Linux and macOS. **Note**: It only generates the artifact jars for the components

[GitHub] [arrow] github-actions[bot] commented on pull request #10300: ARROW-12699: [Java] Generate a jar compatible with Linux and MacOS for all Arrow components

2021-05-11 Thread GitBox
github-actions[bot] commented on pull request #10300: URL: https://github.com/apache/arrow/pull/10300#issuecomment-839457607 https://issues.apache.org/jira/browse/ARROW-12699

[GitHub] [arrow-datafusion] Jeeesie opened a new issue #327: How can I make ballista distributed compute work?

2021-05-11 Thread GitBox
Jeeesie opened a new issue #327: URL: https://github.com/apache/arrow-datafusion/issues/327 I want to execute the benchmark q1.sql in distributed mode, and I noticed that in from_proto.rs there is PhysicalPlanType::ParquetScan, in which we can use ParquetExec::try_from_files() to make several

[GitHub] [arrow] emkornfield commented on pull request #10239: ARROW-12643: [Governance] Added experimental repos guidelines.

2021-05-11 Thread GitBox
emkornfield commented on pull request #10239: URL: https://github.com/apache/arrow/pull/10239#issuecomment-839469540 Sorry for the delay. LGTM to me. I'll wait for feedback from others before merging.

[GitHub] [arrow] cyb70289 commented on a change in pull request #10274: ARROW-12685: [C++][compute] add unary absolute value kernel

2021-05-11 Thread GitBox
cyb70289 commented on a change in pull request #10274: URL: https://github.com/apache/arrow/pull/10274#discussion_r630690033 ## File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc ## @@ -1057,5 +1058,109 @@ TYPED_TEST(TestUnaryArithmeticFloating, Negate) { }

[GitHub] [arrow] edponce commented on a change in pull request #10274: ARROW-12685: [C++][compute] add unary absolute value kernel

2021-05-11 Thread GitBox
edponce commented on a change in pull request #10274: URL: https://github.com/apache/arrow/pull/10274#discussion_r630698902 ## File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc ## @@ -66,6 +66,50 @@ constexpr Unsigned to_unsigned(T signed_) { return

[GitHub] [arrow] westonpace commented on pull request #10205: ARROW-12004: [C++] Result is annoying

2021-05-11 Thread GitBox
westonpace commented on pull request #10205: URL: https://github.com/apache/arrow/pull/10205#issuecomment-839428127 > Overall I'm a bit surprised by the amount of complication that seems necessary. Is there a way to straighten this up. Most of your comments should be addressable.

[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #281: add integration test to compare datafusion-cli against psql

2021-05-11 Thread GitBox
Jimexist commented on a change in pull request #281: URL: https://github.com/apache/arrow-datafusion/pull/281#discussion_r630651882 ## File path: integration-tests/sqls/simple_math_expressions.sql ## @@ -0,0 +1,22 @@ +-- Licensed to the Apache Software Foundation (ASF) under

[GitHub] [arrow] cyb70289 opened a new pull request #10298: ARROW-12490: [Dev] Use only miniforge in verify-release-candidate.sh

2021-05-11 Thread GitBox
cyb70289 opened a new pull request #10298: URL: https://github.com/apache/arrow/pull/10298

[GitHub] [arrow] github-actions[bot] commented on pull request #10298: ARROW-12490: [Dev] Use only miniforge in verify-release-candidate.sh

2021-05-11 Thread GitBox
github-actions[bot] commented on pull request #10298: URL: https://github.com/apache/arrow/pull/10298#issuecomment-839372319 https://issues.apache.org/jira/browse/ARROW-12490

[GitHub] [arrow-rs] ghuls edited a comment on issue #286: Unable to load Feather v2 files created by pyarrow and pandas.

2021-05-11 Thread GitBox
ghuls edited a comment on issue #286: URL: https://github.com/apache/arrow-rs/issues/286#issuecomment-839462080 It should be IPC on disk with optional compression with lz4 or zstd: https://arrow.apache.org/docs/python/feather.html https://ursalabs.org/blog/2020-feather-v2/

[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #281: add integration test to compare datafusion-cli against psql

2021-05-11 Thread GitBox
Jimexist commented on a change in pull request #281: URL: https://github.com/apache/arrow-datafusion/pull/281#discussion_r630664326 ## File path: integration-tests/sqls/simple_math_expressions.sql ## @@ -0,0 +1,22 @@ +-- Licensed to the Apache Software Foundation (ASF) under

[GitHub] [arrow] cyb70289 commented on pull request #10298: ARROW-12490: [Dev] Use only miniforge in verify-release-candidate.sh

2021-05-11 Thread GitBox
cyb70289 commented on pull request #10298: URL: https://github.com/apache/arrow/pull/10298#issuecomment-839373461 I tested on `Linux/x86_64`, `Linux/aarch64`, `MacOSX/arm64`, but **not** `MacOSX/x86_64`.

[GitHub] [arrow] github-actions[bot] commented on pull request #10299: ARROW-12656: [C++][Gandiva] Implement castVARCHAR for date, intervalDay and intervalYear

2021-05-11 Thread GitBox
github-actions[bot] commented on pull request #10299: URL: https://github.com/apache/arrow/pull/10299#issuecomment-839379441 https://issues.apache.org/jira/browse/ARROW-12656

[GitHub] [arrow] rodrigojdebem opened a new pull request #10299: ARROW-12656: [C++][Gandiva] Implement castVARCHAR for date, intervalDay and intervalYear

2021-05-11 Thread GitBox
rodrigojdebem opened a new pull request #10299: URL: https://github.com/apache/arrow/pull/10299

[GitHub] [arrow-datafusion] andygrove commented on pull request #326: Remove references to Ballista Docker images published to ballistacompute Docker Hub repo

2021-05-11 Thread GitBox
andygrove commented on pull request #326: URL: https://github.com/apache/arrow-datafusion/pull/326#issuecomment-839387697 @edrevo fyi

[GitHub] [arrow] rodrigojdebem closed pull request #10249: ARROW-12656: [C++][Gandiva] Implement castVARCHAR for date and intervalDay

2021-05-11 Thread GitBox
rodrigojdebem closed pull request #10249: URL: https://github.com/apache/arrow/pull/10249

[GitHub] [arrow-datafusion] andygrove opened a new pull request #326: Ballista packaging

2021-05-11 Thread GitBox
andygrove opened a new pull request #326: URL: https://github.com/apache/arrow-datafusion/pull/326 # Which issue does this PR close? Closes #325. # Rationale for this change # What changes are included in this PR? # Are there any user-facing

[GitHub] [arrow-datafusion] Jeeesie opened a new issue #316: no method named `select_nth_unstable_by` found for mutable reference ` [T]`

2021-05-11 Thread GitBox
Jeeesie opened a new issue #316: URL: https://github.com/apache/arrow-datafusion/issues/316 The Rust toolchain is nightly-2020-04-22-x86_64-pc-windows-msvc (unchanged - rustc 1.44.0-nightly). When I use `cargo build` to build the projects under the path $Ballista_Home/rust, I hit the

[GitHub] [arrow] pitrou commented on a change in pull request #10255: ARROW-12661: [C++] Add ReaderOptions::skip_rows_after_names

2021-05-11 Thread GitBox
pitrou commented on a change in pull request #10255: URL: https://github.com/apache/arrow/pull/10255#discussion_r629977397 ## File path: cpp/src/arrow/csv/reader_test.cc ## @@ -216,5 +216,83 @@ TEST(StreamingReaderTests, NestedParallelism) {

[GitHub] [arrow] pitrou closed pull request #10283: ARROW-12533: [C++] Add random real distribution function

2021-05-11 Thread GitBox
pitrou closed pull request #10283: URL: https://github.com/apache/arrow/pull/10283

[GitHub] [arrow] pitrou commented on pull request #10283: ARROW-12533: [C++] Add random real distribution function

2021-05-11 Thread GitBox
pitrou commented on pull request #10283: URL: https://github.com/apache/arrow/pull/10283#issuecomment-838127459 Thank you @cyb70289 !

[GitHub] [arrow-datafusion] alamb opened a new pull request #317: Update arrow-rs deps

2021-05-11 Thread GitBox
alamb opened a new pull request #317: URL: https://github.com/apache/arrow-datafusion/pull/317 # What changes are included in this PR? Update to latest arrow-rs reference I am actively working on a more regular release schedule for arrow-rs but until then I plan to keep pulling

[GitHub] [arrow] thisisnic commented on a change in pull request #10269: ARROW-11705: [R] Support scalar value recycling in RecordBatch/Table$create()

2021-05-11 Thread GitBox
thisisnic commented on a change in pull request #10269: URL: https://github.com/apache/arrow/pull/10269#discussion_r630048707 ## File path: r/R/record-batch.R ## @@ -161,6 +161,17 @@ RecordBatch$create <- function(..., schema = NULL) { out <-

[GitHub] [arrow-rs] nevi-me commented on issue #279: Performance improvements for take by specializing on 32 / 64 bit integer indices

2021-05-11 Thread GitBox
nevi-me commented on issue #279: URL: https://github.com/apache/arrow-rs/issues/279#issuecomment-838276957 These are good opportunities to use `stdsimd`, as like all things, it'll eventually become stable

[GitHub] [arrow-datafusion] msathis commented on a change in pull request #288: [Datafusion] NOW() function support

2021-05-11 Thread GitBox
msathis commented on a change in pull request #288: URL: https://github.com/apache/arrow-datafusion/pull/288#discussion_r630071639 ## File path: datafusion/src/optimizer/constant_folding.rs ## @@ -200,6 +209,15 @@ impl<'a> ExprRewriter for ConstantRewriter<'a> {

[GitHub] [arrow] DileepSrigiri commented on a change in pull request #10160: ARROW-12563: [C++][Gandiva] Add space,add_months and datediff functions for string

2021-05-11 Thread GitBox
DileepSrigiri commented on a change in pull request #10160: URL: https://github.com/apache/arrow/pull/10160#discussion_r630083819 ## File path: cpp/src/gandiva/precompiled/epoch_time_point.h ## @@ -19,6 +19,14 @@ // TODO(wesm): IR compilation does not have any include

[GitHub] [arrow] rok commented on a change in pull request #9758: ARROW-9054: [C++] Add ScalarAggregateOptions

2021-05-11 Thread GitBox
rok commented on a change in pull request #9758: URL: https://github.com/apache/arrow/pull/9758#discussion_r629994282 ## File path: cpp/src/arrow/compute/kernels/aggregate_basic.cc ## @@ -75,48 +75,55 @@ struct CountImpl : public ScalarAggregator { Status

[GitHub] [arrow-datafusion] msathis commented on a change in pull request #288: [Datafusion] NOW() function support

2021-05-11 Thread GitBox
msathis commented on a change in pull request #288: URL: https://github.com/apache/arrow-datafusion/pull/288#discussion_r630073584 ## File path: datafusion/src/optimizer/constant_folding.rs ## @@ -200,6 +209,15 @@ impl<'a> ExprRewriter for ConstantRewriter<'a> {

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #315: Update PR template by commenting out instructions

2021-05-11 Thread GitBox
alamb commented on a change in pull request #315: URL: https://github.com/apache/arrow-datafusion/pull/315#discussion_r630021884 ## File path: .github/pull_request_template.md ## @@ -1,19 +1,25 @@ # Which issue does this PR close? + Closes #. # Rationale for this

[GitHub] [arrow-datafusion] msathis commented on a change in pull request #288: [Datafusion] NOW() function support

2021-05-11 Thread GitBox
msathis commented on a change in pull request #288: URL: https://github.com/apache/arrow-datafusion/pull/288#discussion_r630037254 ## File path: datafusion/src/optimizer/timestamp_evaluation.rs ## @@ -0,0 +1,177 @@ +// Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [arrow-rs] alamb commented on pull request #270: Fix null struct and list roundtrip

2021-05-11 Thread GitBox
alamb commented on pull request #270: URL: https://github.com/apache/arrow-rs/pull/270#issuecomment-838247734 Great job @nevi-me

[GitHub] [arrow-datafusion] msathis commented on a change in pull request #288: [Datafusion] NOW() function support

2021-05-11 Thread GitBox
msathis commented on a change in pull request #288: URL: https://github.com/apache/arrow-datafusion/pull/288#discussion_r630072536 ## File path: datafusion/src/physical_plan/datetime_expressions.rs ## @@ -268,6 +268,13 @@ pub fn to_timestamp(args: &[ColumnarValue]) -> Result

[GitHub] [arrow] pitrou commented on pull request #10240: ARROW-9530: [C++] Add option to disable jemalloc background thread on Linux

2021-05-11 Thread GitBox
pitrou commented on pull request #10240: URL: https://github.com/apache/arrow/pull/10240#issuecomment-838076933 Unfortunately, jemalloc configuration seems a bit [complicated](http://jemalloc.net/jemalloc.3.html). Some values are "read-only" in that you can only initialize them at process

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #295: Add json print format mode to datafusion cli

2021-05-11 Thread GitBox
alamb commented on a change in pull request #295: URL: https://github.com/apache/arrow-datafusion/pull/295#discussion_r630059838 ## File path: datafusion-cli/src/format/print_format.rs ## @@ -65,7 +80,89 @@ impl PrintFormat { Self::Csv => println!("{}",

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #307: fix 305 by using a scalar uint as param for zero param functions

2021-05-11 Thread GitBox
alamb commented on a change in pull request #307: URL: https://github.com/apache/arrow-datafusion/pull/307#discussion_r630079422 ## File path: datafusion/src/physical_plan/functions.rs ## @@ -1373,20 +1374,28 @@ impl PhysicalExpr for ScalarFunctionExpr { } fn

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #303: add random SQL function

2021-05-11 Thread GitBox
alamb commented on a change in pull request #303: URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r630080278 ## File path: datafusion/src/physical_plan/functions.rs ## @@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr { } fn

[GitHub] [arrow] DileepSrigiri commented on a change in pull request #10160: ARROW-12563: [C++][Gandiva] Add space,add_months and datediff functions for string

2021-05-11 Thread GitBox
DileepSrigiri commented on a change in pull request #10160: URL: https://github.com/apache/arrow/pull/10160#discussion_r630087656 ## File path: cpp/src/gandiva/precompiled/timestamp_arithmetic.cc ## @@ -47,6 +47,28 @@ bool is_last_day_of_month(const EpochTimePoint& tp) {

[GitHub] [arrow] pitrou commented on pull request #10245: ARROW-12627: [C++] Link shared libraries with -Bsymbolic-functions

2021-05-11 Thread GitBox
pitrou commented on pull request #10245: URL: https://github.com/apache/arrow/pull/10245#issuecomment-838059262 As far as I understand, it has a similar effect to `-fno-semantic-interposition`: functions inside the library don't go through shared symbol resolution (so you cannot use

[GitHub] [arrow-datafusion] alamb closed issue #294: Add json print format to datafusion cli

2021-05-11 Thread GitBox
alamb closed issue #294: URL: https://github.com/apache/arrow-datafusion/issues/294

[GitHub] [arrow-datafusion] alamb merged pull request #295: Add json print format mode to datafusion cli

2021-05-11 Thread GitBox
alamb merged pull request #295: URL: https://github.com/apache/arrow-datafusion/pull/295

[GitHub] [arrow-datafusion] alamb merged pull request #315: Update PR template by commenting out instructions

2021-05-11 Thread GitBox
alamb merged pull request #315: URL: https://github.com/apache/arrow-datafusion/pull/315

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #281: add integration test to compare datafusion-cli against psql

2021-05-11 Thread GitBox
alamb commented on a change in pull request #281: URL: https://github.com/apache/arrow-datafusion/pull/281#discussion_r630081042 ## File path: .github/workflows/rust.yml ## @@ -133,6 +132,52 @@ jobs: # snmalloc requires cmake so build without default features

[GitHub] [arrow-datafusion] msathis commented on a change in pull request #288: [Datafusion] NOW() function support

2021-05-11 Thread GitBox
msathis commented on a change in pull request #288: URL: https://github.com/apache/arrow-datafusion/pull/288#discussion_r630036998 ## File path: datafusion/src/optimizer/timestamp_evaluation.rs ## @@ -0,0 +1,177 @@ +// Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #288: [Datafusion] NOW() function support

2021-05-11 Thread GitBox
alamb commented on a change in pull request #288: URL: https://github.com/apache/arrow-datafusion/pull/288#discussion_r630061415 ## File path: datafusion/src/optimizer/constant_folding.rs ## @@ -200,6 +209,15 @@ impl<'a> ExprRewriter for ConstantRewriter<'a> {

[GitHub] [arrow-datafusion] Jimexist opened a new issue #324: add timing toggle in datafusion cli to allow timing info printing to be turned on or off

2021-05-11 Thread GitBox
Jimexist opened a new issue #324: URL: https://github.com/apache/arrow-datafusion/issues/324 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** add timing toggle in datafusion cli to allow timing info printing to be turned on

[GitHub] [arrow] cyb70289 commented on a change in pull request #10298: ARROW-12490: [Dev] Use only miniforge in verify-release-candidate.sh

2021-05-11 Thread GitBox
cyb70289 commented on a change in pull request #10298: URL: https://github.com/apache/arrow/pull/10298#discussion_r630665159 ## File path: dev/release/verify-release-candidate.sh ## @@ -217,17 +217,12 @@ setup_tempdir() { setup_miniconda() { # Setup short-lived miniconda

[GitHub] [arrow] cyb70289 removed a comment on pull request #10298: ARROW-12490: [Dev] Use only miniforge in verify-release-candidate.sh

2021-05-11 Thread GitBox
cyb70289 removed a comment on pull request #10298: URL: https://github.com/apache/arrow/pull/10298#issuecomment-839373461 I tested on `Linux/x86_64`, `Linux/aarch64`, `MacOSX/arm64`, but **not** `MacOSX/x86_64`.

[GitHub] [arrow-datafusion] andygrove opened a new issue #325: Remove references to ballistacompute Docker Hub repo

2021-05-11 Thread GitBox
andygrove opened a new issue #325: URL: https://github.com/apache/arrow-datafusion/issues/325 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Now that Ballista is part of Arrow DataFusion, we need to remove references to the

[GitHub] [arrow] cyb70289 commented on a change in pull request #10274: ARROW-12685: [C++][compute] add unary absolute value kernel

2021-05-11 Thread GitBox
cyb70289 commented on a change in pull request #10274: URL: https://github.com/apache/arrow/pull/10274#discussion_r630686322 ## File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc ## @@ -66,6 +66,50 @@ constexpr Unsigned to_unsigned(T signed_) { return

[GitHub] [arrow] jorgecarleitao commented on pull request #10239: ARROW-12643: [Governance] Added experimental repos guidelines.

2021-05-11 Thread GitBox
jorgecarleitao commented on pull request #10239: URL: https://github.com/apache/arrow/pull/10239#issuecomment-839460212 @wesm @emkornfield @andygrove , is there anything we need to do here?

[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #303: add random SQL function

2021-05-11 Thread GitBox
Jimexist commented on a change in pull request #303: URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r629927595 ## File path: datafusion/src/physical_plan/functions.rs ## @@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr { } fn

[GitHub] [arrow] cyb70289 commented on a change in pull request #9758: ARROW-9054: [C++] Add ScalarAggregateOptions

2021-05-11 Thread GitBox
cyb70289 commented on a change in pull request #9758: URL: https://github.com/apache/arrow/pull/9758#discussion_r629950742 ## File path: cpp/src/arrow/compute/kernels/aggregate_basic.cc ## @@ -75,48 +75,55 @@ struct CountImpl : public ScalarAggregator { Status

[GitHub] [arrow] kou merged pull request #10291: MINOR: [JS] document how to run benchmarks

2021-05-11 Thread GitBox
kou merged pull request #10291: URL: https://github.com/apache/arrow/pull/10291

[GitHub] [arrow-rs] ritchie46 closed issue #268: Arrow Aligned Vec

2021-05-11 Thread GitBox
ritchie46 closed issue #268: URL: https://github.com/apache/arrow-rs/issues/268

[GitHub] [arrow-rs] ritchie46 commented on issue #268: Arrow Aligned Vec

2021-05-11 Thread GitBox
ritchie46 commented on issue #268: URL: https://github.com/apache/arrow-rs/issues/268#issuecomment-838016441 Close, see discussion in #269

[GitHub] [arrow-rs] ritchie46 commented on issue #279: Performance improvements for take by specializing on 32 / 64 bit integer indices

2021-05-11 Thread GitBox
ritchie46 commented on issue #279: URL: https://github.com/apache/arrow-rs/issues/279#issuecomment-838020696 > It seems further speedups might come only from somehow removing bound checks and SIMD gather. SIMD gather is not being used atm right? I remember looking for this

[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #303: add random SQL function

2021-05-11 Thread GitBox
Jimexist commented on a change in pull request #303: URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r629941928 ## File path: datafusion/src/physical_plan/functions.rs ## @@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr { } fn

[GitHub] [arrow-rs] Dandandan edited a comment on issue #283: no method named `select_nth_unstable_by` found for mutable reference ` [T]`

2021-05-11 Thread GitBox
Dandandan edited a comment on issue #283: URL: https://github.com/apache/arrow-rs/issues/283#issuecomment-837902018 `select_nth_unstable_by` is stabilized in a later rust version (1.49). Could you upgrade the compiler version? I also moved the issue to the arrow-rs repo as it

[GitHub] [arrow-rs] Dandandan commented on issue #279: Performance improvements for take by specializing on 32 / 64 bit integer indices

2021-05-11 Thread GitBox
Dandandan commented on issue #279: URL: https://github.com/apache/arrow-rs/issues/279#issuecomment-838025166 Yeah I also did some googling - seems not supported in `packed_simd` so I guess at this point it has to be written by hand.
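
As a rough illustration of the `take` operation under discussion in issue #279, the sketch below gathers values by index and is generic over the index width, so the same code serves 32-bit and 64-bit indices. It is a toy under stated assumptions, not the arrow-rs kernel: the function name `take_checked` and its signature are hypothetical, it operates on plain slices rather than Arrow arrays, and it keeps the bounds checks that the thread suggests removing.

```rust
use std::convert::TryInto;

/// Hypothetical helper (not an arrow-rs API): gathers `values[i]` for each
/// index `i`, returning `None` if any index is out of range or does not
/// fit in `usize`.
fn take_checked<T, I>(values: &[T], indices: &[I]) -> Option<Vec<T>>
where
    T: Copy,
    I: Copy + TryInto<usize>,
{
    indices
        .iter()
        .map(|&i| {
            // Convert the index (u32, u64, ...) to usize, failing on overflow.
            let idx: usize = i.try_into().ok()?;
            // `get` performs the bounds check; `copied` turns &T into T.
            values.get(idx).copied()
        })
        // Collecting into Option<Vec<_>> stops at the first failed lookup.
        .collect()
}
```

Calling `take_checked(&values, &indices)` with either `u32` or `u64` indices compiles to a separate monomorphized copy per index type, which is the kind of per-width specialization the issue title refers to.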

[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #288: [Datafusion] NOW() function support

2021-05-11 Thread GitBox
Dandandan commented on a change in pull request #288: URL: https://github.com/apache/arrow-datafusion/pull/288#discussion_r629883952 ## File path: datafusion/src/physical_plan/datetime_expressions.rs ## @@ -268,6 +268,13 @@ pub fn to_timestamp(args: &[ColumnarValue]) ->

[GitHub] [arrow-datafusion] Dandandan commented on issue #316: no method named `select_nth_unstable_by` found for mutable reference ` [T]`

2021-05-11 Thread GitBox
Dandandan commented on issue #316: URL: https://github.com/apache/arrow-datafusion/issues/316#issuecomment-837902018 `select_nth_unstable_by` is stabilized in a later rust version (1.49). Could you upgrade the compiler version?
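
For readers unfamiliar with the method named in the error: `select_nth_unstable_by` is a standard-library slice method that partially reorders a slice so that the element at a given index lands in its sorted position. The sketch below shows one way to call it on a toolchain that has the method (1.49 or later, per the comment above); the helper name `kth_smallest` is hypothetical and is not taken from arrow-rs or DataFusion.

```rust
/// Hypothetical helper: returns the k-th smallest element (0-based) of
/// `values`, or `None` if `k` is out of range. The slice is reordered in place.
fn kth_smallest(values: &mut [i64], k: usize) -> Option<i64> {
    // select_nth_unstable_by panics when the index is out of bounds,
    // so guard against that explicitly.
    if k >= values.len() {
        return None;
    }
    // After the call, everything before position k compares <= the k-th
    // element and everything after compares >= it.
    let (_smaller, kth, _larger) = values.select_nth_unstable_by(k, |a, b| a.cmp(b));
    Some(*kth)
}
```

On an older compiler such as the 1.44 nightly mentioned in the issue, this call fails with the "no method named `select_nth_unstable_by`" error from the issue title, which is why upgrading the toolchain resolves it.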

[GitHub] [arrow-rs] Jeeesie opened a new issue #283: no method named `select_nth_unstable_by` found for mutable reference ` [T]`

2021-05-11 Thread GitBox
Jeeesie opened a new issue #283: URL: https://github.com/apache/arrow-rs/issues/283 The Rust toolchain is nightly-2020-04-22-x86_64-pc-windows-msvc (unchanged - rustc 1.44.0-nightly). When I use `cargo build` to build the projects under the path $Ballista_Home/rust, I hit the error:

[GitHub] [arrow-rs] Jeeesie closed issue #283: no method named `select_nth_unstable_by` found for mutable reference ` [T]`

2021-05-11 Thread GitBox
Jeeesie closed issue #283: URL: https://github.com/apache/arrow-rs/issues/283

[GitHub] [arrow-rs] Jeeesie commented on issue #283: no method named `select_nth_unstable_by` found for mutable reference ` [T]`

2021-05-11 Thread GitBox
Jeeesie commented on issue #283: URL: https://github.com/apache/arrow-rs/issues/283#issuecomment-837935341 Yeah, resolved. Thanks for the quick answer.

[GitHub] [arrow] rok commented on a change in pull request #9758: ARROW-9054: [C++] Add ScalarAggregateOptions

2021-05-11 Thread GitBox
rok commented on a change in pull request #9758: URL: https://github.com/apache/arrow/pull/9758#discussion_r629922962 ## File path: cpp/src/arrow/compute/kernels/aggregate_basic.cc ## @@ -75,48 +75,55 @@ struct CountImpl : public ScalarAggregator { Status

[GitHub] [arrow-rs] crepererum commented on issue #264: Include NaN in Parquet stats (again)

2021-05-11 Thread GitBox
crepererum commented on issue #264: URL: https://github.com/apache/arrow-rs/issues/264#issuecomment-837997316 I'm a bit low on resources this week. I can certainly write down the options (which are more than the proposal above) and how they link to use cases and prior art next week and

[GitHub] [arrow] github-actions[bot] commented on pull request #10274: ARROW-12685: [C++][compute] add unary absolute value kernel

2021-05-11 Thread GitBox
github-actions[bot] commented on pull request #10274: URL: https://github.com/apache/arrow/pull/10274#issuecomment-838700289 https://issues.apache.org/jira/browse/ARROW-12685 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [arrow] emkornfield closed pull request #9151: ARROW-11173: [Java] Add map type in complex reader / writer

2021-05-11 Thread GitBox
emkornfield closed pull request #9151: URL: https://github.com/apache/arrow/pull/9151 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [arrow] emkornfield commented on pull request #9147: ARROW-11177: [Java] ArrowMessage failed to parse compressed grpc stream

2021-05-11 Thread GitBox
emkornfield commented on pull request #9147: URL: https://github.com/apache/arrow/pull/9147#issuecomment-838718260 @stczwd did you want to address feedback here? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [arrow] edponce commented on pull request #10274: ARROW-12685: [C++][compute] add unary absolute value kernel

2021-05-11 Thread GitBox
edponce commented on pull request #10274: URL: https://github.com/apache/arrow/pull/10274#issuecomment-838763550 I named the compute function as `AbsoluteValue` and kernels as "absolute value" but this feels like too long a name. Convention across other libraries is "abs" but Arrow's

[GitHub] [arrow] bkietz commented on a change in pull request #9768: ARROW-12010: [C++][Compute] Improve performance of the hash table used in GroupIdentifier

2021-05-11 Thread GitBox
bkietz commented on a change in pull request #9768: URL: https://github.com/apache/arrow/pull/9768#discussion_r630404567 ## File path: cpp/src/arrow/compute/exec/key_hash.cc ## @@ -0,0 +1,247 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more

[GitHub] [arrow-rs] ghuls opened a new issue #286: Unable to load Feather v2 files created by pyarrow and pandas.

2021-05-11 Thread GitBox
ghuls opened a new issue #286: URL: https://github.com/apache/arrow-rs/issues/286 **Describe the bug** Original bug report is here (against polars, which was using arrow-rs for parsing Feather v2 files (IPC)): https://github.com/pola-rs/polars/issues/623 Unable to load

[GitHub] [arrow] github-actions[bot] commented on pull request #10294: ARROW-12736: [C++] Eliminate forced copy of potentially large vector>

2021-05-11 Thread GitBox
github-actions[bot] commented on pull request #10294: URL: https://github.com/apache/arrow/pull/10294#issuecomment-838686006 https://issues.apache.org/jira/browse/ARROW-12736 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [arrow] lidavidm opened a new pull request #10294: ARROW-12736: [C++] Eliminate forced copy of potentially large vector>

2021-05-11 Thread GitBox
lidavidm opened a new pull request #10294: URL: https://github.com/apache/arrow/pull/10294 This is one of the contributors to the regression in scan times of wide datasets in ARROW-11469 - for every column, we're copying a vector of shared_ptrs of every column, leading to a quadratic

[GitHub] [arrow] lidavidm commented on pull request #10294: ARROW-12736: [C++] Eliminate forced copy of potentially large vector>

2021-05-11 Thread GitBox
lidavidm commented on pull request #10294: URL: https://github.com/apache/arrow/pull/10294#issuecomment-838737990 @ursabot please benchmark -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [arrow] emkornfield commented on a change in pull request #10203: ARROW-5385: [Go] Implement EXTENSION datatype

2021-05-11 Thread GitBox
emkornfield commented on a change in pull request #10203: URL: https://github.com/apache/arrow/pull/10203#discussion_r630327365 ## File path: go/arrow/internal/testing/types/extension_types.go ## @@ -0,0 +1,247 @@ +// Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [arrow] ursabot edited a comment on pull request #10294: ARROW-12736: [C++] Eliminate forced copy of potentially large vector>

2021-05-11 Thread GitBox
ursabot edited a comment on pull request #10294: URL: https://github.com/apache/arrow/pull/10294#issuecomment-838739409 Benchmark runs are scheduled for baseline = 553f3d8211271e8eb576c9668e53dd5dc53c480a and contender = 23a46c10f72a55866818e6bf0537719c9a2a61dc. Results will be available

[GitHub] [arrow] bkietz closed pull request #10287: ARROW-12670: [C++] Fix extract_regex output after non-matching values

2021-05-11 Thread GitBox
bkietz closed pull request #10287: URL: https://github.com/apache/arrow/pull/10287 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please

[GitHub] [arrow] ursabot edited a comment on pull request #10294: ARROW-12736: [C++] Eliminate forced copy of potentially large vector>

2021-05-11 Thread GitBox
ursabot edited a comment on pull request #10294: URL: https://github.com/apache/arrow/pull/10294#issuecomment-838739409 Benchmark runs are scheduled for baseline = 553f3d8211271e8eb576c9668e53dd5dc53c480a and contender = 23a46c10f72a55866818e6bf0537719c9a2a61dc. Results will be available

[GitHub] [arrow] ianmcook edited a comment on pull request #10269: ARROW-11705: [R] Support scalar value recycling in RecordBatch/Table$create()

2021-05-11 Thread GitBox
ianmcook edited a comment on pull request #10269: URL: https://github.com/apache/arrow/pull/10269#issuecomment-838715745 @thisisnic please also build the docs so the new `.Rd` file is included here -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [arrow-rs] alamb edited a comment on issue #284: RecordBatch Sort Order

2021-05-11 Thread GitBox
alamb edited a comment on issue #284: URL: https://github.com/apache/arrow-rs/issues/284#issuecomment-838894369 I will start a thread on d...@arrow.apache.org https://lists.apache.org/thread.html/r851827e166cf1bdd0197b22e2d993ea1f7fb79c911f5a34689b92ae4%40%3Cdev.arrow.apache.org%3E

[GitHub] [arrow-datafusion] codecov-commenter edited a comment on pull request #307: fix 305 by using a scalar uint as param for zero param functions

2021-05-11 Thread GitBox
codecov-commenter edited a comment on pull request #307: URL: https://github.com/apache/arrow-datafusion/pull/307#issuecomment-836501466 #

[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #307: fix 305 by using a scalar uint as param for zero param functions

2021-05-11 Thread GitBox
Jimexist commented on a change in pull request #307: URL: https://github.com/apache/arrow-datafusion/pull/307#discussion_r630249637 ## File path: datafusion/src/physical_plan/functions.rs ## @@ -1373,20 +1374,28 @@ impl PhysicalExpr for ScalarFunctionExpr { } fn

[GitHub] [arrow] thisisnic commented on pull request #10269: ARROW-11705: [R] Support scalar value recycling in RecordBatch/Table$create()

2021-05-11 Thread GitBox
thisisnic commented on pull request #10269: URL: https://github.com/apache/arrow/pull/10269#issuecomment-838628591 > This looks pretty good to me! Just a few final things: > > * Could you please ensure there are spaces added: `if(` → `if (` and `){`→ `) {` > > * Could

[GitHub] [arrow-datafusion] Dandandan edited a comment on pull request #317: Update arrow-rs deps

2021-05-11 Thread GitBox
Dandandan edited a comment on pull request #317: URL: https://github.com/apache/arrow-datafusion/pull/317#issuecomment-838687606 > From parquet, this brings in some fixes for the writer. > > Do Datafusion or Ballista currently use the writer? Yes

[GitHub] [arrow] ianmcook commented on a change in pull request #10269: ARROW-11705: [R] Support scalar value recycling in RecordBatch/Table$create()

2021-05-11 Thread GitBox
ianmcook commented on a change in pull request #10269: URL: https://github.com/apache/arrow/pull/10269#discussion_r630293930 ## File path: r/R/record-batch.R ## @@ -161,6 +161,17 @@ RecordBatch$create <- function(..., schema = NULL) { out <-

[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #303: add random SQL function

2021-05-11 Thread GitBox
Dandandan commented on a change in pull request #303: URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r630297583 ## File path: datafusion/src/physical_plan/functions.rs ## @@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr { } fn

[GitHub] [arrow] ianmcook commented on pull request #10269: ARROW-11705: [R] Support scalar value recycling in RecordBatch/Table$create()

2021-05-11 Thread GitBox
ianmcook commented on pull request #10269: URL: https://github.com/apache/arrow/pull/10269#issuecomment-838715745 @thisisnic please also build the docs so `util.Rd` is included here -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [arrow] emkornfield commented on a change in pull request #10203: ARROW-5385: [Go] Implement EXTENSION datatype

2021-05-11 Thread GitBox
emkornfield commented on a change in pull request #10203: URL: https://github.com/apache/arrow/pull/10203#discussion_r630316132 ## File path: go/arrow/compare_test.go ## @@ -27,7 +27,7 @@ func TestTypeEqual(t *testing.T) { checkMetadata bool }{
