[GitHub] [arrow] liyafan82 closed pull request #8214: ARROW-9965: [Java] Improve performance of BaseFixedWidthVector.setSafe by optimizing capacity calculations

2020-09-27 Thread GitBox
liyafan82 closed pull request #8214: URL: https://github.com/apache/arrow/pull/8214 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [arrow] liyafan82 commented on pull request #8214: ARROW-9965: [Java] Improve performance of BaseFixedWidthVector.setSafe by optimizing capacity calculations

2020-09-27 Thread GitBox
liyafan82 commented on pull request #8214: URL: https://github.com/apache/arrow/pull/8214#issuecomment-699729876 Merging. The check failure is irrelevant. Thanks for the PR, @josiahyan.

[GitHub] [arrow] alippai edited a comment on pull request #8283: ARROW-9707: [Rust] [DataFusion] DataFusion Scheduler Prototype [WIP]

2020-09-27 Thread GitBox
alippai edited a comment on pull request #8283: URL: https://github.com/apache/arrow/pull/8283#issuecomment-699695827 @andygrove I think now you understand all my issues I had previously. The scheduler proposal and the recent comments regarding the concurrency are all superb, I think you

[GitHub] [arrow] nevi-me commented on pull request #8211: ARROW-10030: [Rust] Add support for `FromIter` and `IntoIter` for primitive types

2020-09-27 Thread GitBox
nevi-me commented on pull request #8211: URL: https://github.com/apache/arrow/pull/8211#issuecomment-699691004 I'll take a look at this during the week.

[GitHub] [arrow] nevi-me closed pull request #8199: ARROW-10019: [Rust] Add substring kernel

2020-09-27 Thread GitBox
nevi-me closed pull request #8199: URL: https://github.com/apache/arrow/pull/8199

[GitHub] [arrow] github-actions[bot] commented on pull request #8282: ARROW-10104: [Python] Separate tests into its own conda package

2020-09-27 Thread GitBox
github-actions[bot] commented on pull request #8282: URL: https://github.com/apache/arrow/pull/8282#issuecomment-699690223 Revision: 01fc0c058d71c3f981016294ab13f0f9e12a10b3 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] xhochy commented on pull request #8282: ARROW-10104: [Python] Separate tests into its own conda package

2020-09-27 Thread GitBox
xhochy commented on pull request #8282: URL: https://github.com/apache/arrow/pull/8282#issuecomment-699690005 @github-actions crossbow submit conda-win-vs2017-py37

[GitHub] [arrow] alamb commented on a change in pull request #8283: ARROW-9707: [Rust] [DataFusion] DataFusion Scheduler Prototype [WIP]

2020-09-27 Thread GitBox
alamb commented on a change in pull request #8283: URL: https://github.com/apache/arrow/pull/8283#discussion_r495617565 ## File path: rust/datafusion/src/scheduler/mod.rs ## @@ -0,0 +1,381 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more

[GitHub] [arrow] alamb commented on pull request #8285: ARROW-9754: [Rust] [DataFusion] Implement async in ExecutionPlan trait

2020-09-27 Thread GitBox
alamb commented on pull request #8285: URL: https://github.com/apache/arrow/pull/8285#issuecomment-699689232 Sounds like a good plan @andygrove -- regarding the scheduler I may have time to help out in a few weeks as well as it is directly applicable to what I am working on at work

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8260: ARROW-10084: [Rust] [DataFusion] Added length of LargeStringArray and fixed undefined behavior.

2020-09-27 Thread GitBox
jorgecarleitao commented on a change in pull request #8260: URL: https://github.com/apache/arrow/pull/8260#discussion_r495614311 ## File path: rust/arrow/src/compute/kernels/length.rs ## @@ -17,52 +17,56 @@ //! Defines kernel for length of a string array -use

[GitHub] [arrow] xhochy commented on pull request #8282: ARROW-10104: [Python] Separate tests into its own conda package

2020-09-27 Thread GitBox
xhochy commented on pull request #8282: URL: https://github.com/apache/arrow/pull/8282#issuecomment-699685265 @github-actions crossbow submit -g conda

[GitHub] [arrow] jorgecarleitao commented on pull request #8199: ARROW-10019: [Rust] Add substring kernel

2020-09-27 Thread GitBox
jorgecarleitao commented on pull request #8199: URL: https://github.com/apache/arrow/pull/8199#issuecomment-699684954 @nevi-me , no problem, clippy is important. I fixed the ones that `cargo clippy` showed as a separate (last) commit in this PR.

[GitHub] [arrow] github-actions[bot] commented on pull request #8282: ARROW-10104: [Python] Separate tests into its own conda package

2020-09-27 Thread GitBox
github-actions[bot] commented on pull request #8282: URL: https://github.com/apache/arrow/pull/8282#issuecomment-699681384 Revision: a153f14322045e5a0aa045a2e89273ad93951dbd Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] xhochy commented on pull request #8282: ARROW-10104: [Python] Separate tests into its own conda package

2020-09-27 Thread GitBox
xhochy commented on pull request #8282: URL: https://github.com/apache/arrow/pull/8282#issuecomment-699681148 @github-actions crossbow submit conda-linux-gcc-py36-cpu

[GitHub] [arrow] github-actions[bot] commented on pull request #8282: ARROW-10104: [Python] Separate tests into its own conda package

2020-09-27 Thread GitBox
github-actions[bot] commented on pull request #8282: URL: https://github.com/apache/arrow/pull/8282#issuecomment-699676524 Revision: 4f88f7eae43fe4996eadccdbff7fa101a1ec49e3 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] xhochy commented on pull request #8282: ARROW-10104: [Python] Separate tests into its own conda package

2020-09-27 Thread GitBox
xhochy commented on pull request #8282: URL: https://github.com/apache/arrow/pull/8282#issuecomment-699676345 @github-actions crossbow submit -g conda

[GitHub] [arrow] lidavidm closed pull request #8245: ARROW-10069: [Java] Support running Java benchmarks from command line

2020-09-27 Thread GitBox
lidavidm closed pull request #8245: URL: https://github.com/apache/arrow/pull/8245

[GitHub] [arrow] andygrove closed pull request #8285: ARROW-9754: [Rust] [DataFusion] Implement async in ExecutionPlan trait

2020-09-27 Thread GitBox
andygrove closed pull request #8285: URL: https://github.com/apache/arrow/pull/8285

[GitHub] [arrow] andygrove commented on pull request #8285: ARROW-9754: [Rust] [DataFusion] Implement async in ExecutionPlan trait

2020-09-27 Thread GitBox
andygrove commented on pull request #8285: URL: https://github.com/apache/arrow/pull/8285#issuecomment-699660074 @jorgecarleitao I know how to implement joins, but I am still learning on the scheduler front, so I think it would make more sense to ship join support in 2.0.0 and this may

[GitHub] [arrow] jorgecarleitao commented on pull request #8285: ARROW-9754: [Rust] [DataFusion] Implement async in ExecutionPlan trait

2020-09-27 Thread GitBox
jorgecarleitao commented on pull request #8285: URL: https://github.com/apache/arrow/pull/8285#issuecomment-699659752 I +1 the one that you think you will have the most fun working on :-) If both are equally fun, I would go for the joins, just because feature-wise IMO it is one of

[GitHub] [arrow] kiszk commented on pull request #8245: ARROW-10069: [Java] Support running Java benchmarks from command line

2020-09-27 Thread GitBox
kiszk commented on pull request #8245: URL: https://github.com/apache/arrow/pull/8245#issuecomment-699658104 Thank you. Now, these three parameters work correctly as we expect.

[GitHub] [arrow] jorgecarleitao commented on pull request #8283: ARROW-9707: [Rust] [DataFusion] DataFusion Scheduler Prototype [WIP]

2020-09-27 Thread GitBox
jorgecarleitao commented on pull request #8283: URL: https://github.com/apache/arrow/pull/8283#issuecomment-699658056 It makes sense, @andygrove. Note that I do not disagree with us having a custom scheduler. I was noting that we could separate the two problems: one problem is

[GitHub] [arrow] andygrove commented on pull request #8285: ARROW-9754: [Rust] [DataFusion] Implement async in ExecutionPlan trait

2020-09-27 Thread GitBox
andygrove commented on pull request #8285: URL: https://github.com/apache/arrow/pull/8285#issuecomment-699657851 @alippai @vertexclique @svenwb fyi

[GitHub] [arrow] andygrove commented on pull request #8285: ARROW-9754: [Rust] [DataFusion] Implement async in ExecutionPlan trait

2020-09-27 Thread GitBox
andygrove commented on pull request #8285: URL: https://github.com/apache/arrow/pull/8285#issuecomment-699657737 @alamb @jorgecarleitao I just realized that once this PR is merged, I could go ahead and implement join support because it should be relatively efficient now that MergeExec is

[GitHub] [arrow] andygrove commented on pull request #8283: ARROW-9707: [Rust] [DataFusion] DataFusion Scheduler Prototype [WIP]

2020-09-27 Thread GitBox
andygrove commented on pull request #8283: URL: https://github.com/apache/arrow/pull/8283#issuecomment-69963 @jorgecarleitao Async/await helps a lot but we also need our own scheduler to orchestrate how a query is executed. I am going to write up something more detailed with my

[GitHub] [arrow] andygrove commented on a change in pull request #8285: ARROW-9754: [Rust] [DataFusion] Implement async in ExecutionPlan trait

2020-09-27 Thread GitBox
andygrove commented on a change in pull request #8285: URL: https://github.com/apache/arrow/pull/8285#discussion_r495582607 ## File path: rust/datafusion/benches/sort_limit_query_sql.rs ## @@ -66,21 +66,23 @@ fn create_context() -> ExecutionContext { ) .unwrap(); -

[GitHub] [arrow] andygrove commented on a change in pull request #8285: ARROW-9754: [Rust] [DataFusion] Implement async in ExecutionPlan trait

2020-09-27 Thread GitBox
andygrove commented on a change in pull request #8285: URL: https://github.com/apache/arrow/pull/8285#discussion_r495582141 ## File path: rust/datafusion/benches/sort_limit_query_sql.rs ## @@ -66,21 +66,23 @@ fn create_context() -> ExecutionContext { ) .unwrap(); -

[GitHub] [arrow] github-actions[bot] commented on pull request #8287: ARROW-10111: [Rust] Added new crate with code to consume C Data interface to Rust

2020-09-27 Thread GitBox
github-actions[bot] commented on pull request #8287: URL: https://github.com/apache/arrow/pull/8287#issuecomment-699646874 https://issues.apache.org/jira/browse/ARROW-10111

[GitHub] [arrow] alamb commented on a change in pull request #8285: ARROW-9754: [Rust] [DataFusion] Implement async in ExecutionPlan trait

2020-09-27 Thread GitBox
alamb commented on a change in pull request #8285: URL: https://github.com/apache/arrow/pull/8285#discussion_r495581230 ## File path: rust/datafusion/benches/sort_limit_query_sql.rs ## @@ -66,21 +66,23 @@ fn create_context() -> ExecutionContext { ) .unwrap(); -

[GitHub] [arrow] andygrove commented on pull request #8285: ARROW-9754: [Rust] [DataFusion] Implement async in ExecutionPlan trait

2020-09-27 Thread GitBox
andygrove commented on pull request #8285: URL: https://github.com/apache/arrow/pull/8285#issuecomment-699645055 Thanks @BatmanAoD that was it.

[GitHub] [arrow] github-actions[bot] commented on pull request #8287: [Rust] Added new crate with code to consume C Data interface to Rust

2020-09-27 Thread GitBox
github-actions[bot] commented on pull request #8287: URL: https://github.com/apache/arrow/pull/8287#issuecomment-699644581 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then

[GitHub] [arrow] jorgecarleitao opened a new pull request #8287: ARROW-10110: [Rust] Added new crate with code to consume C Data interface to Rust

2020-09-27 Thread GitBox
jorgecarleitao opened a new pull request #8287: URL: https://github.com/apache/arrow/pull/8287 todo list: * [x] C header to Rust via Rust's bindgen * [x] Basic round-trip using pyarrow API * [ ] Move `ArrowArray` and `ffi` to the main Arrow library, that does not depend on

[GitHub] [arrow] jhorstmann commented on pull request #8280: ARROW-10103: [Rust] Add contains kernel

2020-09-27 Thread GitBox
jhorstmann commented on pull request #8280: URL: https://github.com/apache/arrow/pull/8280#issuecomment-699615864 > Given a list array, return true if a non-null value exists in the array. Intuitively this makes sense, unfortunately the sql rules are slightly more complex and can

[GitHub] [arrow] github-actions[bot] commented on pull request #8286: ARROW-9960: [C++] Enable external material and rotation for encryption keys

2020-09-27 Thread GitBox
github-actions[bot] commented on pull request #8286: URL: https://github.com/apache/arrow/pull/8286#issuecomment-699592528 https://issues.apache.org/jira/browse/ARROW-9960

[GitHub] [arrow] jorgecarleitao edited a comment on pull request #8283: ARROW-9707: [Rust] [DataFusion] DataFusion Scheduler Prototype [WIP]

2020-09-27 Thread GitBox
jorgecarleitao edited a comment on pull request #8283: URL: https://github.com/apache/arrow/pull/8283#issuecomment-699591199 This is super exciting! Thanks a lot @andygrove for pushing this through! I am trying to understand how this is related to other executing architectures in

[GitHub] [arrow] revit13 opened a new pull request #8286: "[C++][Parquet] Enable external material and rotation for encryption keys"

2020-09-27 Thread GitBox
revit13 opened a new pull request #8286: URL: https://github.com/apache/arrow/pull/8286 Work in progress, please do not review yet. This patch depends on the work in https://github.com/apache/arrow/pull/8023.

[GitHub] [arrow] jorgecarleitao commented on pull request #8283: ARROW-9707: [Rust] [DataFusion] DataFusion Scheduler Prototype [WIP]

2020-09-27 Thread GitBox
jorgecarleitao commented on pull request #8283: URL: https://github.com/apache/arrow/pull/8283#issuecomment-699591199 This is super exciting! Thanks a lot @andygrove for pushing this through! I am trying to understand how this is related to other executing architectures in Rust