[GitHub] [arrow] liyafan82 commented on pull request #8210: ARROW-10031: [CI][Java] Support Java benchmark in Ursabot

2020-11-11 Thread GitBox
liyafan82 commented on pull request #8210: URL: https://github.com/apache/arrow/pull/8210#issuecomment-725281667 > At my end, I can generate the following JSON file by `archery benchmark diff --language=java ...` > > @liyafan82 any comments regarding format and parameters are

[GitHub] [arrow] github-actions[bot] commented on pull request #8637: ARROW-10021: [C++][Compute] Return top-n modes in mode kernel

2020-11-11 Thread GitBox
github-actions[bot] commented on pull request #8637: URL: https://github.com/apache/arrow/pull/8637#issuecomment-725320345 https://issues.apache.org/jira/browse/ARROW-10021 This is an automated message from the Apache Git

[GitHub] [arrow] vertexclique commented on pull request #8635: ARROW-10551: [Rust] Fix unreproducible benches

2020-11-11 Thread GitBox
vertexclique commented on pull request #8635: URL: https://github.com/apache/arrow/pull/8635#issuecomment-725349668 Answered

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #8621: ARROW-9128: [C++] Implement string space trimming kernels: trim, ltrim, and rtrim

2020-11-11 Thread GitBox
jorisvandenbossche commented on a change in pull request #8621: URL: https://github.com/apache/arrow/pull/8621#discussion_r521311397 ## File path: cpp/src/arrow/compute/kernels/scalar_string.cc ## @@ -1231,6 +1251,305 @@ Result StrptimeResolve(KernelContext* ctx, const

[GitHub] [arrow] alamb commented on pull request #8635: ARROW-10551: [Rust] Fix unreproducible benches

2020-11-11 Thread GitBox
alamb commented on pull request #8635: URL: https://github.com/apache/arrow/pull/8635#issuecomment-725397069 I am checking / testing this PR out locally.

[GitHub] [arrow] kiszk commented on pull request #8210: ARROW-10031: [CI][Java] Support Java benchmark in Ursabot

2020-11-11 Thread GitBox
kiszk commented on pull request #8210: URL: https://github.com/apache/arrow/pull/8210#issuecomment-725290440 For 1., I will rename the title for cpp and Java. For 2, you are right as sorted [here](https://github.com/apache/arrow/blob/master/dev/archery/archery/cli.py#L595).

[GitHub] [arrow] cyb70289 opened a new pull request #8637: ARROW-10021: [C++][Compute] Return top-n modes in mode kernel

2020-11-11 Thread GitBox
cyb70289 opened a new pull request #8637: URL: https://github.com/apache/arrow/pull/8637 This patch generalizes the mode kernel to return the top-n modes. No performance difference for normal benchmarks. About 20% performance drop for 100% null benchmarks.
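
To illustrate what "top-n modes" means in the description above, here is a minimal Python sketch. It is not the C++ kernel from the pull request; the function name `top_n_modes`, the choice to skip `None` as null, and the tie-breaking rule (smaller value first) are all assumptions made for illustration.

```python
from collections import Counter


def top_n_modes(values, n=1):
    """Return up to n (value, count) pairs, most frequent first.

    Hypothetical helper: None is treated as null and ignored, and ties
    are broken by the smaller value (an assumption, not confirmed above).
    """
    counts = Counter(v for v in values if v is not None)
    # Sort by descending count, then ascending value for deterministic ties.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]
```

With an all-null input the sketch returns an empty list, which is the case the description singles out as the slower benchmark.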

[GitHub] [arrow] pitrou commented on pull request #8626: ARROW-10545: [C++] Fix crash on invalid Parquet file (OSS-Fuzz)

2020-11-11 Thread GitBox
pitrou commented on pull request #8626: URL: https://github.com/apache/arrow/pull/8626#issuecomment-725298582 As I said above, this will break CI until #8617 is merged.

[GitHub] [arrow] maartenbreddels commented on pull request #8621: ARROW-9128: [C++] Implement string space trimming kernels: trim, ltrim, and rtrim

2020-11-11 Thread GitBox
maartenbreddels commented on pull request #8621: URL: https://github.com/apache/arrow/pull/8621#issuecomment-725345905 The `std::vector` was a good idea, and indeed because of its bit usage, the memory usage for Unicode isn't that heavy (most extreme: `0x10 bits = 140kb` in case of a

[GitHub] [arrow] alamb closed pull request #8636: ARROW-10552: [Rust] Removed un-used Result

2020-11-11 Thread GitBox
alamb closed pull request #8636: URL: https://github.com/apache/arrow/pull/8636

[GitHub] [arrow] pitrou commented on a change in pull request #8637: ARROW-10021: [C++][Compute] Return top-n modes in mode kernel

2020-11-11 Thread GitBox
pitrou commented on a change in pull request #8637: URL: https://github.com/apache/arrow/pull/8637#discussion_r521255222 ## File path: docs/source/cpp/compute.rst ## @@ -140,7 +140,7 @@ Aggregations

[GitHub] [arrow] alamb commented on pull request #8635: ARROW-10551: [Rust] Fix unreproducible benches

2020-11-11 Thread GitBox
alamb commented on pull request #8635: URL: https://github.com/apache/arrow/pull/8635#issuecomment-725416573 So my personal suggestion is change all the benches to use `seed_from_u64` or equivalent. I don't think there is any need for a new additional dependency --
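
The suggestion above refers to seeding a Rust random number generator with `seed_from_u64`. The same idea, sketched in Python rather than Rust: build benchmark input from an explicitly seeded generator so that every run sees identical data. The function name `make_bench_data` and the seed value are hypothetical.

```python
import random


def make_bench_data(seed, size):
    """Generate benchmark input from a fixed seed (hypothetical helper)."""
    # A dedicated generator avoids sharing state with the global RNG,
    # so the same seed always yields the same sequence.
    rng = random.Random(seed)
    return [rng.random() for _ in range(size)]


# Two runs with the same seed produce identical input data.
first = make_bench_data(42, 1000)
second = make_bench_data(42, 1000)
```

Fixing the seed only removes variation in the input data; run-to-run timing noise from the machine itself is a separate matter.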

[GitHub] [arrow] pitrou commented on a change in pull request #8542: ARROW-10407: [C++] Add BasicDecimal256 Division Support

2020-11-11 Thread GitBox
pitrou commented on a change in pull request #8542: URL: https://github.com/apache/arrow/pull/8542#discussion_r521349565 ## File path: cpp/src/arrow/util/basic_decimal.cc ## @@ -395,33 +395,49 @@ BasicDecimal128& BasicDecimal128::operator*=(const BasicDecimal128& right) {

[GitHub] [arrow] pitrou commented on a change in pull request #8542: ARROW-10407: [C++] Add BasicDecimal256 Division Support

2020-11-11 Thread GitBox
pitrou commented on a change in pull request #8542: URL: https://github.com/apache/arrow/pull/8542#discussion_r521349186 ## File path: cpp/src/arrow/util/basic_decimal.cc ## @@ -490,49 +527,60 @@ static void FixDivisionSigns(BasicDecimal128* result, BasicDecimal128* remainder

[GitHub] [arrow] vertexclique edited a comment on pull request #8635: ARROW-10551: [Rust] Fix unreproducible benches

2020-11-11 Thread GitBox
vertexclique edited a comment on pull request #8635: URL: https://github.com/apache/arrow/pull/8635#issuecomment-725422469 > If the argument is that two runs of the same benchmark do not agree due to randomness, then I do not understand how having the same seed per thread or different

[GitHub] [arrow] alamb closed pull request #8619: ARROW-10531: [Rust][DataFusion]: Add schema and graphviz formatting for LogicalPlans and a PlanVisitor

2020-11-11 Thread GitBox
alamb closed pull request #8619: URL: https://github.com/apache/arrow/pull/8619

[GitHub] [arrow] maartenbreddels commented on a change in pull request #8621: ARROW-9128: [C++] Implement string space trimming kernels: trim, ltrim, and rtrim

2020-11-11 Thread GitBox
maartenbreddels commented on a change in pull request #8621: URL: https://github.com/apache/arrow/pull/8621#discussion_r521422359 ## File path: cpp/src/arrow/compute/kernels/scalar_string.cc ## @@ -1231,6 +1251,305 @@ Result StrptimeResolve(KernelContext* ctx, const

[GitHub] [arrow] pitrou commented on pull request #8621: ARROW-9128: [C++] Implement string space trimming kernels: trim, ltrim, and rtrim

2020-11-11 Thread GitBox
pitrou commented on pull request #8621: URL: https://github.com/apache/arrow/pull/8621#issuecomment-725493506 > I guess we still need to manually add content to compute.rst? Yes, you do :-)

[GitHub] [arrow] jorgecarleitao edited a comment on pull request #8630: ARROW-10540 [Rust] Improve filtering

2020-11-11 Thread GitBox
jorgecarleitao edited a comment on pull request #8630: URL: https://github.com/apache/arrow/pull/8630#issuecomment-725500367 Thanks a lot, @alamb and @vertexclique . I agree with the naming issues here, and great insight into those crates. I do not have strong feelings about naming; I

[GitHub] [arrow] alamb commented on pull request #8635: ARROW-10551: [Rust] Fix unreproducible benches

2020-11-11 Thread GitBox
alamb commented on pull request #8635: URL: https://github.com/apache/arrow/pull/8635#issuecomment-725414703 In my measurements, the `sum` compute kernels do appear to have significant variability from run to run on my machine (details below). The variability still exists with this PR

[GitHub] [arrow] jorgecarleitao commented on pull request #8635: ARROW-10551: [Rust] Fix unreproducible benches

2020-11-11 Thread GitBox
jorgecarleitao commented on pull request #8635: URL: https://github.com/apache/arrow/pull/8635#issuecomment-725414955 > Currently, the code reseeds after 32 MB of data, and every thread has different randomness. So no data is the same as any other data, and it is totally different in different use

[GitHub] [arrow] alamb commented on pull request #8635: ARROW-10551: [Rust] Fix unreproducible benches

2020-11-11 Thread GitBox
alamb commented on pull request #8635: URL: https://github.com/apache/arrow/pull/8635#issuecomment-725438655 > Can you use https://github.com/vertexclique/zor/blob/master/zor if you are on Linux? I am running on mac osx -- I looked at zor and it looked like a good tool to try and

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8630: ARROW-10540 [Rust] Improve filtering

2020-11-11 Thread GitBox
jorgecarleitao commented on a change in pull request #8630: URL: https://github.com/apache/arrow/pull/8630#discussion_r521407494 ## File path: rust/arrow/src/array/transform/primitive.rs ## @@ -0,0 +1,37 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or

[GitHub] [arrow] jorisvandenbossche opened a new pull request #8638: ARROW-10558: [Python] Fix python S3 filesystem tests interdependence

2020-11-11 Thread GitBox
jorisvandenbossche opened a new pull request #8638: URL: https://github.com/apache/arrow/pull/8638 Follow-up on https://github.com/apache/arrow/pull/8573, where I introduced a test that was only passing because of state from other S3 tests.

[GitHub] [arrow] vertexclique commented on pull request #8635: ARROW-10551: [Rust] Fix unreproducible benches

2020-11-11 Thread GitBox
vertexclique commented on pull request #8635: URL: https://github.com/apache/arrow/pull/8635#issuecomment-725419093 Can you use https://github.com/vertexclique/zor/blob/master/zor if you are on Linux?

[GitHub] [arrow] jorgecarleitao commented on pull request #8630: ARROW-10540 [Rust] Improve filtering

2020-11-11 Thread GitBox
jorgecarleitao commented on pull request #8630: URL: https://github.com/apache/arrow/pull/8630#issuecomment-725500367 Thanks a lot, @alamb and @vertexclique . I agree with the naming issues here, and great insight into those crates. I do not have strong feelings about naming; I tried to

[GitHub] [arrow] vertexclique commented on a change in pull request #8636: ARROW-10552: [Rust] Removed un-used Result

2020-11-11 Thread GitBox
vertexclique commented on a change in pull request #8636: URL: https://github.com/apache/arrow/pull/8636#discussion_r521339244 ## File path: rust/arrow/src/compute/kernels/filter.rs ## @@ -112,7 +112,7 @@ impl<'a> CopyNullBit for NullBitSetter<'a> { } fn

[GitHub] [arrow] jorgecarleitao edited a comment on pull request #8635: ARROW-10551: [Rust] Fix unreproducible benches

2020-11-11 Thread GitBox
jorgecarleitao edited a comment on pull request #8635: URL: https://github.com/apache/arrow/pull/8635#issuecomment-725414955 > Currently, the code reseeds after 32 MB of data, and every thread has different randomness. So no data is the same as any other data, and it is totally different in different

[GitHub] [arrow] alamb commented on a change in pull request #8630: ARROW-10540 [Rust] Improve filtering

2020-11-11 Thread GitBox
alamb commented on a change in pull request #8630: URL: https://github.com/apache/arrow/pull/8630#discussion_r521354110 ## File path: rust/arrow/benches/filter_kernels.rs ## @@ -14,137 +14,136 @@ // KIND, either express or implied. See the License for the // specific

[GitHub] [arrow] alamb commented on pull request #8619: ARROW-10531: [Rust][DataFusion]: Add schema and graphviz formatting for LogicalPlans and a PlanVisitor

2020-11-11 Thread GitBox
alamb commented on pull request #8619: URL: https://github.com/apache/arrow/pull/8619#issuecomment-725433951 Rebased and will merge when it passes CI

[GitHub] [arrow] maartenbreddels commented on pull request #8621: ARROW-9128: [C++] Implement string space trimming kernels: trim, ltrim, and rtrim

2020-11-11 Thread GitBox
maartenbreddels commented on pull request #8621: URL: https://github.com/apache/arrow/pull/8621#issuecomment-725476725 I've opened an issue at https://issues.apache.org/jira/browse/ARROW-10556 I guess we still need to manually add content to compute.rst?

[GitHub] [arrow] alamb commented on a change in pull request #8636: ARROW-10552: [Rust] Removed un-used Result

2020-11-11 Thread GitBox
alamb commented on a change in pull request #8636: URL: https://github.com/apache/arrow/pull/8636#discussion_r521368094 ## File path: rust/arrow/src/compute/kernels/filter.rs ## @@ -112,7 +112,7 @@ impl<'a> CopyNullBit for NullBitSetter<'a> { } fn null_buffer(

[GitHub] [arrow] jhorstmann commented on pull request #8598: ARROW-10500: [Rust] Refactor bit slice, bit view iterator for array buffers

2020-11-11 Thread GitBox
jhorstmann commented on pull request #8598: URL: https://github.com/apache/arrow/pull/8598#issuecomment-725477741 > > I think we should address jhorstmann 's measurements of performance regressions before this PR is merged. > > I measured the performance. upside_down_face It is in

[GitHub] [arrow] vertexclique commented on pull request #8635: ARROW-10551: [Rust] Fix unreproducible benches

2020-11-11 Thread GitBox
vertexclique commented on pull request #8635: URL: https://github.com/apache/arrow/pull/8635#issuecomment-725422469 > If the argument is that two runs of the same benchmark do not agree due to randomness, then I do not understand how having the same seed per thread or different seeds per

[GitHub] [arrow] vertexclique commented on pull request #8630: ARROW-10540 [Rust] Improve filtering

2020-11-11 Thread GitBox
vertexclique commented on pull request #8630: URL: https://github.com/apache/arrow/pull/8630#issuecomment-725436304 ``` Naming: I have seen similar concepts called "Masks" (as they are similar to bit masks) -- so perhaps ArrayDataMask or MaskedArrayData. Or perhaps ArrayRowSet

[GitHub] [arrow] pitrou commented on a change in pull request #8542: ARROW-10407: [C++] Add BasicDecimal256 Division Support

2020-11-11 Thread GitBox
pitrou commented on a change in pull request #8542: URL: https://github.com/apache/arrow/pull/8542#discussion_r521362458 ## File path: cpp/src/arrow/util/basic_decimal.cc ## @@ -549,24 +603,27 @@ static DecimalStatus SingleDivide(const uint32_t* dividend, int64_t dividend_len

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8630: ARROW-10540 [Rust] Improve filtering

2020-11-11 Thread GitBox
jorgecarleitao commented on a change in pull request #8630: URL: https://github.com/apache/arrow/pull/8630#discussion_r521392544 ## File path: rust/arrow/src/array/transform/mod.rs ## @@ -0,0 +1,532 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more

[GitHub] [arrow] pitrou commented on a change in pull request #8612: ARROW-8199: [C++] Add support for multi-column sort indices on Table

2020-11-11 Thread GitBox
pitrou commented on a change in pull request #8612: URL: https://github.com/apache/arrow/pull/8612#discussion_r521405090 ## File path: cpp/src/arrow/compute/api_vector.h ## @@ -58,6 +58,34 @@ struct ARROW_EXPORT TakeOptions : public FunctionOptions { static TakeOptions

[GitHub] [arrow] alamb opened a new pull request #8639: ARROW-10559: [Rust][DataFusion] Split up logical_plan/mod.rs into sub modules

2020-11-11 Thread GitBox
alamb opened a new pull request #8639: URL: https://github.com/apache/arrow/pull/8639 The module has gotten fairly large and so refactoring it into smaller chunks will improve readability – as suggested by Jorge https://github.com/apache/arrow/pull/8619#pullrequestreview-527391221

[GitHub] [arrow] pitrou commented on a change in pull request #8542: ARROW-10407: [C++] Add BasicDecimal256 Division Support

2020-11-11 Thread GitBox
pitrou commented on a change in pull request #8542: URL: https://github.com/apache/arrow/pull/8542#discussion_r521350486 ## File path: cpp/src/arrow/util/basic_decimal.cc ## @@ -490,49 +527,60 @@ static void FixDivisionSigns(BasicDecimal128* result, BasicDecimal128* remainder

[GitHub] [arrow] alamb commented on a change in pull request #8630: ARROW-10540 [Rust] Improve filtering

2020-11-11 Thread GitBox
alamb commented on a change in pull request #8630: URL: https://github.com/apache/arrow/pull/8630#discussion_r521362589 ## File path: rust/arrow/src/array/transform/mod.rs ## @@ -0,0 +1,532 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more

[GitHub] [arrow] alamb commented on pull request #8619: ARROW-10531: [Rust][DataFusion]: Add schema and graphviz formatting for LogicalPlans and a PlanVisitor

2020-11-11 Thread GitBox
alamb commented on pull request #8619: URL: https://github.com/apache/arrow/pull/8619#issuecomment-725433229 Thanks @andygrove and @jorgecarleitao -- I plan to merge this PR and then make a new one to break the code into smaller modules

[GitHub] [arrow] vertexclique commented on pull request #8635: ARROW-10551: [Rust] Fix unreproducible benches

2020-11-11 Thread GitBox
vertexclique commented on pull request #8635: URL: https://github.com/apache/arrow/pull/8635#issuecomment-725439351 > I wonder if you can try using the seeded random approach with zor to see if that is as reproducible. will do. > That looks cool, but I am not enough of a

[GitHub] [arrow] github-actions[bot] commented on pull request #8638: ARROW-10558: [Python] Fix python S3 filesystem tests interdependence

2020-11-11 Thread GitBox
github-actions[bot] commented on pull request #8638: URL: https://github.com/apache/arrow/pull/8638#issuecomment-725498272 https://issues.apache.org/jira/browse/ARROW-10558

[GitHub] [arrow] alamb commented on a change in pull request #8639: ARROW-10559: [Rust][DataFusion] Split up logical_plan/mod.rs into sub modules

2020-11-11 Thread GitBox
alamb commented on a change in pull request #8639: URL: https://github.com/apache/arrow/pull/8639#discussion_r521484903 ## File path: rust/datafusion/src/logical_plan/mod.rs ## @@ -1264,31 +1050,6 @@ mod tests { Ok(()) } -#[test] -fn test_visitor() { -

[GitHub] [arrow] sweb opened a new pull request #8640: WIP: ARROW-4193: [Rust] Add support for decimal data type

2020-11-11 Thread GitBox
sweb opened a new pull request #8640: URL: https://github.com/apache/arrow/pull/8640 This is very much work in progress. The idea is to implement `DecimalArray` based on the existing `FixedSizeBinaryArray`. At the current state this is mostly C The values are returned as `i128`.

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8639: ARROW-10559: [Rust][DataFusion] Split up logical_plan/mod.rs into sub modules

2020-11-11 Thread GitBox
jorgecarleitao commented on a change in pull request #8639: URL: https://github.com/apache/arrow/pull/8639#discussion_r521494620 ## File path: rust/datafusion/src/logical_plan/mod.rs ## @@ -21,2300 +21,21 @@ //! Logical query plans can then be optimized and executed directly,

[GitHub] [arrow] kiszk commented on a change in pull request #7507: ARROW-8797: [C++] Read RecordBatch in a different endian

2020-11-11 Thread GitBox
kiszk commented on a change in pull request #7507: URL: https://github.com/apache/arrow/pull/7507#discussion_r521518379 ## File path: cpp/src/arrow/type.h ## @@ -1604,13 +1605,26 @@ class ARROW_EXPORT FieldRef { //

[GitHub] [arrow] lidavidm edited a comment on pull request #8585: ARROW-10475: [C++][FlightRPC] handle IPv6 hosts

2020-11-11 Thread GitBox
lidavidm edited a comment on pull request #8585: URL: https://github.com/apache/arrow/pull/8585#issuecomment-725554055 I've added a method on the Uri class now to format the host, though the implementation is essentially the naive one still (if IPv6, add brackets; else pass through the
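
The "naive" host formatting described above (bracket IPv6 literals, pass everything else through) can be sketched in Python as follows. This is not the C++ `Uri` method from the pull request; the function name `format_host` is hypothetical, and IPv6 detection via the standard `ipaddress` module is an assumption made for the example.

```python
import ipaddress


def format_host(host):
    """Wrap IPv6 literals in brackets; pass other hosts through unchanged."""
    try:
        is_v6 = isinstance(ipaddress.ip_address(host), ipaddress.IPv6Address)
    except ValueError:
        # Not an IP literal (for example a hostname), so leave it alone.
        is_v6 = False
    return f"[{host}]" if is_v6 else host


# The bracketed form is what a "host:port" authority needs for IPv6.
authority = f"{format_host('::1')}:8080"
```

Brackets are needed because an IPv6 literal contains colons, which would otherwise be ambiguous with the port separator.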

[GitHub] [arrow] lidavidm commented on pull request #8585: ARROW-10475: [C++][FlightRPC] handle IPv6 hosts

2020-11-11 Thread GitBox
lidavidm commented on pull request #8585: URL: https://github.com/apache/arrow/pull/8585#issuecomment-725554055 I've added a method on the Uri class now to format the host, though the implementation is essentially the naive one still.

[GitHub] [arrow] pitrou opened a new pull request #8642: ARROW-6071: [C++] Generic binary-to-binary casts

2020-11-11 Thread GitBox
pitrou opened a new pull request #8642: URL: https://github.com/apache/arrow/pull/8642 Implement all flavours of binary-to-binary casting, except for fixed-size binary.

[GitHub] [arrow] github-actions[bot] commented on pull request #8642: ARROW-6071: [C++] Generic binary-to-binary casts

2020-11-11 Thread GitBox
github-actions[bot] commented on pull request #8642: URL: https://github.com/apache/arrow/pull/8642#issuecomment-725595691 https://issues.apache.org/jira/browse/ARROW-6071

[GitHub] [arrow] bkietz closed pull request #8617: ARROW-10525: [C++] Fix crash on unsupported IPC stream

2020-11-11 Thread GitBox
bkietz closed pull request #8617: URL: https://github.com/apache/arrow/pull/8617

[GitHub] [arrow] github-actions[bot] commented on pull request #8256: ARROW-9001: [R] Box outputs as correct type in call_function

2020-11-11 Thread GitBox
github-actions[bot] commented on pull request #8256: URL: https://github.com/apache/arrow/pull/8256#issuecomment-725640851 Revision: 2c3dd4279cd5fd9d403bc8aa1f4e0ec3323d081c Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] alamb commented on a change in pull request #8639: ARROW-10559: [Rust][DataFusion] Split up logical_plan/mod.rs into sub modules

2020-11-11 Thread GitBox
alamb commented on a change in pull request #8639: URL: https://github.com/apache/arrow/pull/8639#discussion_r521488424 ## File path: rust/datafusion/src/logical_plan/mod.rs ## @@ -21,2300 +21,21 @@ //! Logical query plans can then be optimized and executed directly, or

[GitHub] [arrow] kiszk commented on a change in pull request #7507: ARROW-8797: [C++] Read RecordBatch in a different endian

2020-11-11 Thread GitBox
kiszk commented on a change in pull request #7507: URL: https://github.com/apache/arrow/pull/7507#discussion_r521499870 ## File path: cpp/src/arrow/ipc/reader.cc ## @@ -664,14 +690,15 @@ Result> ReadRecordBatch( std::shared_ptr out_schema; // Empty means do not use

[GitHub] [arrow] alamb commented on a change in pull request #8639: ARROW-10559: [Rust][DataFusion] Split up logical_plan/mod.rs into sub modules

2020-11-11 Thread GitBox
alamb commented on a change in pull request #8639: URL: https://github.com/apache/arrow/pull/8639#discussion_r521553659 ## File path: rust/datafusion/src/logical_plan/mod.rs ## @@ -21,2300 +21,21 @@ //! Logical query plans can then be optimized and executed directly, or

[GitHub] [arrow] github-actions[bot] commented on pull request #8256: ARROW-9001: [R] Box outputs as correct type in call_function

2020-11-11 Thread GitBox
github-actions[bot] commented on pull request #8256: URL: https://github.com/apache/arrow/pull/8256#issuecomment-725616996 Revision: 2c3dd4279cd5fd9d403bc8aa1f4e0ec3323d081c Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] bkietz commented on pull request #8256: ARROW-9001: [R] Box outputs as correct type in call_function

2020-11-11 Thread GitBox
bkietz commented on pull request #8256: URL: https://github.com/apache/arrow/pull/8256#issuecomment-725571750 @nealrichardson @romainfrancois I've rebased and - moved forward declarations to c++ headers, so it's no longer necessary to forward declare things in arrow_exports.h -

[GitHub] [arrow] github-actions[bot] commented on pull request #8641: ARROW-8853: [Rust] [Integration Testing] Enable Flight tests

2020-11-11 Thread GitBox
github-actions[bot] commented on pull request #8641: URL: https://github.com/apache/arrow/pull/8641#issuecomment-725588664 https://issues.apache.org/jira/browse/ARROW-8853

[GitHub] [arrow] pitrou commented on a change in pull request #8642: ARROW-6071: [C++] Generic binary-to-binary casts

2020-11-11 Thread GitBox
pitrou commented on a change in pull request #8642: URL: https://github.com/apache/arrow/pull/8642#discussion_r521577628 ## File path: cpp/src/arrow/compute/kernels/scalar_cast_string.cc ## @@ -92,12 +93,74 @@ struct Utf8Validator { }; template -struct

[GitHub] [arrow] bkietz commented on a change in pull request #8642: ARROW-6071: [C++] Generic binary-to-binary casts

2020-11-11 Thread GitBox
bkietz commented on a change in pull request #8642: URL: https://github.com/apache/arrow/pull/8642#discussion_r521577222 ## File path: cpp/src/arrow/compute/kernels/scalar_cast_string.cc ## @@ -92,12 +93,74 @@ struct Utf8Validator { }; template -struct

[GitHub] [arrow] pitrou commented on a change in pull request #8642: ARROW-6071: [C++] Generic binary-to-binary casts

2020-11-11 Thread GitBox
pitrou commented on a change in pull request #8642: URL: https://github.com/apache/arrow/pull/8642#discussion_r521584030 ## File path: cpp/src/arrow/compute/kernels/scalar_cast_string.cc ## @@ -92,12 +93,74 @@ struct Utf8Validator { }; template -struct

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8639: ARROW-10559: [Rust][DataFusion] Split up logical_plan/mod.rs into sub modules

2020-11-11 Thread GitBox
jorgecarleitao commented on a change in pull request #8639: URL: https://github.com/apache/arrow/pull/8639#discussion_r521475124 ## File path: rust/datafusion/src/logical_plan/mod.rs ## @@ -1264,31 +1050,6 @@ mod tests { Ok(()) } -#[test] -fn

[GitHub] [arrow] yordan-pavlov commented on a change in pull request #8630: ARROW-10540 [Rust] Improve filtering

2020-11-11 Thread GitBox
yordan-pavlov commented on a change in pull request #8630: URL: https://github.com/apache/arrow/pull/8630#discussion_r521574396 ## File path: rust/arrow/benches/filter_kernels.rs ## @@ -14,137 +14,136 @@ // KIND, either express or implied. See the License for the //

[GitHub] [arrow] bkietz commented on pull request #8256: ARROW-9001: [R] Box outputs as correct type in call_function

2020-11-11 Thread GitBox
bkietz commented on pull request #8256: URL: https://github.com/apache/arrow/pull/8256#issuecomment-725615670 [ASAN UBSAN CI failure](https://github.com/apache/arrow/pull/8256/checks?check_run_id=1386783871#step:8:2353) is ARROW-10525

[GitHub] [arrow] nealrichardson commented on pull request #8256: ARROW-9001: [R] Box outputs as correct type in call_function

2020-11-11 Thread GitBox
nealrichardson commented on pull request #8256: URL: https://github.com/apache/arrow/pull/8256#issuecomment-725616136 @github-actions crossbow submit -g r

[GitHub] [arrow] yordan-pavlov commented on a change in pull request #8630: ARROW-10540 [Rust] Improve filtering

2020-11-11 Thread GitBox
yordan-pavlov commented on a change in pull request #8630: URL: https://github.com/apache/arrow/pull/8630#discussion_r521588710 ## File path: rust/arrow/benches/filter_kernels.rs ## @@ -14,137 +14,136 @@ // KIND, either express or implied. See the License for the //

[GitHub] [arrow] alamb commented on a change in pull request #8639: ARROW-10559: [Rust][DataFusion] Split up logical_plan/mod.rs into sub modules

2020-11-11 Thread GitBox
alamb commented on a change in pull request #8639: URL: https://github.com/apache/arrow/pull/8639#discussion_r521467373 ## File path: rust/datafusion/src/logical_plan/mod.rs ## @@ -21,2300 +21,21 @@ //! Logical query plans can then be optimized and executed directly, or

[GitHub] [arrow] alamb commented on pull request #8619: ARROW-10531: [Rust][DataFusion]: Add schema and graphviz formatting for LogicalPlans and a PlanVisitor

2020-11-11 Thread GitBox
alamb commented on pull request #8619: URL: https://github.com/apache/arrow/pull/8619#issuecomment-725513681 https://github.com/apache/arrow/pull/8639 is the PR for breaking up the logical_plan module

[GitHub] [arrow] kiszk commented on a change in pull request #7507: ARROW-8797: [C++] Read RecordBatch in a different endian

2020-11-11 Thread GitBox
kiszk commented on a change in pull request #7507: URL: https://github.com/apache/arrow/pull/7507#discussion_r521501051 ## File path: cpp/src/arrow/ipc/reader.cc ## @@ -699,42 +726,45 @@ Status ReadDictionary(const Buffer& metadata, DictionaryMemo* dictionary_memo, //

[GitHub] [arrow] alamb commented on a change in pull request #8639: ARROW-10559: [Rust][DataFusion] Split up logical_plan/mod.rs into sub modules

2020-11-11 Thread GitBox
alamb commented on a change in pull request #8639: URL: https://github.com/apache/arrow/pull/8639#discussion_r521541312 ## File path: rust/datafusion/src/logical_plan/mod.rs ## @@ -21,2300 +21,21 @@ //! Logical query plans can then be optimized and executed directly, or

[GitHub] [arrow] nevi-me commented on a change in pull request #8641: ARROW-8853: [Rust] [Integration Testing] Enable Flight tests

2020-11-11 Thread GitBox
nevi-me commented on a change in pull request #8641: URL: https://github.com/apache/arrow/pull/8641#discussion_r521574538 ## File path: rust/integration-testing/src/bin/flight-test-integration-client.rs ## @@ -0,0 +1,377 @@ +// Licensed to the Apache Software Foundation (ASF)

[GitHub] [arrow] vivkong commented on pull request #8011: ARROW-9803: [Go] Add initial support for s390x

2020-11-11 Thread GitBox
vivkong commented on pull request #8011: URL: https://github.com/apache/arrow/pull/8011#issuecomment-725525017 Hello @emkornfield, wondering if this can be considered for merging? I've updated the PR to use a constant to check for endianness. Thanks.

[GitHub] [arrow] github-actions[bot] commented on pull request #8639: ARROW-10559: [Rust][DataFusion] Split up logical_plan/mod.rs into sub modules

2020-11-11 Thread GitBox
github-actions[bot] commented on pull request #8639: URL: https://github.com/apache/arrow/pull/8639#issuecomment-725524192 https://issues.apache.org/jira/browse/ARROW-10559 This is an automated message from the Apache Git

[GitHub] [arrow] kiszk commented on a change in pull request #7507: ARROW-8797: [C++] Read RecordBatch in a different endian

2020-11-11 Thread GitBox
kiszk commented on a change in pull request #7507: URL: https://github.com/apache/arrow/pull/7507#discussion_r521499370 ## File path: cpp/src/arrow/ipc/reader.cc ## @@ -107,6 +108,23 @@ Status InvalidMessageType(MessageType expected, MessageType actual) { //

[GitHub] [arrow] github-actions[bot] commented on pull request #8640: WIP: ARROW-4193: [Rust] Add support for decimal data type

2020-11-11 Thread GitBox
github-actions[bot] commented on pull request #8640: URL: https://github.com/apache/arrow/pull/8640#issuecomment-725542282 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then

[GitHub] [arrow] carols10cents opened a new pull request #8641: ARROW-8853: [Rust] [Integration Testing] Enable Flight tests

2020-11-11 Thread GitBox
carols10cents opened a new pull request #8641: URL: https://github.com/apache/arrow/pull/8641 There are some parts of the tests that should probably be part of the `arrow-flight` crate instead, but that can be done in a future PR. Everything here is in the integration tests.

[GitHub] [arrow] bkietz commented on a change in pull request #8642: ARROW-6071: [C++] Generic binary-to-binary casts

2020-11-11 Thread GitBox
bkietz commented on a change in pull request #8642: URL: https://github.com/apache/arrow/pull/8642#discussion_r521583351 ## File path: cpp/src/arrow/compute/kernels/scalar_cast_string.cc ## @@ -92,12 +93,74 @@ struct Utf8Validator { }; template -struct

[GitHub] [arrow] yordan-pavlov commented on pull request #8630: ARROW-10540 [Rust] Improve filtering

2020-11-11 Thread GitBox
yordan-pavlov commented on pull request #8630: URL: https://github.com/apache/arrow/pull/8630#issuecomment-725613377 @jorgecarleitao thank you for this PR; overall I think it's a great idea to reuse the code between the take and filter kernels if possible - and you have demonstrated how

[GitHub] [arrow] nealrichardson commented on pull request #8256: ARROW-9001: [R] Box outputs as correct type in call_function

2020-11-11 Thread GitBox
nealrichardson commented on pull request #8256: URL: https://github.com/apache/arrow/pull/8256#issuecomment-725639997 @github-actions crossbow submit test-r-linux-as-cran This is an automated message from the Apache Git

[GitHub] [arrow] kiszk commented on a change in pull request #7507: ARROW-8797: [C++] Read RecordBatch in a different endian

2020-11-11 Thread GitBox
kiszk commented on a change in pull request #7507: URL: https://github.com/apache/arrow/pull/7507#discussion_r521553432 ## File path: cpp/src/arrow/ipc/reader.cc ## @@ -699,42 +726,45 @@ Status ReadDictionary(const Buffer& metadata, DictionaryMemo* dictionary_memo, //

[GitHub] [arrow] yordan-pavlov commented on a change in pull request #8630: ARROW-10540 [Rust] Improve filtering

2020-11-11 Thread GitBox
yordan-pavlov commented on a change in pull request #8630: URL: https://github.com/apache/arrow/pull/8630#discussion_r521574396 ## File path: rust/arrow/benches/filter_kernels.rs ## @@ -14,137 +14,136 @@ // KIND, either express or implied. See the License for the //

[GitHub] [arrow] carols10cents commented on a change in pull request #8641: ARROW-8853: [Rust] [Integration Testing] Enable Flight tests

2020-11-11 Thread GitBox
carols10cents commented on a change in pull request #8641: URL: https://github.com/apache/arrow/pull/8641#discussion_r521618644 ## File path: rust/integration-testing/src/bin/flight-test-integration-server.rs ## @@ -0,0 +1,634 @@ +// Licensed to the Apache Software Foundation

[GitHub] [arrow] nealrichardson closed pull request #8256: ARROW-9001: [R] Box outputs as correct type in call_function

2020-11-11 Thread GitBox
nealrichardson closed pull request #8256: URL: https://github.com/apache/arrow/pull/8256 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [arrow] a-campbell opened a new issue #8646: Predicate pushdown question

2020-11-11 Thread GitBox
a-campbell opened a new issue #8646: URL: https://github.com/apache/arrow/issues/8646 Hi Arrow community, I'm new to the project and am trying to understand exactly what is happening under the hood when I run a filter-collect query on an Arrow Dataset (backed by Parquet).

[GitHub] [arrow] wesm commented on pull request #8461: ARROW-10197: [python][Gandiva] Execute expression on filtered data

2020-11-11 Thread GitBox
wesm commented on pull request #8461: URL: https://github.com/apache/arrow/pull/8461#issuecomment-725667526 @pitrou or @bkietz could one of you have a look? This is an automated message from the Apache Git Service. To

[GitHub] [arrow] nealrichardson commented on a change in pull request #8579: ARROW-10481: [R] Bindings to add, remove, replace Table columns

2020-11-11 Thread GitBox
nealrichardson commented on a change in pull request #8579: URL: https://github.com/apache/arrow/pull/8579#discussion_r521631559 ## File path: r/R/table.R ## @@ -254,6 +257,68 @@ names.Table <- function(x) x$ColumnNames() #' @export `[[.Table` <- `[[.RecordBatch` +#'

[GitHub] [arrow] bkietz closed pull request #8642: ARROW-6071: [C++] Generic binary-to-binary casts

2020-11-11 Thread GitBox
bkietz closed pull request #8642: URL: https://github.com/apache/arrow/pull/8642 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [arrow] nealrichardson closed pull request #8616: ARROW-10522: [R] Allow rename Table and RecordBatch columns with names()

2020-11-11 Thread GitBox
nealrichardson closed pull request #8616: URL: https://github.com/apache/arrow/pull/8616 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [arrow] kou closed pull request #8011: ARROW-9803: [Go] Add initial support for s390x

2020-11-11 Thread GitBox
kou closed pull request #8011: URL: https://github.com/apache/arrow/pull/8011 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [arrow] nealrichardson closed pull request #8643: ARROW-10522: [R] Allow rename Table and RecordBatch columns with names()

2020-11-11 Thread GitBox
nealrichardson closed pull request #8643: URL: https://github.com/apache/arrow/pull/8643 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [arrow] jorgecarleitao opened a new pull request #8645: ARROW-10561: [Rust] Small simplification of `write` and `write_bytes`

2020-11-11 Thread GitBox
jorgecarleitao opened a new pull request #8645: URL: https://github.com/apache/arrow/pull/8645 This PR addresses 3 small issues on `MutableBuffer`: 1. `write_bytes` is incorrect, as it double-increments `len`: the length is incremented both on `self.write` and also by `write_bytes`
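
To illustrate the double-increment described in item 1, here is a minimal sketch. It assumes a toy buffer type: `ToyBuffer`, its fields, and the `write_bytes_buggy` / `write_bytes_fixed` names are hypothetical and are not Arrow's actual `MutableBuffer` API or the PR's code. It only shows how a wrapper that bumps `len` after calling a `write` that already bumps `len` ends up counting each byte twice.

```rust
/// Toy stand-in for a growable byte buffer (hypothetical; NOT Arrow's `MutableBuffer`).
struct ToyBuffer {
    data: Vec<u8>,
    len: usize,
}

impl ToyBuffer {
    fn new() -> Self {
        ToyBuffer { data: Vec::new(), len: 0 }
    }

    /// Appends `bytes` and advances `len` by the number of bytes written.
    fn write(&mut self, bytes: &[u8]) {
        self.data.extend_from_slice(bytes);
        self.len += bytes.len();
    }

    /// Buggy variant: `write` has already advanced `len`,
    /// so adding the byte count again double-counts.
    fn write_bytes_buggy(&mut self, bytes: &[u8]) {
        self.write(bytes);
        self.len += bytes.len();
    }

    /// Fixed variant: delegate to `write` and leave `len` alone.
    fn write_bytes_fixed(&mut self, bytes: &[u8]) {
        self.write(bytes);
    }
}
```

In this sketch, writing three bytes through the buggy variant leaves `len` at 6 while only 3 bytes are stored; the fixed variant keeps `len` equal to the stored byte count.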

[GitHub] [arrow] Bei-z commented on a change in pull request #8542: ARROW-10407: [C++] Add BasicDecimal256 Division Support

2020-11-11 Thread GitBox
Bei-z commented on a change in pull request #8542: URL: https://github.com/apache/arrow/pull/8542#discussion_r521719735 ## File path: cpp/src/arrow/util/basic_decimal.cc ## @@ -395,33 +395,49 @@ BasicDecimal128& BasicDecimal128::operator*=(const BasicDecimal128& right) {

[GitHub] [arrow] Bei-z commented on a change in pull request #8542: ARROW-10407: [C++] Add BasicDecimal256 Division Support

2020-11-11 Thread GitBox
Bei-z commented on a change in pull request #8542: URL: https://github.com/apache/arrow/pull/8542#discussion_r521719841 ## File path: cpp/src/arrow/util/basic_decimal.cc ## @@ -490,49 +527,60 @@ static void FixDivisionSigns(BasicDecimal128* result, BasicDecimal128* remainder
