[GitHub] [arrow-rs] jorgecarleitao commented on issue #455: Not able to read nano-second timestamp columns in 1.0 parquet files written by pyarrow

2021-06-14 Thread GitBox
jorgecarleitao commented on issue #455: URL: https://github.com/apache/arrow-rs/issues/455#issuecomment-861181763 This may be related to which priority we give when converting, the parquet schema or the arrow schema. I would expect pyarrow to write them in a consistent manner though, so,

[GitHub] [arrow] frmnboi commented on issue #10488: Passing back and forth from Python and C++ with Pyarrow C++ extension and pybind11.

2021-06-14 Thread GitBox
frmnboi commented on issue #10488: URL: https://github.com/apache/arrow/issues/10488#issuecomment-861180963 After running Python with debug symbols in GDB, here is the relevant part of the GDB backtrace with directory names redacted: ``` #0 __memmove_avx_unaligned_erms () at

[GitHub] [arrow] BryanCutler commented on pull request #10513: ARROW-13044: [Java] Change UnionVector and DenseUnionVector to extend AbstractContainerVector

2021-06-14 Thread GitBox
BryanCutler commented on pull request #10513: URL: https://github.com/apache/arrow/pull/10513#issuecomment-861157094 merged to master -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [arrow] BryanCutler closed pull request #10513: ARROW-13044: [Java] Change UnionVector and DenseUnionVector to extend AbstractContainerVector

2021-06-14 Thread GitBox
BryanCutler closed pull request #10513: URL: https://github.com/apache/arrow/pull/10513 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [arrow] bhargav-inthezone commented on issue #10502: AttributeError: module 'pyarrow.lib' has no attribute '_Weakrefable'

2021-06-14 Thread GitBox
bhargav-inthezone commented on issue #10502: URL: https://github.com/apache/arrow/issues/10502#issuecomment-861147363 Previously I had to uninstall pyarrow before installing Vaex, but I think Kaggle fixed the problem. Now it's working with just Vaex installed in the notebook -- This is an

[GitHub] [arrow-datafusion] Jimexist opened a new issue #562: publish datafusion-cli to brew repo so that macOS users can install easily

2021-06-14 Thread GitBox
Jimexist opened a new issue #562: URL: https://github.com/apache/arrow-datafusion/issues/562 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** A clear and concise description of what the problem is. Ex. I'm always frustrated when

[GitHub] [arrow-rs] emkornfield commented on issue #455: Not able to read nano-second timestamp columns in 1.0 parquet files written by pyarrow

2021-06-14 Thread GitBox
emkornfield commented on issue #455: URL: https://github.com/apache/arrow-rs/issues/455#issuecomment-861130175 > @emkornfield do you know what we should do on the Rust side to roundtrip the file correctly? Sorry, I would have to dig in the code to have a better understanding. In

[GitHub] [arrow] cyb70289 commented on a change in pull request #10530: ARROW-13072: [C++] Add bit-wise arithmetic kernels

2021-06-14 Thread GitBox
cyb70289 commented on a change in pull request #10530: URL: https://github.com/apache/arrow/pull/10530#discussion_r651405645 ## File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc ## @@ -397,6 +397,130 @@ struct PowerChecked { } }; +// Bitwise operations +

[GitHub] [arrow] p5a0u9l opened a new issue #10531: How to resolve `pyarrow.deserialize` FutureWarning

2021-06-14 Thread GitBox
p5a0u9l opened a new issue #10531: URL: https://github.com/apache/arrow/issues/10531 I'd like to resolve this warning ``` /usr/lib/python3.7/asyncio/events.py:88: FutureWarning: 'pyarrow.deserialize' is deprecated as of 2.0.0 and will be removed in a future version. Use pickle or
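The warning quoted in this issue names pickle as one replacement for `pyarrow.deserialize`. A minimal sketch of that route, using only the standard library, might look like the following; the `payload` object is a hypothetical stand-in, since the issue preview does not show what was actually being serialized.

```python
import pickle

# Hypothetical payload; the issue preview does not show the real object.
payload = {"ids": [1, 2, 3], "label": "example"}

# Serialize to bytes with the newest protocol this interpreter supports.
blob = pickle.dumps(payload, protocol=pickle.HIGHEST_PROTOCOL)

# Deserialize from bytes; this stands in for the deprecated
# pyarrow.deserialize call that triggers the FutureWarning.
restored = pickle.loads(blob)
```

Note that pickle should only be used on data from a trusted source, because unpickling can execute arbitrary code.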

[GitHub] [arrow-datafusion] Jimexist commented on pull request #429: implement lead and lag built-in window function

2021-06-14 Thread GitBox
Jimexist commented on pull request #429: URL: https://github.com/apache/arrow-datafusion/pull/429#issuecomment-861103557 putting this back to draft as this relies on https://github.com/apache/arrow-rs/pull/388 which is not yet in arrow 4.3 -- This is an automated message from the Apache

[GitHub] [arrow-datafusion] Jimexist edited a comment on pull request #520: Implement window functions with `order_by` clause

2021-06-14 Thread GitBox
Jimexist edited a comment on pull request #520: URL: https://github.com/apache/arrow-datafusion/pull/520#issuecomment-861088442 > This looks very nice @Jimexist -- I went over the code and saw only goodness :) > > All that this PR needs to be mergeable in my opinion is to reset the

[GitHub] [arrow-datafusion] Jimexist commented on pull request #520: Implement window functions with `order_by` clause

2021-06-14 Thread GitBox
Jimexist commented on pull request #520: URL: https://github.com/apache/arrow-datafusion/pull/520#issuecomment-861088442 > This looks very nice @Jimexist -- I went over the code and saw only goodness :) > > All that this PR needs to be mergeable in my opinion is to reset the Cargo

[GitHub] [arrow-datafusion] codecov-commenter edited a comment on pull request #558: Implement window functions with `partition_by` clause

2021-06-14 Thread GitBox
codecov-commenter edited a comment on pull request #558: URL: https://github.com/apache/arrow-datafusion/pull/558#issuecomment-860569767 #

[GitHub] [arrow-rs] Jimexist commented on pull request #457: Add sort boolean benchmark

2021-06-14 Thread GitBox
Jimexist commented on pull request #457: URL: https://github.com/apache/arrow-rs/pull/457#issuecomment-861078612 was wondering what if you increase it to say 2^16? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [arrow-rs] Jimexist commented on pull request #448: Use partition for bool sort

2021-06-14 Thread GitBox
Jimexist commented on pull request #448: URL: https://github.com/apache/arrow-rs/pull/448#issuecomment-861078094 i guess 1024 is too short an array so it's dominated by the memory allocation rather than sorting -- This is an automated message from the Apache Git Service. To respond to

[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #520: Implement window functions with `order_by` clause

2021-06-14 Thread GitBox
Jimexist commented on a change in pull request #520: URL: https://github.com/apache/arrow-datafusion/pull/520#discussion_r651361200 ## File path: datafusion/src/physical_plan/windows.rs ## @@ -156,31 +162,72 @@ impl WindowExpr for BuiltInWindowExpr {

[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #520: Implement window functions with `order_by` clause

2021-06-14 Thread GitBox
Jimexist commented on a change in pull request #520: URL: https://github.com/apache/arrow-datafusion/pull/520#discussion_r651360294 ## File path: datafusion/src/physical_plan/expressions/nth_value.rs ## @@ -113,54 +111,32 @@ impl BuiltInWindowFunctionExpr for NthValue {

[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #520: Implement window functions with `order_by` clause

2021-06-14 Thread GitBox
Jimexist commented on a change in pull request #520: URL: https://github.com/apache/arrow-datafusion/pull/520#discussion_r651358846 ## File path: datafusion/src/physical_plan/windows.rs ## @@ -156,31 +162,72 @@ impl WindowExpr for BuiltInWindowExpr {

[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #520: Implement window functions with `order_by` clause

2021-06-14 Thread GitBox
Jimexist commented on a change in pull request #520: URL: https://github.com/apache/arrow-datafusion/pull/520#discussion_r651356971 ## File path: datafusion/src/physical_plan/expressions/nth_value.rs ## @@ -113,54 +111,32 @@ impl BuiltInWindowFunctionExpr for NthValue {

[GitHub] [arrow] ianmcook commented on pull request #10520: ARROW-12709: [C++] Add binary_join_element_wise

2021-06-14 Thread GitBox
ianmcook commented on pull request #10520: URL: https://github.com/apache/arrow/pull/10520#issuecomment-861046053 Oh, I see; I was confused about what happens when the separator is an array; I see now. Thank you! -- This is an automated message from the Apache Git Service. To respond to

[GitHub] [arrow] lidavidm edited a comment on pull request #10520: ARROW-12709: [C++] Add binary_join_element_wise

2021-06-14 Thread GitBox
lidavidm edited a comment on pull request #10520: URL: https://github.com/apache/arrow/pull/10520#issuecomment-861035926 The null handling behavior never affects the separator: it only describes how to handle the other values. It's intended to let you mimic libcudf. ```

[GitHub] [arrow] lidavidm commented on pull request #10520: ARROW-12709: [C++] Add binary_join_element_wise

2021-06-14 Thread GitBox
lidavidm commented on pull request #10520: URL: https://github.com/apache/arrow/pull/10520#issuecomment-861035926 The null handling behavior never affects the separator: it only describes how to handle the other values. It's intended to let you mimic libcudf. ```

[GitHub] [arrow] rok commented on a change in pull request #10476: ARROW-12499: [C++][Compute] Add ScalarAggregateOptions to Any and All kernels

2021-06-14 Thread GitBox
rok commented on a change in pull request #10476: URL: https://github.com/apache/arrow/pull/10476#discussion_r651318115 ## File path: cpp/src/arrow/compute/kernels/aggregate_basic.cc ## @@ -166,32 +168,48 @@ struct BooleanAnyImpl : public ScalarAggregator { Status

[GitHub] [arrow] ianmcook edited a comment on pull request #10520: ARROW-12709: [C++] Add binary_join_element_wise

2021-06-14 Thread GitBox
ianmcook edited a comment on pull request #10520: URL: https://github.com/apache/arrow/pull/10520#issuecomment-861034008 I'm unclear on what the intended behavior is when you choose the `SKIP` null handling behavior and the separator is an array. Could you describe that please? Thanks

[GitHub] [arrow] ianmcook commented on pull request #10520: ARROW-12709: [C++] Add binary_join_element_wise

2021-06-14 Thread GitBox
ianmcook commented on pull request #10520: URL: https://github.com/apache/arrow/pull/10520#issuecomment-861034008 I'm unclear on what the intended behavior is when you choose the {{SKIP}} null handling behavior and the separator is an array. Could you describe that please? Thanks --
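The reply in this thread states that the null handling behavior never affects the separator and only describes how to handle the other values. A pure-Python model of that rule is sketched below. It does not call the Arrow kernel; the function names, the option strings (`emit_null`, `skip`, `replace`), and the assumption that a null separator always yields a null result are illustrative choices, not the library's API.

```python
from typing import List, Optional, Sequence


def join_row(values: Sequence[Optional[str]], separator: Optional[str],
             null_handling: str = "emit_null",
             null_replacement: str = "") -> Optional[str]:
    """Illustrative model of joining one row of strings (hypothetical helper)."""
    # Assumption: a null separator yields null regardless of null_handling.
    if separator is None:
        return None
    if null_handling == "emit_null":
        # Any null value makes the whole result null.
        if any(v is None for v in values):
            return None
        kept = list(values)
    elif null_handling == "skip":
        # Null values are dropped before joining.
        kept = [v for v in values if v is not None]
    elif null_handling == "replace":
        # Null values are swapped for the replacement string.
        kept = [null_replacement if v is None else v for v in values]
    else:
        raise ValueError(f"unknown null_handling: {null_handling!r}")
    return separator.join(kept)


def join_columns(columns: Sequence[Sequence[Optional[str]]],
                 separators: Sequence[Optional[str]],
                 **options) -> List[Optional[str]]:
    """Apply join_row element-wise, with one separator per row."""
    return [join_row(row, sep, **options)
            for row, sep in zip(zip(*columns), separators)]


columns = [["a", "b", None], ["x", None, "z"]]
separators = ["-", "-", None]  # the separator is itself an array here

skipped = join_columns(columns, separators, null_handling="skip")
emitted = join_columns(columns, separators, null_handling="emit_null")
replaced = join_columns(columns, separators, null_handling="replace",
                        null_replacement="?")
```

In this model the third row is null under every option because its separator is null, which is the behavior the question about `SKIP` with an array separator was asking about.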

[GitHub] [arrow] rok commented on a change in pull request #10457: ARROW-12980: [C++] Kernels to extract datetime components should be timezone aware

2021-06-14 Thread GitBox
rok commented on a change in pull request #10457: URL: https://github.com/apache/arrow/pull/10457#discussion_r651290261 ## File path: cpp/src/arrow/compute/kernels/scalar_temporal_test.cc ## @@ -143,39 +142,202 @@ TEST(ScalarTemporalTest,

[GitHub] [arrow] edponce commented on issue #10502: AttributeError: module 'pyarrow.lib' has no attribute '_Weakrefable'

2021-06-14 Thread GitBox
edponce commented on issue #10502: URL: https://github.com/apache/arrow/issues/10502#issuecomment-861009813 @bhargav-inthezone Kaggle notebook [installs latest pyarrow by default](https://github.com/Kaggle/docker-python/blob/main/Dockerfile#L347) but it seems the Docker image was created

[GitHub] [arrow] rok commented on pull request #10457: ARROW-12980: [C++] Kernels to extract datetime components should be timezone aware

2021-06-14 Thread GitBox
rok commented on pull request #10457: URL: https://github.com/apache/arrow/pull/10457#issuecomment-861007346 > I think there was no opposition for this one (i.e. that the field extraction should yield local hour/minute/etc), so I don't think there is a need to close the PR. Huh, I

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #561: Fix pruning on not equal predicate

2021-06-14 Thread GitBox
alamb commented on a change in pull request #561: URL: https://github.com/apache/arrow-datafusion/pull/561#discussion_r651282910 ## File path: datafusion/src/physical_optimizer/pruning.rs ## @@ -553,12 +553,14 @@ fn build_predicate_expression( let corrected_op =

[GitHub] [arrow-datafusion] adsharma edited a comment on issue #533: Add extension plugin to parse SQL into logical plan

2021-06-14 Thread GitBox
adsharma edited a comment on issue #533: URL: https://github.com/apache/arrow-datafusion/issues/533#issuecomment-860999601 I don't have much context about the proposal. Trying to understand things better. Please bear with me. The reason why SQL doesn't have `select * from t version

[GitHub] [arrow-datafusion] adsharma commented on issue #533: Add extension plugin to parse SQL into logical plan

2021-06-14 Thread GitBox
adsharma commented on issue #533: URL: https://github.com/apache/arrow-datafusion/issues/533#issuecomment-860999601 I don't have much context about the proposal. Trying to understand things better. Please bear with me. The reason why SQL doesn't have `select * from t version as of

[GitHub] [arrow-rs] sunchao commented on a change in pull request #443: parquet: improve BOOLEAN writing logic and report error on encoding fail

2021-06-14 Thread GitBox
sunchao commented on a change in pull request #443: URL: https://github.com/apache/arrow-rs/pull/443#discussion_r651247944 ## File path: parquet/src/data_type.rs ## @@ -661,8 +661,15 @@ pub(crate) mod private { _: W, bit_writer: BitWriter,

[GitHub] [arrow] lidavidm commented on pull request #10412: ARROW-9430: [C++] Implement replace_with_mask kernel

2021-06-14 Thread GitBox
lidavidm commented on pull request #10412: URL: https://github.com/apache/arrow/pull/10412#issuecomment-860961527 @bkietz @jorisvandenbossche I know y'all are busy, but any other comments? Once this is in, @nirandaperera can get started on ARROW-9431 on top of this -- This is an

[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #561: Fix pruning on not equal predicate

2021-06-14 Thread GitBox
Dandandan commented on a change in pull request #561: URL: https://github.com/apache/arrow-datafusion/pull/561#discussion_r651243733 ## File path: datafusion/src/physical_optimizer/pruning.rs ## @@ -553,12 +553,14 @@ fn build_predicate_expression( let corrected_op =

[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #561: Fix pruning on not equal predicate

2021-06-14 Thread GitBox
Dandandan commented on a change in pull request #561: URL: https://github.com/apache/arrow-datafusion/pull/561#discussion_r651243037 ## File path: datafusion/src/physical_optimizer/pruning.rs ## @@ -553,12 +553,14 @@ fn build_predicate_expression( let corrected_op =

[GitHub] [arrow] rok commented on a change in pull request #10457: ARROW-12980: [C++] Kernels to extract datetime components should be timezone aware

2021-06-14 Thread GitBox
rok commented on a change in pull request #10457: URL: https://github.com/apache/arrow/pull/10457#discussion_r651241888 ## File path: python/pyarrow/tests/test_compute.py ## @@ -1212,6 +1212,80 @@ def test_strptime(): assert got == expected +def

[GitHub] [arrow] github-actions[bot] commented on pull request #10530: ARROW-13072: [C++] Add bit-wise arithmetic kernels

2021-06-14 Thread GitBox
github-actions[bot] commented on pull request #10530: URL: https://github.com/apache/arrow/pull/10530#issuecomment-860958622 https://issues.apache.org/jira/browse/ARROW-13072 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [arrow] lidavidm opened a new pull request #10530: ARROW-13072: [C++] Add bit-wise arithmetic kernels

2021-06-14 Thread GitBox
lidavidm opened a new pull request #10530: URL: https://github.com/apache/arrow/pull/10530 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this

[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #561: Fix pruning on not equal predicate

2021-06-14 Thread GitBox
Dandandan commented on a change in pull request #561: URL: https://github.com/apache/arrow-datafusion/pull/561#discussion_r651241366 ## File path: datafusion/src/physical_optimizer/pruning.rs ## @@ -1190,6 +1192,34 @@ mod tests { assert_eq!(result, expected); }

[GitHub] [arrow-datafusion] alamb merged pull request #561: Fix pruning on not equal predicate

2021-06-14 Thread GitBox
alamb merged pull request #561: URL: https://github.com/apache/arrow-datafusion/pull/561 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [arrow-datafusion] alamb closed issue #560: Pruning on `!=` predicate results in incorrect results

2021-06-14 Thread GitBox
alamb closed issue #560: URL: https://github.com/apache/arrow-datafusion/issues/560 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [arrow-datafusion] alamb commented on pull request #561: Fix pruning on not equal predicate

2021-06-14 Thread GitBox
alamb commented on pull request #561: URL: https://github.com/apache/arrow-datafusion/pull/561#issuecomment-860947331 > @alamb, Sorry for introducing the bug, prune_not_eq_data test is much clearer to me now. Thank you! No worries @jgoday -- both @Dandandan and I missed it on the

[GitHub] [arrow-datafusion] jgoday commented on pull request #561: Fix pruning on not equal predicate

2021-06-14 Thread GitBox
jgoday commented on pull request #561: URL: https://github.com/apache/arrow-datafusion/pull/561#issuecomment-860937695 > fyi @jgoday @alamb, Sorry for introducing the bug, prune_not_eq_data test is much clearer to me now. Thank you! -- This is an automated message from the Apache

[GitHub] [arrow-datafusion] alamb commented on pull request #561: Revert pruning on not equal predicate

2021-06-14 Thread GitBox
alamb commented on pull request #561: URL: https://github.com/apache/arrow-datafusion/pull/561#issuecomment-860896807 fyi @jgoday -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [arrow-datafusion] alamb removed a comment on pull request #561: Revert pruning on not equal predicate

2021-06-14 Thread GitBox
alamb removed a comment on pull request #561: URL: https://github.com/apache/arrow-datafusion/pull/561#issuecomment-860886806 I actually think I can still support not equals, I just need to make it a bit more restricted -- This is an automated message from the Apache Git Service. To

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #561: Revert pruning on not equal predicate

2021-06-14 Thread GitBox
alamb commented on a change in pull request #561: URL: https://github.com/apache/arrow-datafusion/pull/561#discussion_r651166362 ## File path: datafusion/src/physical_optimizer/pruning.rs ## @@ -1190,6 +1162,34 @@ mod tests { assert_eq!(result, expected); } +

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #561: Revert pruning on not equal predicate

2021-06-14 Thread GitBox
alamb commented on a change in pull request #561: URL: https://github.com/apache/arrow-datafusion/pull/561#discussion_r651166601 ## File path: datafusion/src/physical_optimizer/pruning.rs ## @@ -1190,6 +1162,34 @@ mod tests { assert_eq!(result, expected); } +

[GitHub] [arrow-datafusion] alamb edited a comment on pull request #544: Not equal predicate in physical_planning pruning

2021-06-14 Thread GitBox
alamb edited a comment on pull request #544: URL: https://github.com/apache/arrow-datafusion/pull/544#issuecomment-860882163 I think we got the logic a bit too aggressive see -- see https://github.com/apache/arrow-datafusion/issues/560. FYI @jgoday -- This is an automated message from

[GitHub] [arrow] pitrou closed pull request #10529: ARROW-13075: [Python] Expose C data interface API for pyarrow.Field

2021-06-14 Thread GitBox
pitrou closed pull request #10529: URL: https://github.com/apache/arrow/pull/10529 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please

[GitHub] [arrow-datafusion] alamb commented on pull request #561: Revert pruning on not equal predicate

2021-06-14 Thread GitBox
alamb commented on pull request #561: URL: https://github.com/apache/arrow-datafusion/pull/561#issuecomment-860886806 I actually think I can still support not equals, I just need to make it a bit more restricted -- This is an automated message from the Apache Git Service. To respond

[GitHub] [arrow] BryanCutler edited a comment on pull request #10513: ARROW-13044: [Java] Change UnionVector and DenseUnionVector to extend AbstractContainerVector

2021-06-14 Thread GitBox
BryanCutler edited a comment on pull request #10513: URL: https://github.com/apache/arrow/pull/10513#issuecomment-860884676 Thanks @lidavidm @liyafan82 , I made https://issues.apache.org/jira/browse/ARROW-13076 for ExtensionTypeVector to use ValueVector. If this PR looks ok for

[GitHub] [arrow] BryanCutler commented on pull request #10513: ARROW-13044: [Java] Change UnionVector and DenseUnionVector to extend AbstractContainerVector

2021-06-14 Thread GitBox
BryanCutler commented on pull request #10513: URL: https://github.com/apache/arrow/pull/10513#issuecomment-860884676 Thanks @lidavidm , I made https://issues.apache.org/jira/browse/ARROW-13076 for ExtensionTypeVector to use ValueVector. If this PR looks ok for union vectors, I'll

[GitHub] [arrow-datafusion] alamb opened a new pull request #561: Revert pruning on not equal predicate

2021-06-14 Thread GitBox
alamb opened a new pull request #561: URL: https://github.com/apache/arrow-datafusion/pull/561 Closes #560 # Rationale for this change Logic is incorrect # What changes are included in this PR? 1. Revert

[GitHub] [arrow-datafusion] alamb edited a comment on pull request #544: Not equal predicate in physical_planning pruning

2021-06-14 Thread GitBox
alamb edited a comment on pull request #544: URL: https://github.com/apache/arrow-datafusion/pull/544#issuecomment-860882163 I think we got the logic slightly backwards here -- see https://github.com/apache/arrow-datafusion/issues/560. FYI @jgoday -- This is an automated message from

[GitHub] [arrow-datafusion] alamb commented on pull request #544: Not equal predicate in physical_planning pruning

2021-06-14 Thread GitBox
alamb commented on pull request #544: URL: https://github.com/apache/arrow-datafusion/pull/544#issuecomment-860882163 I think we got the logic slightly backwards here -- see https://github.com/apache/arrow-datafusion/issues/560 -- This is an automated message from the Apache Git

[GitHub] [arrow-datafusion] alamb opened a new issue #560: Pruning on `!=` predicate results in incorrect results

2021-06-14 Thread GitBox
alamb opened a new issue #560: URL: https://github.com/apache/arrow-datafusion/issues/560 **Describe the bug** The logic for pruning on `!=` predicates introduced in https://github.com/apache/arrow-datafusion/pull/544 is incorrect **To Reproduce** Use the pruning logic with a
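To illustrate why pruning on `!=` is easy to get wrong: with only min/max statistics, a container can be skipped for `col != value` only when every row is known to equal `value`, that is, when both the minimum and the maximum equal it. The sketch below shows that conservative rule in Python. It is an assumption offered for illustration, not a description of the code in the pull requests, and the function name is hypothetical.

```python
def may_contain_not_equal(col_min, col_max, value) -> bool:
    """Return True if a container with these min/max statistics could hold
    at least one row where the column differs from `value`.

    Hypothetical helper: the container is prunable only when this is False.
    """
    # If either bound differs from value, some row differs from value.
    return col_min != value or col_max != value


# All rows equal 1, so no row satisfies col != 1: safe to prune.
all_equal = may_contain_not_equal(1, 1, 1)

# value lies inside the range, but other values exist: must keep.
inside_range = may_contain_not_equal(1, 5, 3)

# value equals only the minimum: rows above it still match: must keep.
at_min_only = may_contain_not_equal(1, 5, 1)

# value lies outside the range entirely: every row matches: must keep.
outside_range = may_contain_not_equal(3, 3, 4)
```

A rule that prunes whenever `value` falls inside the `[min, max]` range would drop the second and third containers even though they hold matching rows, which is the kind of incorrect result the issue title describes.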

[GitHub] [arrow] github-actions[bot] commented on pull request #10529: ARROW-13075: [Python] Expose C data interface API for pyarrow.Field

2021-06-14 Thread GitBox
github-actions[bot] commented on pull request #10529: URL: https://github.com/apache/arrow/pull/10529#issuecomment-860868416 https://issues.apache.org/jira/browse/ARROW-13075 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [arrow] kszucs opened a new pull request #10529: ARROW-13075: [Python] Expose C data interface API for pyarrow.Field

2021-06-14 Thread GitBox
kszucs opened a new pull request #10529: URL: https://github.com/apache/arrow/pull/10529 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [arrow] AlenkaF commented on pull request #10519: ARROW-12867: [R] Bindings for abs()

2021-06-14 Thread GitBox
AlenkaF commented on pull request #10519: URL: https://github.com/apache/arrow/pull/10519#issuecomment-860853969 Sure, I am happy to do it. Thank you so much @thisisnic! Will let you know if I get stuck ;) -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [arrow] bkietz commented on pull request #10511: ARROW-13025: [C++][Python] Add FunctionOptions::Equals/ToString/Serialize

2021-06-14 Thread GitBox
bkietz commented on pull request #10511: URL: https://github.com/apache/arrow/pull/10511#issuecomment-860852122 @lidavidm thanks for working on this! >I don't like adding protected methods to a struct, and it's inconsistent with how equality is implemented for other structs (via

[GitHub] [arrow-datafusion] alamb commented on pull request #55: Support qualified columns in queries

2021-06-14 Thread GitBox
alamb commented on pull request #55: URL: https://github.com/apache/arrow-datafusion/pull/55#issuecomment-860846672 I suggest we get this PR rebased and merged asap to minimize conflicts -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #55: Support qualified columns in queries

2021-06-14 Thread GitBox
alamb commented on a change in pull request #55: URL: https://github.com/apache/arrow-datafusion/pull/55#discussion_r651125970 ## File path: datafusion/src/physical_plan/planner.rs ## @@ -56,6 +56,121 @@ use expressions::col; use log::debug; use std::sync::Arc; +fn

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #55: Support qualified columns in queries

2021-06-14 Thread GitBox
alamb commented on a change in pull request #55: URL: https://github.com/apache/arrow-datafusion/pull/55#discussion_r651124509 ## File path: datafusion/src/physical_plan/expressions/nth_value.rs ## @@ -195,7 +195,7 @@ mod tests { fn first_value() -> Result<()> {

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #55: Support qualified columns in queries

2021-06-14 Thread GitBox
alamb commented on a change in pull request #55: URL: https://github.com/apache/arrow-datafusion/pull/55#discussion_r651112853 ## File path: datafusion/src/logical_plan/dfschema.rs ## @@ -208,6 +297,28 @@ impl Into for DFSchema { } } +impl Into for { +/// Convert

[GitHub] [arrow] ianmcook commented on pull request #10520: ARROW-12709: [C++] Add var_args_join

2021-06-14 Thread GitBox
ianmcook commented on pull request #10520: URL: https://github.com/apache/arrow/pull/10520#issuecomment-860817694 The use of "binary" in the names of these string join kernels is unfortunate; it's not clear at first glance whether "binary" is a reference to the arity or to the input type.

[GitHub] [arrow] github-actions[bot] commented on pull request #10528: ARROW-13073: [Developer] archery benchmark list: unexpected keyword 'benchmark_filter'

2021-06-14 Thread GitBox
github-actions[bot] commented on pull request #10528: URL: https://github.com/apache/arrow/pull/10528#issuecomment-860812473 https://issues.apache.org/jira/browse/ARROW-13073 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [arrow] dianaclarke opened a new pull request #10528: ARROW-13073: [Developer] archery benchmark list: unexpected keyword 'benchmark_filter'

2021-06-14 Thread GitBox
dianaclarke opened a new pull request #10528: URL: https://github.com/apache/arrow/pull/10528 ``` $ archery benchmark list Traceback (most recent call last): File "/Users/diana/envs/arrow/bin/archery", line 33, in <module> sys.exit(load_entry_point('archery', 'console_scripts',

[GitHub] [arrow-datafusion] alamb commented on pull request #546: parallelize window function evaluations

2021-06-14 Thread GitBox
alamb commented on pull request #546: URL: https://github.com/apache/arrow-datafusion/pull/546#issuecomment-860810784 Just to be super clear, I am not suggesting we add a task scheduler as part of adding window functions -- I was trying to say that I felt following the existing pattern of

[GitHub] [arrow] kiszk closed pull request #10525: ARROW-13026: [CI] Use LLVM 10 for s390x

2021-06-14 Thread GitBox
kiszk closed pull request #10525: URL: https://github.com/apache/arrow/pull/10525 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #543: Ballista: Implement map-side shuffle

2021-06-14 Thread GitBox
alamb commented on a change in pull request #543: URL: https://github.com/apache/arrow-datafusion/pull/543#discussion_r651085152 ## File path: ballista/rust/core/src/execution_plans/query_stage.rs ## @@ -150,32 +159,150 @@ impl ExecutionPlan for QueryStageExec {

[GitHub] [arrow] lidavidm commented on pull request #10520: ARROW-12709: [C++] Add var_args_join

2021-06-14 Thread GitBox
lidavidm commented on pull request #10520: URL: https://github.com/apache/arrow/pull/10520#issuecomment-860807360 Ah, maybe then we should rename `element_wise_min` to `min_element_wise`. -- This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #520: Implement window functions with `order_by` clause

2021-06-14 Thread GitBox
alamb commented on a change in pull request #520: URL: https://github.com/apache/arrow-datafusion/pull/520#discussion_r651068287 ## File path: datafusion/src/physical_plan/expressions/nth_value.rs ## @@ -113,54 +111,32 @@ impl BuiltInWindowFunctionExpr for NthValue {

[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #543: Ballista: Implement map-side shuffle

2021-06-14 Thread GitBox
Dandandan commented on a change in pull request #543: URL: https://github.com/apache/arrow-datafusion/pull/543#discussion_r651076922 ## File path: ballista/rust/core/src/execution_plans/query_stage.rs ## @@ -150,32 +159,150 @@ impl ExecutionPlan for QueryStageExec {

[GitHub] [arrow-datafusion] Dandandan commented on pull request #556: hash float arrays using primitive usigned integer type

2021-06-14 Thread GitBox
Dandandan commented on pull request #556: URL: https://github.com/apache/arrow-datafusion/pull/556#issuecomment-860795232 Thanks @houqp -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [arrow-datafusion] Dandandan closed issue #512: hash_join.rs's create_hashes function panics with float columns with nightly rustc

2021-06-14 Thread GitBox
Dandandan closed issue #512: URL: https://github.com/apache/arrow-datafusion/issues/512 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [arrow-datafusion] Dandandan merged pull request #556: hash float arrays using primitive unsigned integer type

2021-06-14 Thread GitBox
Dandandan merged pull request #556: URL: https://github.com/apache/arrow-datafusion/pull/556 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this

[GitHub] [arrow-datafusion] Dandandan merged pull request #559: Filter push down for Union

2021-06-14 Thread GitBox
Dandandan merged pull request #559: URL: https://github.com/apache/arrow-datafusion/pull/559 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this

[GitHub] [arrow-datafusion] Dandandan closed issue #557: Filters aren't passed down to table scans in a union

2021-06-14 Thread GitBox
Dandandan closed issue #557: URL: https://github.com/apache/arrow-datafusion/issues/557 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [arrow] jorisvandenbossche commented on pull request #10520: ARROW-12709: [C++] Add var_args_join

2021-06-14 Thread GitBox
jorisvandenbossche commented on pull request #10520: URL: https://github.com/apache/arrow/pull/10520#issuecomment-860792611 That would indeed be more consistent. Personally, searching for the function / "tab-completion" in mind, I think having the name start with "binary_join" or

[GitHub] [arrow] thisisnic commented on pull request #10519: ARROW-12867: [R] Bindings for abs()

2021-06-14 Thread GitBox
thisisnic commented on pull request #10519: URL: https://github.com/apache/arrow/pull/10519#issuecomment-860792424 Awesome, thanks! Another really minor change to suggest: the approach to your unit tests is great; however, there's a helper function in the Arrow package called

[GitHub] [arrow] jorisvandenbossche edited a comment on pull request #10520: ARROW-12709: [C++] Add var_args_join

2021-06-14 Thread GitBox
jorisvandenbossche edited a comment on pull request #10520: URL: https://github.com/apache/arrow/pull/10520#issuecomment-860784051 Some naming nitpicks ;) I think "var_args_join" is not super clear. Having a notion about it being for string data would be good, and the scalar list of

[GitHub] [arrow-datafusion] alamb commented on pull request #342: Left join could use bitmap for left join instead of Vec

2021-06-14 Thread GitBox
alamb commented on pull request #342: URL: https://github.com/apache/arrow-datafusion/pull/342#issuecomment-860786614 FYI arrow 4.3.0 has been released with the code in BooleanBufferBuilder -- This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [arrow-datafusion] alamb commented on pull request #510: Support modulus op

2021-06-14 Thread GitBox
alamb commented on pull request #510: URL: https://github.com/apache/arrow-datafusion/pull/510#issuecomment-860786176 Arrow 4.3.0 has been released so if you rebase this PR it will likely be ready to go -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [arrow] lidavidm commented on pull request #10520: ARROW-12709: [C++] Add var_args_join

2021-06-14 Thread GitBox
lidavidm commented on pull request #10520: URL: https://github.com/apache/arrow/pull/10520#issuecomment-860785077 Or perhaps `element_wise_binary_join` since it's also `element_wise_min`? -- This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [arrow] jorisvandenbossche commented on pull request #10520: ARROW-12709: [C++] Add var_args_join

2021-06-14 Thread GitBox
jorisvandenbossche commented on pull request #10520: URL: https://github.com/apache/arrow/pull/10520#issuecomment-860784051 Some naming nitpicks ;) I think "var_args_join" is not super clear. Having a notion about it being for string data would be good, and the scalar list of string

[GitHub] [arrow-rs] alamb commented on pull request #443: parquet: improve BOOLEAN writing logic and report error on encoding fail

2021-06-14 Thread GitBox
alamb commented on pull request #443: URL: https://github.com/apache/arrow-rs/pull/443#issuecomment-860783118 > I'd already written the test, just been in meetings. If we'd rather rely on the test framework to terminate hanging tests, just remove the thread/mpsc/channel stuff and do a

[GitHub] [arrow-rs] alamb commented on a change in pull request #443: parquet: improve BOOLEAN writing logic and report error on encoding fail

2021-06-14 Thread GitBox
alamb commented on a change in pull request #443: URL: https://github.com/apache/arrow-rs/pull/443#discussion_r651058305 ## File path: parquet/tests/boolean_writer.rs ## @@ -0,0 +1,100 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

[GitHub] [arrow] ianmcook commented on pull request #10520: ARROW-12709: [C++] Add var_args_join

2021-06-14 Thread GitBox
ianmcook commented on pull request #10520: URL: https://github.com/apache/arrow/pull/10520#issuecomment-860779304 Thanks for working on this @lidavidm! I will add the relevant functions to the R bindings after this is merged (ARROW-11514). -- This is an automated message from the

[GitHub] [arrow-rs] codecov-commenter commented on pull request #457: Add sort boolean benchmark

2021-06-14 Thread GitBox
codecov-commenter commented on pull request #457: URL: https://github.com/apache/arrow-rs/pull/457#issuecomment-860769544 #

[GitHub] [arrow-rs] garyanaplan commented on pull request #443: parquet: improve BOOLEAN writing logic and report error on encoding fail

2021-06-14 Thread GitBox
garyanaplan commented on pull request #443: URL: https://github.com/apache/arrow-rs/pull/443#issuecomment-860768278 I'd already written the test, just been in meetings. If we'd rather rely on the test framework to terminate hanging tests, just remove the thread/mpsc/channel stuff and do a

[GitHub] [arrow] jorisvandenbossche commented on pull request #10457: ARROW-12980: [C++] Kernels to extract datetime components should be timezone aware

2021-06-14 Thread GitBox
jorisvandenbossche commented on pull request #10457: URL: https://github.com/apache/arrow/pull/10457#issuecomment-860765541 > I'll wait for the consensus to build on the timezone handling discussions before closing the PR and moving the python tests to a new PR. I think there was no

[GitHub] [arrow-rs] ritchie46 opened a new issue #458: Arrow 4.3.0 does not compile for feature gates `["simd", "avx512"], ["simd"]`

2021-06-14 Thread GitBox
ritchie46 opened a new issue #458: URL: https://github.com/apache/arrow-rs/issues/458 The released arrow version 4.3.0 does not compile with SIMD feature flags: ``` # compiles =4.2: features = ["simd", "avx512"] # does not compile =4.2: features = ["simd"] =4.3: features =

[GitHub] [arrow-rs] alamb opened a new pull request #457: Add sort boolean benchmark

2021-06-14 Thread GitBox
alamb opened a new pull request #457: URL: https://github.com/apache/arrow-rs/pull/457 # Which issue does this PR close? Re #447 # Rationale for this change While reviewing #448 I wanted to measure performance change # What changes are included in this PR?

[GitHub] [arrow] ggershinsky edited a comment on pull request #10450: ARROW-9947: [Python] High-level Python API for Parquet encryption of files.

2021-06-14 Thread GitBox
ggershinsky edited a comment on pull request #10450: URL: https://github.com/apache/arrow/pull/10450#issuecomment-860738074 @pitrou @wesm This is the core PR for bringing Parquet encryption to PyArrow and pandas. Due to possible threading differences between the two frameworks, this PR

[GitHub] [arrow-rs] alamb commented on pull request #443: parquet: improve BOOLEAN writing logic and report error on encoding fail

2021-06-14 Thread GitBox
alamb commented on pull request #443: URL: https://github.com/apache/arrow-rs/pull/443#issuecomment-860740823 > I can think of ways to do that with a timeout and assume that if the read doesn't finish within timeout, then it must have failed. @garyanaplan I don't think we need to

[GitHub] [arrow] ggershinsky edited a comment on pull request #10450: ARROW-9947: [Python] High-level Python API for Parquet encryption of files.

2021-06-14 Thread GitBox
ggershinsky edited a comment on pull request #10450: URL: https://github.com/apache/arrow/pull/10450#issuecomment-860738074 @pitrou @wesm This is the core PR for bringing Parquet encryption to PyArrow and pandas. Due to possible threading differences between the two frameworks, this PR

[GitHub] [arrow] HashidaTKS commented on pull request #10527: ARROW-6870: [C#] Add Support for Dictionary Arrays and Dictionary Encoding

2021-06-14 Thread GitBox
HashidaTKS commented on pull request #10527: URL: https://github.com/apache/arrow/pull/10527#issuecomment-860738406 cc: @eerhardt Would you please review this when you have time? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [arrow] ggershinsky commented on pull request #10450: ARROW-9947: [Python] High-level Python API for Parquet encryption of files.

2021-06-14 Thread GitBox
ggershinsky commented on pull request #10450: URL: https://github.com/apache/arrow/pull/10450#issuecomment-860738074 @pitrou @wesm This is the core PR for bringing Parquet encryption to PyArrow and pandas. Due to possible threading differences between the two frameworks, this PR might

[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #439: [WIP] FFI bridge for Schema, Field and DataType

2021-06-14 Thread GitBox
codecov-commenter edited a comment on pull request #439: URL: https://github.com/apache/arrow-rs/pull/439#issuecomment-857974778 #
