[GitHub] [arrow-rs] sunchao commented on a change in pull request #443: parquet: improve BOOLEAN writing logic and report error on encoding fail

2021-06-14 Thread GitBox
sunchao commented on a change in pull request #443: URL: https://github.com/apache/arrow-rs/pull/443#discussion_r651247944 ## File path: parquet/src/data_type.rs ## @@ -661,8 +661,15 @@ pub(crate) mod private { _: W, bit_writer: BitWriter,

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #561: Fix pruning on not equal predicate

2021-06-14 Thread GitBox
alamb commented on a change in pull request #561: URL: https://github.com/apache/arrow-datafusion/pull/561#discussion_r651282910 ## File path: datafusion/src/physical_optimizer/pruning.rs ## @@ -553,12 +553,14 @@ fn build_predicate_expression( let corrected_op =

[GitHub] [arrow] lidavidm commented on pull request #10520: ARROW-12709: [C++] Add binary_join_element_wise

2021-06-14 Thread GitBox
lidavidm commented on pull request #10520: URL: https://github.com/apache/arrow/pull/10520#issuecomment-861035926 The null handling behavior never affects the separator: it only describes how to handle the other values. It's intended to let you mimic libcudf. ```

[GitHub] [arrow] ianmcook commented on pull request #10520: ARROW-12709: [C++] Add binary_join_element_wise

2021-06-14 Thread GitBox
ianmcook commented on pull request #10520: URL: https://github.com/apache/arrow/pull/10520#issuecomment-861046053 Oh, I see; I was confused about what happens when the separator is an array; I see now. Thank you! -- This is an automated message from the Apache Git Service. To respond to

[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #520: Implement window functions with `order_by` clause

2021-06-14 Thread GitBox
Jimexist commented on a change in pull request #520: URL: https://github.com/apache/arrow-datafusion/pull/520#discussion_r651361200 ## File path: datafusion/src/physical_plan/windows.rs ## @@ -156,31 +162,72 @@ impl WindowExpr for BuiltInWindowExpr {

[GitHub] [arrow-datafusion] codecov-commenter edited a comment on pull request #558: Implement window functions with `partition_by` clause

2021-06-14 Thread GitBox
codecov-commenter edited a comment on pull request #558: URL: https://github.com/apache/arrow-datafusion/pull/558#issuecomment-860569767 #

[GitHub] [arrow-datafusion] Jimexist commented on pull request #520: Implement window functions with `order_by` clause

2021-06-14 Thread GitBox
Jimexist commented on pull request #520: URL: https://github.com/apache/arrow-datafusion/pull/520#issuecomment-861088442 > This looks very nice @Jimexist -- I went over the code and saw only goodness :) > > All that this PR needs to be mergeable in my opinion is to reset the Cargo

[GitHub] [arrow-datafusion] Jimexist edited a comment on pull request #520: Implement window functions with `order_by` clause

2021-06-14 Thread GitBox
Jimexist edited a comment on pull request #520: URL: https://github.com/apache/arrow-datafusion/pull/520#issuecomment-861088442 > This looks very nice @Jimexist -- I went over the code and saw only goodness :) > > All that this PR needs to be mergeable in my opinion is to reset the

[GitHub] [arrow] rok commented on a change in pull request #10457: ARROW-12980: [C++] Kernels to extract datetime components should be timezone aware

2021-06-14 Thread GitBox
rok commented on a change in pull request #10457: URL: https://github.com/apache/arrow/pull/10457#discussion_r651290261 ## File path: cpp/src/arrow/compute/kernels/scalar_temporal_test.cc ## @@ -143,39 +142,202 @@ TEST(ScalarTemporalTest,

[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #520: Implement window functions with `order_by` clause

2021-06-14 Thread GitBox
Jimexist commented on a change in pull request #520: URL: https://github.com/apache/arrow-datafusion/pull/520#discussion_r651360294 ## File path: datafusion/src/physical_plan/expressions/nth_value.rs ## @@ -113,54 +111,32 @@ impl BuiltInWindowFunctionExpr for NthValue {

[GitHub] [arrow] rok commented on a change in pull request #10457: ARROW-12980: [C++] Kernels to extract datetime components should be timezone aware

2021-06-14 Thread GitBox
rok commented on a change in pull request #10457: URL: https://github.com/apache/arrow/pull/10457#discussion_r651241888 ## File path: python/pyarrow/tests/test_compute.py ## @@ -1212,6 +1212,80 @@ def test_strptime(): assert got == expected +def

[GitHub] [arrow] github-actions[bot] commented on pull request #10530: ARROW-13072: [C++] Add bit-wise arithmetic kernels

2021-06-14 Thread GitBox
github-actions[bot] commented on pull request #10530: URL: https://github.com/apache/arrow/pull/10530#issuecomment-860958622 https://issues.apache.org/jira/browse/ARROW-13072

[GitHub] [arrow] rok commented on pull request #10457: ARROW-12980: [C++] Kernels to extract datetime components should be timezone aware

2021-06-14 Thread GitBox
rok commented on pull request #10457: URL: https://github.com/apache/arrow/pull/10457#issuecomment-861007346 > I think there was no opposition for this one (i.e. that the field extraction should yield local hour/minute/etc), so I don't think there is a need to close the PR. Huh, I

[GitHub] [arrow] rok commented on a change in pull request #10476: ARROW-12499: [C++][Compute] Add ScalarAggregateOptions to Any and All kernels

2021-06-14 Thread GitBox
rok commented on a change in pull request #10476: URL: https://github.com/apache/arrow/pull/10476#discussion_r651318115 ## File path: cpp/src/arrow/compute/kernels/aggregate_basic.cc ## @@ -166,32 +168,48 @@ struct BooleanAnyImpl : public ScalarAggregator { Status

[GitHub] [arrow-rs] Jimexist commented on pull request #448: Use partition for bool sort

2021-06-14 Thread GitBox
Jimexist commented on pull request #448: URL: https://github.com/apache/arrow-rs/pull/448#issuecomment-861078094 I guess 1024 is too short an array, so it's dominated by memory allocation rather than sorting
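
For context on the PR title ("Use partition for bool sort"): because a boolean column has only two distinct values, it can be sorted by splitting it into groups rather than comparing elements. The following is a minimal Python sketch of that idea, not the arrow-rs implementation; the function name `sort_bools_by_partition` is hypothetical, and placing nulls last is an assumption made for illustration.

```python
from typing import List, Optional


def sort_bools_by_partition(values: List[Optional[bool]]) -> List[Optional[bool]]:
    """Sort booleans ascending (False < True) in a single pass.

    Hypothetical helper for illustration only. Nulls (None) are placed
    last, which is an assumption rather than documented arrow-rs behaviour.
    """
    n_false = n_true = n_null = 0
    for v in values:
        if v is None:
            n_null += 1
        elif v:
            n_true += 1
        else:
            n_false += 1
    # With only two distinct non-null values, the sorted output is fully
    # determined by the group sizes, so no element comparisons are needed.
    return [False] * n_false + [True] * n_true + [None] * n_null
```

On a very short input, the cost of building the output list can outweigh the grouping pass itself, which is consistent with the observation in the comment above about a 1024-element array.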

[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #561: Fix pruning on not equal predicate

2021-06-14 Thread GitBox
Dandandan commented on a change in pull request #561: URL: https://github.com/apache/arrow-datafusion/pull/561#discussion_r651243037 ## File path: datafusion/src/physical_optimizer/pruning.rs ## @@ -553,12 +553,14 @@ fn build_predicate_expression( let corrected_op =

[GitHub] [arrow-datafusion] adsharma edited a comment on issue #533: Add extension plugin to parse SQL into logical plan

2021-06-14 Thread GitBox
adsharma edited a comment on issue #533: URL: https://github.com/apache/arrow-datafusion/issues/533#issuecomment-860999601 I don't have much context about the proposal. Trying to understand things better. Please bear with me. The reason why SQL doesn't have `select * from t version

[GitHub] [arrow-datafusion] adsharma commented on issue #533: Add extension plugin to parse SQL into logical plan

2021-06-14 Thread GitBox
adsharma commented on issue #533: URL: https://github.com/apache/arrow-datafusion/issues/533#issuecomment-860999601 I don't have much context about the proposal. Trying to understand things better. Please bear with me. The reason why SQL doesn't have `select * from t version as of

[GitHub] [arrow] edponce commented on issue #10502: AttributeError: module 'pyarrow.lib' has no attribute '_Weakrefable'

2021-06-14 Thread GitBox
edponce commented on issue #10502: URL: https://github.com/apache/arrow/issues/10502#issuecomment-861009813 @bhargav-inthezone Kaggle notebook [installs latest pyarrow by default](https://github.com/Kaggle/docker-python/blob/main/Dockerfile#L347) but it seems the Docker image was created

[GitHub] [arrow] bhargav-inthezone commented on issue #10502: AttributeError: module 'pyarrow.lib' has no attribute '_Weakrefable'

2021-06-14 Thread GitBox
bhargav-inthezone commented on issue #10502: URL: https://github.com/apache/arrow/issues/10502#issuecomment-861147363 Previously I had to uninstall pyarrow before installing Vaex, but I think Kaggle fixed the problem. Now it's working with just Vaex installed in the notebook

[GitHub] [arrow] p5a0u9l opened a new issue #10531: How to resolve `pyarrow.deserialize` FutureWarning

2021-06-14 Thread GitBox
p5a0u9l opened a new issue #10531: URL: https://github.com/apache/arrow/issues/10531 I'd like to resolve this warning ``` /usr/lib/python3.7/asyncio/events.py:88: FutureWarning: 'pyarrow.deserialize' is deprecated as of 2.0.0 and will be removed in a future version. Use pickle or
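
The quoted warning names pickle as one replacement for `pyarrow.deserialize`. The following is a minimal sketch of a pickle round trip using only the Python standard library; the helper names `serialize` and `deserialize` and the sample payload are hypothetical, and whether pickle suits the reporter's data is not something the issue text confirms.

```python
import pickle
from typing import Any


def serialize(obj: Any) -> bytes:
    """Hypothetical stand-in for a serialize call, backed by stdlib pickle."""
    # HIGHEST_PROTOCOL selects the newest protocol the running interpreter supports.
    return pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)


def deserialize(blob: bytes) -> Any:
    """Hypothetical stand-in for a deserialize call, backed by stdlib pickle."""
    # Only unpickle data from a trusted source: loading can execute arbitrary code.
    return pickle.loads(blob)


payload = {"ids": [1, 2, 3], "label": "example"}  # made-up sample data
blob = serialize(payload)
restored = deserialize(blob)
```

Because neither helper touches `pyarrow.deserialize`, code that goes through them does not trigger that FutureWarning.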

[GitHub] [arrow] BryanCutler commented on pull request #10513: ARROW-13044: [Java] Change UnionVector and DenseUnionVector to extend AbstractContainerVector

2021-06-14 Thread GitBox
BryanCutler commented on pull request #10513: URL: https://github.com/apache/arrow/pull/10513#issuecomment-861157094 merged to master

[GitHub] [arrow] BryanCutler closed pull request #10513: ARROW-13044: [Java] Change UnionVector and DenseUnionVector to extend AbstractContainerVector

2021-06-14 Thread GitBox
BryanCutler closed pull request #10513: URL: https://github.com/apache/arrow/pull/10513

[GitHub] [arrow-rs] jorgecarleitao commented on issue #455: Not able to read nano-second timestamp columns in 1.0 parquet files written by pyarrow

2021-06-14 Thread GitBox
jorgecarleitao commented on issue #455: URL: https://github.com/apache/arrow-rs/issues/455#issuecomment-861181763 This may be related to which priority we give when converting, the parquet schema or the arrow schema. I would expect pyarrow to write them in a consistent manner though, so,

[GitHub] [arrow-datafusion] Jimexist opened a new issue #562: publish datafusion-cli to brew repo so that macOS users can install easily

2021-06-14 Thread GitBox
Jimexist opened a new issue #562: URL: https://github.com/apache/arrow-datafusion/issues/562 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** A clear and concise description of what the problem is. Ex. I'm always frustrated when

[GitHub] [arrow] cyb70289 commented on a change in pull request #10530: ARROW-13072: [C++] Add bit-wise arithmetic kernels

2021-06-14 Thread GitBox
cyb70289 commented on a change in pull request #10530: URL: https://github.com/apache/arrow/pull/10530#discussion_r651405645 ## File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc ## @@ -397,6 +397,130 @@ struct PowerChecked { } }; +// Bitwise operations +

[GitHub] [arrow-rs] emkornfield commented on issue #455: Not able to read nano-second timestamp columns in 1.0 parquet files written by pyarrow

2021-06-14 Thread GitBox
emkornfield commented on issue #455: URL: https://github.com/apache/arrow-rs/issues/455#issuecomment-861130175 > @emkornfield do you know what we should do on the Rust side to roundtrip the file correctly? Sorry, I would have to dig in the code to have a better understanding. In

[GitHub] [arrow] frmnboi commented on issue #10488: Passing back and forth from Python and C++ with Pyarrow C++ extension and pybind11.

2021-06-14 Thread GitBox
frmnboi commented on issue #10488: URL: https://github.com/apache/arrow/issues/10488#issuecomment-861180963 After running Python with debug symbols in GDB, here is the relevant part of the GDB backtrace with directory names redacted: ``` #0 __memmove_avx_unaligned_erms () at

[GitHub] [arrow-rs] alippai commented on pull request #453: Add C data interface for decimal128

2021-06-14 Thread GitBox
alippai commented on pull request #453: URL: https://github.com/apache/arrow-rs/pull/453#issuecomment-860251015

[GitHub] [arrow-rs] alippai opened a new issue #454: Add missing kernels for decimal128

2021-06-14 Thread GitBox
alippai opened a new issue #454: URL: https://github.com/apache/arrow-rs/issues/454 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Decimal128 type doesn't have the basic kernels like add, take, filter **Describe the

[GitHub] [arrow-datafusion] alamb merged pull request #539: Refactor hash aggregates's planner building code

2021-06-14 Thread GitBox
alamb merged pull request #539: URL: https://github.com/apache/arrow-datafusion/pull/539

[GitHub] [arrow-rs] codecov-commenter commented on pull request #443: parquet: improve BOOLEAN writing logic and report error on encoding fail

2021-06-14 Thread GitBox
codecov-commenter commented on pull request #443: URL: https://github.com/apache/arrow-rs/pull/443#issuecomment-860050867 #

[GitHub] [arrow-datafusion] Jimexist opened a new issue #551: support for automatic value promotion for aggregation functions

2021-06-14 Thread GitBox
Jimexist opened a new issue #551: URL: https://github.com/apache/arrow-datafusion/issues/551 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** A clear and concise description of what the problem is. Ex. I'm always frustrated when

[GitHub] [arrow-datafusion] Dandandan opened a new pull request #559: Push down filter through UNION

2021-06-14 Thread GitBox
Dandandan opened a new pull request #559: URL: https://github.com/apache/arrow-datafusion/pull/559 # Which issue does this PR close? Closes #557 # Rationale for this change # What changes are included in this PR? Filter is pushed down through union

[GitHub] [arrow-rs] nevi-me closed issue #193: JSON reader does not implement iterator

2021-06-14 Thread GitBox
nevi-me closed issue #193: URL: https://github.com/apache/arrow-rs/issues/193

[GitHub] [arrow-datafusion] codecov-commenter edited a comment on pull request #552: mv register_schema() to impl

2021-06-14 Thread GitBox
codecov-commenter edited a comment on pull request #552: URL: https://github.com/apache/arrow-datafusion/pull/552#issuecomment-860214243

[GitHub] [arrow-datafusion] Jimexist opened a new pull request #545: turn on clippy rule for needless borrow

2021-06-14 Thread GitBox
Jimexist opened a new pull request #545: URL: https://github.com/apache/arrow-datafusion/pull/545 # Which issue does this PR close? Closes #. # Rationale for this change # What changes are included in this PR? # Are there any user-facing changes?

[GitHub] [arrow-datafusion] codecov-commenter commented on pull request #548: reuse code for now function expr creation

2021-06-14 Thread GitBox
codecov-commenter commented on pull request #548: URL: https://github.com/apache/arrow-datafusion/pull/548#issuecomment-860131843 #

[GitHub] [arrow-datafusion] houqp edited a comment on pull request #55: Support qualified columns in queries

2021-06-14 Thread GitBox
houqp edited a comment on pull request #55: URL: https://github.com/apache/arrow-datafusion/pull/55#issuecomment-860150632 @andygrove @Dandandan @jorgecarleitao @alamb @Jimexist this PR is now ready for review. We are now able to pass both tpch-7 and tpch-8. I filed

[GitHub] [arrow-datafusion] jorgecarleitao commented on issue #533: Add extension plugin to parse SQL into logical plan

2021-06-14 Thread GitBox
jorgecarleitao commented on issue #533: URL: https://github.com/apache/arrow-datafusion/issues/533#issuecomment-860156321 I also do not see the issue with the example above, but I would say that, in general, custom SQL parsers effectively modify the SQL dialect that is being used and

[GitHub] [arrow-rs] jorgecarleitao commented on a change in pull request #439: [WIP] FFI bridge for Schema, Field and DataType

2021-06-14 Thread GitBox
jorgecarleitao commented on a change in pull request #439: URL: https://github.com/apache/arrow-rs/pull/439#discussion_r650633827 ## File path: arrow/src/ffi.rs ## @@ -208,15 +213,25 @@ impl FFI_ArrowSchema { pub fn name() -> { assert!(!self.name.is_null());

[GitHub] [arrow-datafusion] Jimexist commented on pull request #550: Upgrade arrow 430

2021-06-14 Thread GitBox
Jimexist commented on pull request #550: URL: https://github.com/apache/arrow-datafusion/pull/550#issuecomment-860197215 > I plan to release arrow 4.3 to crates.io tomorrow (process is underway -- see

[GitHub] [arrow-rs] nevi-me commented on issue #455: Not able to read nano-second timestamp columns in 1.0 parquet files written by pyarrow

2021-06-14 Thread GitBox
nevi-me commented on issue #455: URL: https://github.com/apache/arrow-rs/issues/455#issuecomment-860299977 I'd have expected the ns timestamp to be written to `int96` as that is the legacy nanosecond timestamp format. I'll also dig into this to see, maybe if the format is `1.0` then we

[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #520: Implement window functions with `order_by` clause

2021-06-14 Thread GitBox
Dandandan commented on a change in pull request #520: URL: https://github.com/apache/arrow-datafusion/pull/520#discussion_r650546802 ## File path: ballista/rust/core/Cargo.toml ## @@ -40,7 +40,7 @@ tokio = "1.0" tonic = "0.4" uuid = { version = "0.8", features = ["v4"] }

[GitHub] [arrow] projjal commented on pull request #10501: ARROW-13032: [Java] Update guava version

2021-06-14 Thread GitBox
projjal commented on pull request #10501: URL: https://github.com/apache/arrow/pull/10501#issuecomment-860475551 > Is there any reason not to use the latest version https://mvnrepository.com/artifact/com.google.guava/guava/30.1.1-jre ? Not really. I just updated to the patched

[GitHub] [arrow] kiszk commented on pull request #10525: ARROW-12709: [CI] Use LLVM 10 for s390x

2021-06-14 Thread GitBox
kiszk commented on pull request #10525: URL: https://github.com/apache/arrow/pull/10525#issuecomment-860477976 Yes, I will try to merge myself

[GitHub] [arrow-datafusion] Dandandan edited a comment on pull request #546: parallelize window function evaluations

2021-06-14 Thread GitBox
Dandandan edited a comment on pull request #546: URL: https://github.com/apache/arrow-datafusion/pull/546#issuecomment-860194116 > maybe a better way is to use rayon In my opinion we should try to stay away from Rayon and probably also should stay away from introducing parallelism

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #547: support table alias in join clause

2021-06-14 Thread GitBox
alamb commented on a change in pull request #547: URL: https://github.com/apache/arrow-datafusion/pull/547#discussion_r650511150 ## File path: datafusion/src/sql/planner.rs ## @@ -424,17 +424,24 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { ctes: HashMap, )

[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #414: Doctests for DecimalArray.

2021-06-14 Thread GitBox
codecov-commenter edited a comment on pull request #414: URL: https://github.com/apache/arrow-rs/pull/414#issuecomment-855393423 #

[GitHub] [arrow-datafusion] alamb commented on pull request #545: turn on clippy rule for needless borrow

2021-06-14 Thread GitBox
alamb commented on pull request #545: URL: https://github.com/apache/arrow-datafusion/pull/545#issuecomment-860193523 Thank you @Jimexist !

[GitHub] [arrow-datafusion] Jimexist commented on pull request #429: implement lead and lag built-in window function

2021-06-14 Thread GitBox
Jimexist commented on pull request #429: URL: https://github.com/apache/arrow-datafusion/pull/429#issuecomment-860225216 > Actually let's park this pull request for a while - I plan to implement sort and partition first and then window frame, after which the window shift approach might

[GitHub] [arrow] pitrou commented on issue #10488: Passing back and forth from Python and C++ with Pyarrow C++ extension and pybind11.

2021-06-14 Thread GitBox
pitrou commented on issue #10488: URL: https://github.com/apache/arrow/issues/10488#issuecomment-860451738 I would suggest debugging these crashes using gdb.

[GitHub] [arrow-rs] alamb merged pull request #438: use iterator for partition kernel instead of generating vec

2021-06-14 Thread GitBox
alamb merged pull request #438: URL: https://github.com/apache/arrow-rs/pull/438

[GitHub] [arrow-rs] yordan-pavlov commented on issue #200: Use iterators to increase performance of creating Arrow arrays

2021-06-14 Thread GitBox
yordan-pavlov commented on issue #200: URL: https://github.com/apache/arrow-rs/issues/200#issuecomment-860275967 I finally had some time to check how the new `ArrowArrayReader` affects TPC-H benchmark results - for queries which use string columns (queries 1 and 12), there is a

[GitHub] [arrow-rs] alamb merged pull request #419: Remove DictionaryArray::keys_array method

2021-06-14 Thread GitBox
alamb merged pull request #419: URL: https://github.com/apache/arrow-rs/pull/419

[GitHub] [arrow-datafusion] codecov-commenter commented on pull request #559: Filter push down for Union

2021-06-14 Thread GitBox
codecov-commenter commented on pull request #559: URL: https://github.com/apache/arrow-datafusion/pull/559#issuecomment-860438515 #

[GitHub] [arrow-datafusion] codecov-commenter edited a comment on pull request #548: reuse code for now function expr creation

2021-06-14 Thread GitBox
codecov-commenter edited a comment on pull request #548: URL: https://github.com/apache/arrow-datafusion/pull/548#issuecomment-860131843 #

[GitHub] [arrow-rs] novemberkilo commented on a change in pull request #414: Doctests for DecimalArray.

2021-06-14 Thread GitBox
novemberkilo commented on a change in pull request #414: URL: https://github.com/apache/arrow-rs/pull/414#discussion_r650718790 ## File path: arrow/src/array/array_binary.rs ## @@ -613,6 +613,32 @@ impl Array for FixedSizeBinaryArray { } /// A type of `DecimalArray` whose

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #10507: ARROW-13022: [R] bindings for lubridate's year, isoyear, quarter, month, day, wday, yday, isoweek, minute, and second

2021-06-14 Thread GitBox
jorisvandenbossche commented on a change in pull request #10507: URL: https://github.com/apache/arrow/pull/10507#discussion_r650738642 ## File path: r/R/expression.R ## @@ -28,8 +28,17 @@ # stringr spellings of those "str_length" = "utf8_length", "str_to_lower" =

[GitHub] [arrow-rs] garyanaplan commented on a change in pull request #443: parquet: improve BOOLEAN writing logic and report error on encoding fail

2021-06-14 Thread GitBox
garyanaplan commented on a change in pull request #443: URL: https://github.com/apache/arrow-rs/pull/443#discussion_r650746454 ## File path: parquet/src/data_type.rs ## @@ -661,8 +661,15 @@ pub(crate) mod private { _: W, bit_writer: BitWriter,

[GitHub] [arrow-rs] garyanaplan commented on a change in pull request #443: parquet: improve BOOLEAN writing logic and report error on encoding fail

2021-06-14 Thread GitBox
garyanaplan commented on a change in pull request #443: URL: https://github.com/apache/arrow-rs/pull/443#discussion_r650750468 ## File path: parquet/src/data_type.rs ## @@ -661,8 +661,15 @@ pub(crate) mod private { _: W, bit_writer: BitWriter,

[GitHub] [arrow] kou edited a comment on pull request #10525: ARROW-12709: [CI] Use LLVM 10 for s390x

2021-06-14 Thread GitBox
kou edited a comment on pull request #10525: URL: https://github.com/apache/arrow/pull/10525#issuecomment-860263925 @kiszk Do you want to merge this yourself?

[GitHub] [arrow] jonkeane closed pull request #10508: ARROW-13041: [C++] Ensure unary kernels zero-initialize data behind null entries

2021-06-14 Thread GitBox
jonkeane closed pull request #10508: URL: https://github.com/apache/arrow/pull/10508

[GitHub] [arrow] anthonylouisbsb commented on pull request #10195: ARROW-12595: [C++][Gandiva] Implement TO_HEX([binary] field), HEX, UNHEX and FROM_HEX([string]field] functions

2021-06-14 Thread GitBox
anthonylouisbsb commented on pull request #10195: URL: https://github.com/apache/arrow/pull/10195#issuecomment-859521494 @projjal The functions related to `HEX` and `UNHEX` were added inside this PR too, so I think you need to review the PR again.

[GitHub] [arrow-datafusion] andygrove commented on a change in pull request #541: WIP: ShuffleReaderExec now supports multiple locations per partition

2021-06-14 Thread GitBox
andygrove commented on a change in pull request #541: URL: https://github.com/apache/arrow-datafusion/pull/541#discussion_r650138952 ## File path: ballista/rust/client/src/context.rs ## @@ -74,32 +71,6 @@ impl BallistaContextState { } } -struct WrappedStream { -

[GitHub] [arrow-datafusion] edrevo commented on a change in pull request #541: ShuffleReaderExec now supports multiple locations per partition

2021-06-14 Thread GitBox
edrevo commented on a change in pull request #541: URL: https://github.com/apache/arrow-datafusion/pull/541#discussion_r650183206 ## File path: ballista/rust/core/src/execution_plans/shuffle_reader.rs ## @@ -86,23 +87,18 @@ impl ExecutionPlan for ShuffleReaderExec {

[GitHub] [arrow-rs] Jimexist opened a new pull request #448: Use partition for bool sort

2021-06-14 Thread GitBox
Jimexist opened a new pull request #448: URL: https://github.com/apache/arrow-rs/pull/448 # Which issue does this PR close? Closes #447 # Rationale for this change # What changes are included in this PR? # Are there any user-facing

[GitHub] [arrow-datafusion] andygrove opened a new issue #540: Ballista ShuffleReaderExec should be able to read from multiple locations per partition

2021-06-14 Thread GitBox
andygrove opened a new issue #540: URL: https://github.com/apache/arrow-datafusion/issues/540 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** As a step towards implementing true shuffle, we need `ShuffleReaderExec` to be able to

[GitHub] [arrow] nealrichardson commented on a change in pull request #10507: ARROW-13022: [R] bindings for lubridate's year, isoyear, quarter, month, day, wday, yday, isoweek, minute, and second func

2021-06-14 Thread GitBox
nealrichardson commented on a change in pull request #10507: URL: https://github.com/apache/arrow/pull/10507#discussion_r650076060 ## File path: r/R/expression.R ## @@ -28,8 +28,17 @@ # stringr spellings of those "str_length" = "utf8_length", "str_to_lower" =

[GitHub] [arrow] kou closed pull request #10515: ARROW-13030: [CI][Go] Setup Arm64 golang CI

2021-06-14 Thread GitBox
kou closed pull request #10515: URL: https://github.com/apache/arrow/pull/10515

[GitHub] [arrow] github-actions[bot] commented on pull request #10517: ARROW-13050: [C++][Gandiva] Implement SPACE Hive function on Gandiva

2021-06-14 Thread GitBox
github-actions[bot] commented on pull request #10517: URL: https://github.com/apache/arrow/pull/10517#issuecomment-859490703

[GitHub] [arrow] AlenkaF commented on a change in pull request #10519: ARROW-12867: [R] Bindings for abs()

2021-06-14 Thread GitBox
AlenkaF commented on a change in pull request #10519: URL: https://github.com/apache/arrow/pull/10519#discussion_r650068914 ## File path: r/R/dplyr-functions.R ## @@ -108,6 +108,10 @@ nse_funcs$is.infinite <- function(x) { is_inf & !nse_funcs$is.na(is_inf) }

[GitHub] [arrow-rs] codecov-commenter commented on pull request #449: remove clippy unnecessary wraps in cast kernel

2021-06-14 Thread GitBox
codecov-commenter commented on pull request #449: URL: https://github.com/apache/arrow-rs/pull/449#issuecomment-859619048 #

[GitHub] [arrow] github-actions[bot] commented on pull request #10521: ARROW-13065: [Packaging][RPM] Add missing required LZ4 version information

2021-06-14 Thread GitBox
github-actions[bot] commented on pull request #10521: URL: https://github.com/apache/arrow/pull/10521#issuecomment-859921976

[GitHub] [arrow-datafusion] viirya commented on issue #531: `cargo build` cannot build the project

2021-06-14 Thread GitBox
viirya commented on issue #531: URL: https://github.com/apache/arrow-datafusion/issues/531#issuecomment-859727700

[GitHub] [arrow] kou opened a new pull request #10514: ARROW-13045: [Packaging][RPM][deb] Don't install system utf8proc if it's old

2021-06-14 Thread GitBox
kou opened a new pull request #10514: URL: https://github.com/apache/arrow/pull/10514 See also: * #10477 30f52a202d0a2f6393366ea1e4a8e5182077c72a * ARROW-13002

[GitHub] [arrow] AlenkaF opened a new pull request #10519: ARROW-12867: [R] Bindings for abs()

2021-06-14 Thread GitBox
AlenkaF opened a new pull request #10519: URL: https://github.com/apache/arrow/pull/10519

[GitHub] [arrow] github-actions[bot] commented on pull request #10515: ARROW-13030: [CI][Go] Setup Arm64 golang CI

2021-06-14 Thread GitBox
github-actions[bot] commented on pull request #10515: URL: https://github.com/apache/arrow/pull/10515#issuecomment-859308213 https://issues.apache.org/jira/browse/ARROW-13030

[GitHub] [arrow-datafusion] Dandandan closed issue #131: Implement hash partitioning

2021-06-14 Thread GitBox
Dandandan closed issue #131: URL: https://github.com/apache/arrow-datafusion/issues/131

[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #539: Refactor hash aggregates's planner building code

2021-06-14 Thread GitBox
Jimexist commented on a change in pull request #539: URL: https://github.com/apache/arrow-datafusion/pull/539#discussion_r649920230 ## File path: datafusion/src/physical_plan/mod.rs ## @@ -343,7 +343,8 @@ pub enum Partitioning { RoundRobinBatch(usize), /// Allocate

[GitHub] [arrow-datafusion] edrevo commented on a change in pull request #532: reuse datafusion physical planner in ballista building from protobuf

2021-06-14 Thread GitBox
edrevo commented on a change in pull request #532: URL: https://github.com/apache/arrow-datafusion/pull/532#discussion_r649800622 ## File path: datafusion/src/physical_plan/windows.rs ## @@ -61,24 +63,27 @@ pub struct WindowAggExec { /// Create a physical expression for

[GitHub] [arrow] edponce commented on issue #10502: AttributeError: module 'pyarrow.lib' has no attribute '_Weakrefable'

2021-06-14 Thread GitBox
edponce commented on issue #10502: URL: https://github.com/apache/arrow/issues/10502#issuecomment-859270240

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #10507: ARROW-13022: [R] bindings for lubridate's year, isoyear, quarter, month, day, wday, yday, isoweek, minute, and second

2021-06-14 Thread GitBox
jorisvandenbossche commented on a change in pull request #10507: URL: https://github.com/apache/arrow/pull/10507#discussion_r649723619 ## File path: r/R/dplyr-functions.R ## @@ -442,3 +442,37 @@ nse_funcs$strptime <- function(x, format = "%Y-%m-%d %H:%M:%S", tz = NULL, unit

[GitHub] [arrow-datafusion] andygrove merged pull request #535: Make BallistaContext::collect streaming

2021-06-14 Thread GitBox
andygrove merged pull request #535: URL: https://github.com/apache/arrow-datafusion/pull/535

[GitHub] [arrow-rs] Jimexist opened a new issue #446: sort kernel has a lot of unnecessary wrapping

2021-06-14 Thread GitBox
Jimexist opened a new issue #446: URL: https://github.com/apache/arrow-rs/issues/446 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

[GitHub] [arrow-rs] Jimexist opened a new issue #447: sort kernel boolean sort can be O(n)

2021-06-14 Thread GitBox
Jimexist opened a new issue #447: URL: https://github.com/apache/arrow-rs/issues/447 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

[GitHub] [arrow-datafusion] Dandandan edited a comment on issue #299: Support window functions with PARTITION BY clause

2021-06-14 Thread GitBox
Dandandan edited a comment on issue #299: URL: https://github.com/apache/arrow-datafusion/issues/299#issuecomment-859516691

[GitHub] [arrow] jpedroantunes opened a new pull request #10516: ARROW-13049: [C++][Gandiva] Implement BIN Hive function on Gandiva

2021-06-14 Thread GitBox
jpedroantunes opened a new pull request #10516: URL: https://github.com/apache/arrow/pull/10516 Implement BIN Hive function on Gandiva

[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #532: reuse datafusion physical planner in ballista building from protobuf

2021-06-14 Thread GitBox
Jimexist commented on a change in pull request #532: URL: https://github.com/apache/arrow-datafusion/pull/532#discussion_r649843237 ## File path: datafusion/src/physical_plan/windows.rs ## @@ -61,24 +63,27 @@ pub struct WindowAggExec { /// Create a physical expression for

[GitHub] [arrow-rs] Jimexist opened a new pull request #449: remove clippy unnecessary wraps

2021-06-14 Thread GitBox
Jimexist opened a new pull request #449: URL: https://github.com/apache/arrow-rs/pull/449

[GitHub] [arrow] github-actions[bot] commented on pull request #10518: ARROW-13052: [C++][Gandiva] Implements REGEXP_EXTRACT function

2021-06-14 Thread GitBox
github-actions[bot] commented on pull request #10518: URL: https://github.com/apache/arrow/pull/10518#issuecomment-859529990 https://issues.apache.org/jira/browse/ARROW-13052

[GitHub] [arrow-rs] Jimexist closed pull request #445: remove unnecessary wraps in sort

2021-06-14 Thread GitBox
Jimexist closed pull request #445: URL: https://github.com/apache/arrow-rs/pull/445

[GitHub] [arrow-datafusion] andygrove commented on pull request #543: Ballista: Implement map-side shuffle

2021-06-14 Thread GitBox
andygrove commented on pull request #543: URL: https://github.com/apache/arrow-datafusion/pull/543#issuecomment-859939472 @edrevo fyi

[GitHub] [arrow] domoritz closed pull request #10500: ARROW-13031: [JS] Support arm in closure compiler on macOS

2021-06-14 Thread GitBox
domoritz closed pull request #10500: URL: https://github.com/apache/arrow/pull/10500

[GitHub] [arrow] kiszk commented on pull request #10501: ARROW-13032: [Java] Update guava version

2021-06-14 Thread GitBox
kiszk commented on pull request #10501: URL: https://github.com/apache/arrow/pull/10501#issuecomment-859690248 Is there any reason not to use the latest version https://mvnrepository.com/artifact/com.google.guava/guava/30.1.1-jre ?

[GitHub] [arrow] github-actions[bot] commented on pull request #10519: ARROW-12867: [R] Bindings for abs()

2021-06-14 Thread GitBox
github-actions[bot] commented on pull request #10519: URL: https://github.com/apache/arrow/pull/10519#issuecomment-859534414 https://issues.apache.org/jira/browse/ARROW-12867

[GitHub] [arrow] kou commented on pull request #10510: ARROW-13043: [GLib][Ruby] Add GArrowEqualOptions

2021-06-14 Thread GitBox
kou commented on pull request #10510: URL: https://github.com/apache/arrow/pull/10510#issuecomment-859913376 +1

[GitHub] [arrow] nirandaperera commented on pull request #10487: ARROW-13010: [C++][Compute] Support outputting to slices from kleene kernels

2021-06-14 Thread GitBox
nirandaperera commented on pull request #10487: URL: https://github.com/apache/arrow/pull/10487#issuecomment-859187060 @github-actions autotune

[GitHub] [arrow] jonkeane closed pull request #10506: ARROW-13039: [R] Fix error message handling

2021-06-14 Thread GitBox
jonkeane closed pull request #10506: URL: https://github.com/apache/arrow/pull/10506

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #541: ShuffleReaderExec now supports multiple locations per partition

2021-06-14 Thread GitBox
alamb commented on a change in pull request #541: URL: https://github.com/apache/arrow-datafusion/pull/541#discussion_r650205163 ## File path: ballista/rust/core/src/execution_plans/shuffle_reader.rs ## @@ -86,23 +87,18 @@ impl ExecutionPlan for ShuffleReaderExec {
