Re: [I] Incorrect like results for pattern starting/ending with `%` percent and containing escape characters [arrow-rs]

2024-11-08 Thread via GitHub
tustvold closed issue #6702: Incorrect like results for pattern starting/ending with `%` percent and containing escape characters URL: https://github.com/apache/arrow-rs/issues/6702 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] Fix LIKE with escapes [arrow-rs]

2024-11-08 Thread via GitHub
tustvold merged PR #6703: URL: https://github.com/apache/arrow-rs/pull/6703

Re: [PR] Implement logical_null_count for more array types [arrow-rs]

2024-11-08 Thread via GitHub
tustvold commented on PR #6704: URL: https://github.com/apache/arrow-rs/pull/6704#issuecomment-2466105016 FWIW the default implementation doesn't allocate, it just increments a reference count, but this is largely harmless
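
The distinction drawn in the comment above (cloning a reference-counted handle versus allocating) can be illustrated with the standard library alone. This is a minimal sketch, not arrow-rs code: `SharedBits` and `clone_shares_allocation` are hypothetical names invented for illustration, and the assumption that the buffer in question is shared through an `Arc`-like handle is mine, not something the thread preview states.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a shared validity buffer; not an arrow-rs type.
#[derive(Debug)]
struct SharedBits {
    bytes: Vec<u8>,
}

/// Clones the handle and reports whether the clone points at the same
/// allocation, plus the strong count observed while the clone is alive.
fn clone_shares_allocation(original: &Arc<SharedBits>) -> (bool, usize) {
    // Arc::clone only bumps the reference count; the Vec is not copied.
    let copy = Arc::clone(original);
    (Arc::ptr_eq(original, &copy), Arc::strong_count(original))
}

fn main() {
    let bits = Arc::new(SharedBits { bytes: vec![0b1010_1010; 1024] });
    let (same, count) = clone_shares_allocation(&bits);
    println!("buffer length: {}", bits.bytes.len());
    println!("same allocation: {same}, strong count while cloned: {count}");
    // The temporary clone has been dropped, so the count is back to one.
    println!("strong count afterwards: {}", Arc::strong_count(&bits));
}
```

The clone is cheap because only the counter changes, which is consistent with the remark that the overhead is largely harmless.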

Re: [PR] Implement logical_null_count for more array types [arrow-rs]

2024-11-08 Thread via GitHub
tustvold merged PR #6704: URL: https://github.com/apache/arrow-rs/pull/6704

Re: [I] [GLib] Add GArrowStringViewDataType [arrow]

2024-11-08 Thread via GitHub
kou commented on issue #44686: URL: https://github.com/apache/arrow/issues/44686#issuecomment-2466080937 Issue resolved by pull request 44687 https://github.com/apache/arrow/pull/44687

Re: [PR] GH-44686: [GLib] Add GArrowStringViewDataType [arrow]

2024-11-08 Thread via GitHub
kou merged PR #44687: URL: https://github.com/apache/arrow/pull/44687

Re: [PR] GH-44686: [GLib] Add GArrowStringViewDataType [arrow]

2024-11-08 Thread via GitHub
github-actions[bot] commented on PR #44687: URL: https://github.com/apache/arrow/pull/44687#issuecomment-2466066037 :warning: GitHub issue #44686 **has been automatically assigned in GitHub** to PR creator.

Re: [I] field_to_json() in arrow_integration_test/ field.rs does not serialize fields metadata [arrow-rs]

2024-11-08 Thread via GitHub
pshampanier commented on issue #6700: URL: https://github.com/apache/arrow-rs/issues/6700#issuecomment-2466068243 Not lost, just omitted at serialization: https://github.com/apache/arrow-rs/blob/0e9abcd69eedb4080f74e0631ca3cf065cf6553e/arrow-integration-test/src/field.rs#L266-L296

Re: [I] [C++] Improve error handling for hash table merges [arrow]

2024-11-08 Thread via GitHub
helloitsheqing commented on issue #32381: URL: https://github.com/apache/arrow/issues/32381#issuecomment-2466067759 Will do, thank you!

[PR] GH-44686: [GLib] Add GArrowStringViewDataType [arrow]

2024-11-08 Thread via GitHub
hiroyuki-sato opened a new pull request, #44687: URL: https://github.com/apache/arrow/pull/44687 ### Rationale for this change The `arrow::StringViewType` has been introduced. GLib needs to be implemented as the `GArrowStringViewDataType`. ### What changes

Re: [PR] GH-44667: [Archery] Suppress pull/push progress logs [arrow]

2024-11-08 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #44669: URL: https://github.com/apache/arrow/pull/44669#issuecomment-2466066118 After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit f28ba4459faf7665492fe365bcd78daf533b2702. There were no

Re: [PR] GH-44010: [C++] Add `arrow::RecordBatch::MakeStatisticsArray()` [arrow]

2024-11-08 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #44252: URL: https://github.com/apache/arrow/pull/44252#issuecomment-2466038998 After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit d748acee35ecaf88cf6191048c2cac43007a76b7. There were no

Re: [PR] GH-44393: [C++][Compute] Swizzle vector functions [arrow]

2024-11-08 Thread via GitHub
zanmato1984 commented on code in PR #44394: URL: https://github.com/apache/arrow/pull/44394#discussion_r1835235184 ## cpp/src/arrow/compute/api_vector.h: ## @@ -257,6 +257,38 @@ class ARROW_EXPORT ListFlattenOptions : public FunctionOptions { bool recursive = false; }; +/

Re: [PR] GH-44393: [C++][Compute] Swizzle vector functions [arrow]

2024-11-08 Thread via GitHub
zanmato1984 commented on code in PR #44394: URL: https://github.com/apache/arrow/pull/44394#discussion_r1835234475 ## cpp/src/arrow/compute/kernels/codegen_internal.h: ## @@ -1037,8 +1037,9 @@ ArrayKernelExec GenerateFloatingPoint(detail::GetTypeId get_id) { // Generate a kern

Re: [PR] GH-43951: [CI][Python] Use GitHub Packages for vcpkg cache [arrow]

2024-11-08 Thread via GitHub
kou commented on PR #44644: URL: https://github.com/apache/arrow/pull/44644#issuecomment-2465993294 @github-actions crossbow submit java-jars

Re: [I] [C++] Improve error handling for hash table merges [arrow]

2024-11-08 Thread via GitHub
kou commented on issue #32381: URL: https://github.com/apache/arrow/issues/32381#issuecomment-2465995988 Nobody is working on this. Could you try this?

Re: [PR] GH-43951: [CI][Python] Use GitHub Packages for vcpkg cache [arrow]

2024-11-08 Thread via GitHub
github-actions[bot] commented on PR #44644: URL: https://github.com/apache/arrow/pull/44644#issuecomment-2465994065 Revision: 63d255c9d3f52c1f4e40da434d89713cfb9749ef Submitted crossbow builds: [ursacomputing/crossbow @ actions-267447c157](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-44607: [C++][Dev] Update bundled Thrift, update mirrors to use CDN [arrow]

2024-11-08 Thread via GitHub
github-actions[bot] commented on PR #44685: URL: https://github.com/apache/arrow/pull/44685#issuecomment-2465978258 Revision: 7173f310925a6daecc49a6c90b0c3a677dea7f9c Submitted crossbow builds: [ursacomputing/crossbow @ actions-c90acec2ee](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-44607: [C++][Dev] Update bundled Thrift, update mirrors to use CDN [arrow]

2024-11-08 Thread via GitHub
github-actions[bot] commented on PR #44685: URL: https://github.com/apache/arrow/pull/44685#issuecomment-2465979524 Revision: 77d7c31c56f65d139fa2a669ab706ff1b549232c Submitted crossbow builds: [ursacomputing/crossbow @ actions-da415e2d2d](https://github.com/ursacomputing/crossbow/bra

[PR] GH-44607: [C++][Dev] Update bundled Thrift, update mirrors to use CDN [arrow]

2024-11-08 Thread via GitHub
amoeba opened a new pull request, #44685: URL: https://github.com/apache/arrow/pull/44685 ### Rationale for this change Builds with bundled Thrift could fail because all of the download mirrors we had set for it are now offline. Since those mirror URLs were added, ASF has put up a CD

Re: [PR] GH-44607: [C++][Dev] Update bundled Thrift, update mirrors to use CDN [arrow]

2024-11-08 Thread via GitHub
amoeba commented on PR #44685: URL: https://github.com/apache/arrow/pull/44685#issuecomment-2465978597 @github-actions crossbow submit -g cpp

Re: [PR] GH-44607: [C++][Dev] Update bundled Thrift, update mirrors to use CDN [arrow]

2024-11-08 Thread via GitHub
amoeba commented on PR #44685: URL: https://github.com/apache/arrow/pull/44685#issuecomment-2465977252 @github-actions crossbow submit test-r-linux-as-cran

Re: [PR] Prevent FlightData overflowing max size limit whenever possible. [arrow-rs]

2024-11-08 Thread via GitHub
itsjunetime commented on code in PR #6690: URL: https://github.com/apache/arrow-rs/pull/6690#discussion_r1835162135 ## arrow-buffer/src/buffer/immutable.rs: ## @@ -261,11 +261,11 @@ impl Buffer { } /// Returns a slice of this buffer starting at a certain bit offset.

Re: [PR] Fix Buffer::bit_slice losing length with byte-aligned offsets [arrow-rs]

2024-11-08 Thread via GitHub
itsjunetime commented on PR #6707: URL: https://github.com/apache/arrow-rs/pull/6707#issuecomment-2465907322 Ah, it looks like `cargo msrv find` tries to find the minimum available version that would work for _all_ the crates in your workspace, including e.g. the parquet bin target and its

Re: [PR] Fix LIKE with escapes [arrow-rs]

2024-11-08 Thread via GitHub
alamb commented on PR #6703: URL: https://github.com/apache/arrow-rs/pull/6703#issuecomment-2465903188 Benchmark results (basically no change in my opinion): ``` ++ critcmp master findepi_fix-like-with-escapes-792c56 group findepi_fix

[PR] Fix Buffer::bit_slice losing length with byte-aligned offsets [arrow-rs]

2024-11-08 Thread via GitHub
itsjunetime opened a new pull request, #6707: URL: https://github.com/apache/arrow-rs/pull/6707 # Which issue does this PR close? A part of #3478; necessary for #6690, which closes the aforementioned issue. # Rationale for this change If `bit_slice` is called with a given

[PR] feat(csharp/src/Drivers/Apache): add connect and query timeout options - unimplemented [arrow-adbc]

2024-11-08 Thread via GitHub
birschick-bq opened a new pull request, #2312: URL: https://github.com/apache/arrow-adbc/pull/2312 Adds options for command and query timeout | Property | Description | Default | | :--- | :---| :---| | `adbc.spark.connect_timeout_ms` |

Re: [PR] GH-44674: [R] Fix R CMD check failure with dev testthat [arrow]

2024-11-08 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #44675: URL: https://github.com/apache/arrow/pull/44675#issuecomment-2465871866 After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit ef830524e8bb7ec6f9e74f3a4534bea8690ffc52. There were no

Re: [I] `with_unsigned_payload` shouldn't generate payload hash [arrow-rs]

2024-11-08 Thread via GitHub
tustvold closed issue #6697: `with_unsigned_payload` shouldn't generate payload hash URL: https://github.com/apache/arrow-rs/issues/6697

Re: [PR] check sign_payload instead of skip_signature before computing checksum [arrow-rs]

2024-11-08 Thread via GitHub
tustvold merged PR #6698: URL: https://github.com/apache/arrow-rs/pull/6698

Re: [PR] GH-43683: [Python] Use pandas StringDtype when enabled (pandas 3+) [arrow]

2024-11-08 Thread via GitHub
WillAyd commented on code in PR #44195: URL: https://github.com/apache/arrow/pull/44195#discussion_r1835125599 ## python/pyarrow/tests/test_pandas.py: ## @@ -4523,9 +4550,11 @@ def test_metadata_compat_range_index_pre_0_12(): gen_name_1 = '__index_level_1__' # Case 1

Re: [PR] GH-43683: [Python] Use pandas StringDtype when enabled (pandas 3+) [arrow]

2024-11-08 Thread via GitHub
jorisvandenbossche commented on code in PR #44195: URL: https://github.com/apache/arrow/pull/44195#discussion_r1835117672 ## python/pyarrow/tests/test_pandas.py: ## @@ -4523,9 +4550,11 @@ def test_metadata_compat_range_index_pre_0_12(): gen_name_1 = '__index_level_1__'

Re: [PR] Prevent FlightData overflowing max size limit whenever possible. [arrow-rs]

2024-11-08 Thread via GitHub
itsjunetime commented on code in PR #6690: URL: https://github.com/apache/arrow-rs/pull/6690#discussion_r1835114076 ## arrow-ipc/src/writer.rs: ## @@ -1414,44 +1314,73 @@ fn get_buffer_element_width(spec: &BufferSpec) -> usize { } } -/// Common functionality for re-enco

Re: [PR] GH-43951: [CI][Python] Use GitHub Packages for vcpkg cache [arrow]

2024-11-08 Thread via GitHub
github-actions[bot] commented on PR #44644: URL: https://github.com/apache/arrow/pull/44644#issuecomment-2465840095 Revision: c75a11773c5d03194be4524a9ef67e930942b4d9 Submitted crossbow builds: [ursacomputing/crossbow @ actions-422eaa7713](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-43951: [CI][Python] Use GitHub Packages for vcpkg cache [arrow]

2024-11-08 Thread via GitHub
kou commented on PR #44644: URL: https://github.com/apache/arrow/pull/44644#issuecomment-2465837760 @github-actions crossbow submit java-jars wheel-manylinux-2014-cp39-cp39-*

Re: [PR] Prevent FlightData overflowing max size limit whenever possible. [arrow-rs]

2024-11-08 Thread via GitHub
itsjunetime commented on code in PR #6690: URL: https://github.com/apache/arrow-rs/pull/6690#discussion_r1835109766 ## arrow-flight/src/encode.rs: ## @@ -1485,93 +1481,62 @@ mod tests { hydrate_dictionaries(&batch, batch.schema()).expect("failed to optimize"); }

Re: [PR] GH-44667: [Archery] Suppress pull/push progress logs [arrow]

2024-11-08 Thread via GitHub
assignUser merged PR #44669: URL: https://github.com/apache/arrow/pull/44669

Re: [I] [C++] Improve error handling for hash table merges [arrow]

2024-11-08 Thread via GitHub
helloitsheqing commented on issue #32381: URL: https://github.com/apache/arrow/issues/32381#issuecomment-2465831184 Hi team! I was wondering if anyone is currently working on this issue and if not, if I am able to take a shot at it? Thank you.

Re: [I] [Archery] Suppress pull/push progress logs [arrow]

2024-11-08 Thread via GitHub
assignUser commented on issue #44667: URL: https://github.com/apache/arrow/issues/44667#issuecomment-2465819367 Issue resolved by pull request 44669 https://github.com/apache/arrow/pull/44669

Re: [PR] GH-44667: [Archery] Suppress pull/push progress logs [arrow]

2024-11-08 Thread via GitHub
assignUser commented on code in PR #44669: URL: https://github.com/apache/arrow/pull/44669#discussion_r1835100133 ## dev/tasks/python-wheels/github.linux.yml: ## @@ -126,10 +126,8 @@ jobs: {{ macros.github_upload_gemfury("arrow/python/repaired_wheels/*.whl")|indent }}

[PR] GH-44563: [C++] Add missing ARROW_IPC dependency to ARROW_COMPUTE [arrow]

2024-11-08 Thread via GitHub
kou opened a new pull request, #44684: URL: https://github.com/apache/arrow/pull/44684 ### Rationale for this change The compute module uses the IPC module features for option serialization. ### What changes are included in this PR? Enable the IPC module when the compute

Re: [PR] GH-44563: [C++] Add missing ARROW_IPC dependency to ARROW_COMPUTE [arrow]

2024-11-08 Thread via GitHub
kou commented on PR #44684: URL: https://github.com/apache/arrow/pull/44684#issuecomment-2465809132 @jleibs Could you try this?

Re: [PR] GH-44563: [C++] Add missing ARROW_IPC dependency to ARROW_COMPUTE [arrow]

2024-11-08 Thread via GitHub
github-actions[bot] commented on PR #44684: URL: https://github.com/apache/arrow/pull/44684#issuecomment-2465808092 :warning: GitHub issue #44563 **has been automatically assigned in GitHub** to PR creator.

Re: [PR] Undo run end filter performance regression [arrow-rs]

2024-11-08 Thread via GitHub
Dandandan commented on code in PR #6691: URL: https://github.com/apache/arrow-rs/pull/6691#discussion_r1835084600 ## arrow-select/src/filter.rs: ## @@ -423,30 +423,30 @@ fn filter_array(values: &dyn Array, predicate: &FilterPredicate) -> Result( -re_arr: &RunArray, -pre

Re: [PR] Prevent FlightData overflowing max size limit whenever possible. [arrow-rs]

2024-11-08 Thread via GitHub
itsjunetime commented on code in PR #6690: URL: https://github.com/apache/arrow-rs/pull/6690#discussion_r1835086786 ## arrow-flight/src/encode.rs: ## @@ -700,6 +682,9 @@ mod tests { use super::*; #[test] +// flight_data_from_arrow_batch is deprecated but does exa

Re: [PR] Prevent FlightData overflowing max size limit whenever possible. [arrow-rs]

2024-11-08 Thread via GitHub
itsjunetime commented on code in PR #6690: URL: https://github.com/apache/arrow-rs/pull/6690#discussion_r1835082863 ## arrow-flight/src/encode.rs: ## @@ -327,6 +327,10 @@ impl FlightDataEncoder { /// Encodes batch into one or more `FlightData` messages in self.queue

Re: [PR] Prevent FlightData overflowing max size limit whenever possible. [arrow-rs]

2024-11-08 Thread via GitHub
itsjunetime commented on code in PR #6690: URL: https://github.com/apache/arrow-rs/pull/6690#discussion_r1835081592 ## arrow-buffer/src/buffer/immutable.rs: ## @@ -261,11 +261,11 @@ impl Buffer { } /// Returns a slice of this buffer starting at a certain bit offset.

Re: [PR] Fix LIKE with escapes [arrow-rs]

2024-11-08 Thread via GitHub
findepi commented on code in PR #6703: URL: https://github.com/apache/arrow-rs/pull/6703#discussion_r1835071373 ## arrow-string/src/predicate.rs: ## @@ -45,16 +45,12 @@ impl<'a> Predicate<'a> { pub fn like(pattern: &'a str) -> Result { if !contains_like_pattern(pat

Re: [I] [Python] How to use fs.FileSystem.from_uri with Azurite [arrow]

2024-11-08 Thread via GitHub
kou commented on issue #44682: URL: https://github.com/apache/arrow/issues/44682#issuecomment-2465750514 The feature was removed by #44220. Could you use an environment variable? `AZURE_PASSWORD` may work: https://learn.microsoft.com/en-us/python/api/azure-identity/azure.identity.e

Re: [PR] Add filter_kernel benchmark for run array [arrow-rs]

2024-11-08 Thread via GitHub
delamarch3 commented on PR #6706: URL: https://github.com/apache/arrow-rs/pull/6706#issuecomment-2465722398 The results seem quite severe compared to the others: ```text Benchmarking filter run array (kept 1/2): Warming up for 3. s Warning: Unable to complete 100 samples in 5.0s.

[PR] Add filter_kernel benchmark for run array [arrow-rs]

2024-11-08 Thread via GitHub
delamarch3 opened a new pull request, #6706: URL: https://github.com/apache/arrow-rs/pull/6706 # Which issue does this PR close? Related to https://github.com/apache/arrow-rs/pull/6691 and https://github.com/apache/arrow-rs/pull/6675 # Rationale for this change

Re: [I] [C++] Add a convenient function that converts `arrow::ArrayStatistics` to `arrow::Array` for the Arrow C data interface [arrow]

2024-11-08 Thread via GitHub
kou commented on issue #44010: URL: https://github.com/apache/arrow/issues/44010#issuecomment-2465709698 Issue resolved by pull request 44252 https://github.com/apache/arrow/pull/44252

Re: [PR] GH-44010: [C++] Add `arrow::RecordBatch::MakeStatisticsArray()` [arrow]

2024-11-08 Thread via GitHub
kou merged PR #44252: URL: https://github.com/apache/arrow/pull/44252

Re: [PR] GH-44667: [Archery] Suppress pull/push progress logs [arrow]

2024-11-08 Thread via GitHub
kou commented on code in PR #44669: URL: https://github.com/apache/arrow/pull/44669#discussion_r1835026044 ## dev/tasks/python-wheels/github.linux.yml: ## @@ -126,10 +126,8 @@ jobs: {{ macros.github_upload_gemfury("arrow/python/repaired_wheels/*.whl")|indent }} {{

Re: [PR] Support native S3 conditional writes [arrow-rs]

2024-11-08 Thread via GitHub
tustvold commented on PR #6682: URL: https://github.com/apache/arrow-rs/pull/6682#issuecomment-2465696376 https://github.com/apache/arrow-rs/issues/6596 tracks the next release

Re: [PR] Support native S3 conditional writes [arrow-rs]

2024-11-08 Thread via GitHub
criccomini commented on PR #6682: URL: https://github.com/apache/arrow-rs/pull/6682#issuecomment-2465655913 Amazing. Any idea when this might go out 🔥

Re: [PR] Fix LIKE with escapes [arrow-rs]

2024-11-08 Thread via GitHub
alamb commented on code in PR #6703: URL: https://github.com/apache/arrow-rs/pull/6703#discussion_r1834986481 ## arrow-string/src/predicate.rs: ## @@ -45,16 +45,12 @@ impl<'a> Predicate<'a> { pub fn like(pattern: &'a str) -> Result { if !contains_like_pattern(patte

Re: [PR] Undo run end filter performance regression [arrow-rs]

2024-11-08 Thread via GitHub
alamb commented on PR #6691: URL: https://github.com/apache/arrow-rs/pull/6691#issuecomment-2465625035 > I'll try to add one into `filter_kernels` Thank you -- if you could do so as a separate PR that would be most helpful (so it is easy to compare with these changes) 🙏

Re: [PR] Undo run end filter performance regression [arrow-rs]

2024-11-08 Thread via GitHub
delamarch3 commented on PR #6691: URL: https://github.com/apache/arrow-rs/pull/6691#issuecomment-2465621676 I'll try to add one into `filter_kernels`

Re: [I] MapArray Requires Values Array [arrow-rs]

2024-11-08 Thread via GitHub
etseidl commented on issue #1642: URL: https://github.com/apache/arrow-rs/issues/1642#issuecomment-2465588327 Bumping this in light of https://github.com/apache/parquet-format/pull/469. I think @nevi-me is correct. As [pointed out](https://github.com/apache/parquet-format/pull/469#discussio

Re: [PR] Undo run end filter performance regression [arrow-rs]

2024-11-08 Thread via GitHub
delamarch3 commented on PR #6691: URL: https://github.com/apache/arrow-rs/pull/6691#issuecomment-2465583647 @alamb Sorry, I wrote a separate benchmark for this but I didn't commit it, it's not consistent with the results it returns on each run so I thought it needed work but there was enoug

Re: [PR] Implement logical_null_count for more array types [arrow-rs]

2024-11-08 Thread via GitHub
findepi commented on PR #6704: URL: https://github.com/apache/arrow-rs/pull/6704#issuecomment-2465582540 I would consider it rude _not_ to tag you on a PR that's on a topic I know you're interested in and knowledgeable about.

Re: [PR] Undo run end filter performance regression [arrow-rs]

2024-11-08 Thread via GitHub
alamb commented on PR #6691: URL: https://github.com/apache/arrow-rs/pull/6691#issuecomment-2465571567 I apologize for being dense @delamarch3 -- but I spent a while trying to find the benchmarks you are running and I couldn't figure out which they were. Is it the `filter_kernels`?

Re: [PR] Implement logical_null_count for more array types [arrow-rs]

2024-11-08 Thread via GitHub
tustvold commented on PR #6704: URL: https://github.com/apache/arrow-rs/pull/6704#issuecomment-2465543730 Please can you stop tagging me on PRs you've filed literally moments ago, it is disruptive and rude. I will get to your PRs within a couple of days, but your incessant nagging is reall

Re: [PR] Return `BoxStream` with `'static` lifetime from `ObjectStore::list` [arrow-rs]

2024-11-08 Thread via GitHub
tustvold commented on PR #6619: URL: https://github.com/apache/arrow-rs/pull/6619#issuecomment-2465540638 This change makes sense to me, and is definitely something we should consider when looking to make a breaking release. However, I think we want to get a few more such changes lined up be

Re: [PR] Implement logical_null_count for more array types [arrow-rs]

2024-11-08 Thread via GitHub
findepi commented on PR #6704: URL: https://github.com/apache/arrow-rs/pull/6704#issuecomment-2465539732 cc @tustvold @alamb

Re: [PR] Return `BoxStream` with `'static` lifetime from `ObjectStore::list` [arrow-rs]

2024-11-08 Thread via GitHub
tustvold commented on code in PR #6619: URL: https://github.com/apache/arrow-rs/pull/6619#discussion_r1834904749 ## object_store/src/client/list.rs: ## @@ -44,37 +44,38 @@ pub(crate) trait ListClientExt { prefix: Option<&Path>, delimiter: bool, offset:

Re: [PR] Support Duration in JSON Reader [arrow-rs]

2024-11-08 Thread via GitHub
alamb commented on PR #6683: URL: https://github.com/apache/arrow-rs/pull/6683#issuecomment-2465509778 > Do you think the current PR can be merged as-is? We can then create a follow-up issue to discuss the other representations. I recommend we at least have the property that data writ

Re: [PR] Undo run end filter performance regression [arrow-rs]

2024-11-08 Thread via GitHub
alamb commented on PR #6691: URL: https://github.com/apache/arrow-rs/pull/6691#issuecomment-2465514372 I am running the benchmarks on this PR to verify. Thank you @delamarch3

Re: [PR] Return `BoxStream` with `'static` lifetime from `ObjectStore::list` [arrow-rs]

2024-11-08 Thread via GitHub
alamb commented on code in PR #6619: URL: https://github.com/apache/arrow-rs/pull/6619#discussion_r1834883065 ## object_store/src/client/list.rs: ## @@ -44,37 +44,38 @@ pub(crate) trait ListClientExt { prefix: Option<&Path>, delimiter: bool, offset: Op

Re: [PR] Handle primitive REPEATED field not contained in LIST annotated group [arrow-rs]

2024-11-08 Thread via GitHub
alamb commented on PR #6649: URL: https://github.com/apache/arrow-rs/pull/6649#issuecomment-2465493661 > Sorry for the late reply. I'm not sure whether we should fix reading or totally prohibit writing `repeated primitive fields without LIST annotation as a list type`. This is a gray area f

Re: [PR] Fix string view ILIKE checks with NULL values [arrow-rs]

2024-11-08 Thread via GitHub
alamb merged PR #6705: URL: https://github.com/apache/arrow-rs/pull/6705

Re: [PR] Fix string view LIKE checks with NULL values [arrow-rs]

2024-11-08 Thread via GitHub
alamb commented on code in PR #6662: URL: https://github.com/apache/arrow-rs/pull/6662#discussion_r1834866152 ## arrow-string/src/predicate.rs: ## @@ -146,6 +138,7 @@ impl<'a> Predicate<'a> { } Predicate::IStartsWithAscii(v) => { if let

Re: [PR] check sign_payload instead of skip_signature before computing checksum [arrow-rs]

2024-11-08 Thread via GitHub
mherrerarendon commented on code in PR #6698: URL: https://github.com/apache/arrow-rs/pull/6698#discussion_r1834850990 ## object_store/src/aws/client.rs: ## @@ -350,7 +350,7 @@ impl<'a> Request<'a> { } pub(crate) fn with_payload(mut self, payload: PutPayload) -> Self

Re: [PR] Return `BoxStream` with `'static` lifetime from `ObjectStore::list` [arrow-rs]

2024-11-08 Thread via GitHub
kylebarron commented on code in PR #6619: URL: https://github.com/apache/arrow-rs/pull/6619#discussion_r1834840644 ## object_store/src/client/list.rs: ## @@ -44,37 +44,38 @@ pub(crate) trait ListClientExt { prefix: Option<&Path>, delimiter: bool, offse

Re: [PR] Fix string view ILIKE checks with NULL values [arrow-rs]

2024-11-08 Thread via GitHub
findepi commented on PR #6705: URL: https://github.com/apache/arrow-rs/pull/6705#issuecomment-2465427583 @alamb ptal

Re: [PR] Fix string view LIKE checks with NULL values [arrow-rs]

2024-11-08 Thread via GitHub
findepi commented on code in PR #6662: URL: https://github.com/apache/arrow-rs/pull/6662#discussion_r1834837827 ## arrow-string/src/predicate.rs: ## @@ -146,6 +138,7 @@ impl<'a> Predicate<'a> { } Predicate::IStartsWithAscii(v) => { if l

Re: [PR] Fix LIKE with escapes [arrow-rs]

2024-11-08 Thread via GitHub
findepi commented on PR #6703: URL: https://github.com/apache/arrow-rs/pull/6703#issuecomment-2465395045 Rebased to resolve conflict with https://github.com/apache/arrow-rs/pull/6662

Re: [PR] Fix string view LIKE checks with NULL values [arrow-rs]

2024-11-08 Thread via GitHub
findepi commented on code in PR #6662: URL: https://github.com/apache/arrow-rs/pull/6662#discussion_r1834810237 ## arrow-string/src/predicate.rs: ## @@ -146,6 +138,7 @@ impl<'a> Predicate<'a> { } Predicate::IStartsWithAscii(v) => { if l

[PR] Implement logical_null_count for more array types [arrow-rs]

2024-11-08 Thread via GitHub
findepi opened a new pull request, #6704: URL: https://github.com/apache/arrow-rs/pull/6704 # Description Implement `Array::logical_null_count()` where it's easy to calculate answer without relying on the default implementation which allocates. # Which issue does this PR close?

Re: [PR] GH-44393: [C++][Compute] Swizzle vector functions [arrow]

2024-11-08 Thread via GitHub
mapleFU commented on code in PR #44394: URL: https://github.com/apache/arrow/pull/44394#discussion_r1834769765 ## cpp/src/arrow/compute/kernels/codegen_internal.h: ## @@ -1037,8 +1037,9 @@ ArrayKernelExec GenerateFloatingPoint(detail::GetTypeId get_id) { // Generate a kernel g

Re: [PR] Fix string view LIKE checks with NULL values [arrow-rs]

2024-11-08 Thread via GitHub
alamb commented on code in PR #6662: URL: https://github.com/apache/arrow-rs/pull/6662#discussion_r1834746604 ## arrow-string/src/predicate.rs: ## @@ -146,6 +138,7 @@ impl<'a> Predicate<'a> { } Predicate::IStartsWithAscii(v) => { if let

Re: [PR] Fix string view LIKE checks with NULL values [arrow-rs]

2024-11-08 Thread via GitHub
alamb merged PR #6662: URL: https://github.com/apache/arrow-rs/pull/6662

Re: [PR] [Parquet] Add BooleanArray based row selection [arrow-rs]

2024-11-08 Thread via GitHub
alamb commented on code in PR #6624: URL: https://github.com/apache/arrow-rs/pull/6624#discussion_r1834695119 ## parquet/src/arrow/arrow_reader/boolean_selection.rs: ## @@ -0,0 +1,314 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor licen

Re: [PR] Fix string view LIKE checks with NULL values [arrow-rs]

2024-11-08 Thread via GitHub
alamb commented on PR #6662: URL: https://github.com/apache/arrow-rs/pull/6662#issuecomment-2465282432 Here are my benchmark results. My conclusion is that there is some non-trivial variability in the benchmarks, but I don't think this PR does anything substantial ``` group

Re: [PR] GH-44393: [C++][Compute] Swizzle vector functions [arrow]

2024-11-08 Thread via GitHub
zanmato1984 commented on code in PR #44394: URL: https://github.com/apache/arrow/pull/44394#discussion_r1834747613 ## cpp/src/arrow/compute/api_vector.h: ## @@ -705,5 +737,52 @@ Result> PairwiseDiff(const Array& array, bool check_ove

Re: [PR] GH-43631: [C++] Add C++ implementation of Async C Data Interface [arrow]

2024-11-08 Thread via GitHub
bkietz commented on code in PR #44495: URL: https://github.com/apache/arrow/pull/44495#discussion_r1834704974 ## cpp/src/arrow/c/bridge.cc: ## @@ -2511,4 +2516,345 @@ Result> ImportDeviceChunkedArray( return ImportChunked(stream, mapper); } +namespace { + +class AsyncReco

Re: [PR] Prevent FlightData overflowing max size limit whenever possible. [arrow-rs]

2024-11-08 Thread via GitHub
alamb commented on code in PR #6690: URL: https://github.com/apache/arrow-rs/pull/6690#discussion_r1834572196 ## arrow-flight/src/encode.rs: ## @@ -327,6 +327,10 @@ impl FlightDataEncoder { /// Encodes batch into one or more `FlightData` messages in self.queue fn enc

Re: [PR] Fix string view LIKE checks with NULL values [arrow-rs]

2024-11-08 Thread via GitHub
alamb commented on code in PR #6662: URL: https://github.com/apache/arrow-rs/pull/6662#discussion_r1834712232 ## arrow-string/src/predicate.rs: ## @@ -116,10 +116,17 @@ impl<'a> Predicate<'a> { }), Predicate::Contains(finder) => { if le

Re: [PR] check sign_payload instead of skip_signature before computing checksum [arrow-rs]

2024-11-08 Thread via GitHub
andrebsguedes commented on code in PR #6698: URL: https://github.com/apache/arrow-rs/pull/6698#discussion_r1834707910 ## object_store/src/aws/client.rs: ## @@ -350,7 +350,7 @@ impl<'a> Request<'a> { } pub(crate) fn with_payload(mut self, payload: PutPayload) -> Self

Re: [PR] Return `BoxStream` with `'static` lifetime from `ObjectStore::list` [arrow-rs]

2024-11-08 Thread via GitHub
alamb commented on code in PR #6619: URL: https://github.com/apache/arrow-rs/pull/6619#discussion_r1834707027 ## object_store/src/client/list.rs: ## @@ -44,37 +44,38 @@ pub(crate) trait ListClientExt { prefix: Option<&Path>, delimiter: bool, offset: Op

Re: [PR] Make downcast macros hygenic (#6400) [arrow-rs]

2024-11-08 Thread via GitHub
alamb merged PR #6620: URL: https://github.com/apache/arrow-rs/pull/6620

Re: [PR] Make downcast macros hygenic (#6400) [arrow-rs]

2024-11-08 Thread via GitHub
alamb commented on PR #6620: URL: https://github.com/apache/arrow-rs/pull/6620#issuecomment-2465225792 Thanks again @tustvold and @crepererum

Re: [I] `downcast_primitive_array` and `downcast_dictionary_array` are not hygienic wrt imports [arrow-rs]

2024-11-08 Thread via GitHub
alamb closed issue #6400: `downcast_primitive_array` and `downcast_dictionary_array` are not hygienic wrt imports URL: https://github.com/apache/arrow-rs/issues/6400

Re: [PR] GH-44393: [C++][Compute] Swizzle vector functions [arrow]

2024-11-08 Thread via GitHub
pitrou commented on code in PR #44394: URL: https://github.com/apache/arrow/pull/44394#discussion_r1834697105 ## cpp/src/arrow/compute/api_vector.h: ## @@ -705,5 +737,52 @@ Result> PairwiseDiff(const Array& array, bool check_overflow

Re: [I] Need a mechanism to handle schema changes due to dictionary hydration in FlightSQL server implementations [arrow-rs]

2024-11-08 Thread via GitHub
alamb closed issue #6672: Need a mechanism to handle schema changes due to dictionary hydration in FlightSQL server implementations URL: https://github.com/apache/arrow-rs/issues/6672

Re: [PR] feat: expose known_schema from FlightDataEncoder [arrow-rs]

2024-11-08 Thread via GitHub
alamb commented on PR #6688: URL: https://github.com/apache/arrow-rs/pull/6688#issuecomment-2465201447 Thanks again

Re: [I] Need a mechanism to handle schema changes due to dictionary hydration in FlightSQL server implementations [arrow-rs]

2024-11-08 Thread via GitHub
alamb closed issue #6672: Need a mechanism to handle schema changes due to dictionary hydration in FlightSQL server implementations URL: https://github.com/apache/arrow-rs/issues/6672
