Re: [I] Building ADBC and flightsql driver on Windows [arrow-adbc]

2025-05-22 Thread via GitHub
lidavidm commented on issue #2846: URL: https://github.com/apache/arrow-adbc/issues/2846#issuecomment-2901152820 You're likely running into something like https://github.com/apache/arrow-adbc/issues/634#issuecomment-1589876749 I haven't managed to figure this out myself (it doesn't he

Re: [I] Implement VectorAppender of RunEndEncodedVector [arrow-java]

2025-05-22 Thread via GitHub
ViggoC commented on issue #762: URL: https://github.com/apache/arrow-java/issues/762#issuecomment-2901151672 Hi @lidavidm, I want to discuss a question before we start implementing it. Do you think it's valid for two adjacent runs to have the same value? e.g. | values | run_ends |

Re: [I] Implement VectorAppender of RunEndEncodedVector [arrow-java]

2025-05-22 Thread via GitHub
lidavidm commented on issue #762: URL: https://github.com/apache/arrow-java/issues/762#issuecomment-2901157951 As far as I can see, this is not prohibited by the spec, so keeping them is fine [1][2] [1]: https://arrow.apache.org/docs/format/Columnar.html#run-end-encoded-layout [2]
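
To make the question above concrete, here is a minimal pure-Python sketch (not the arrow-java `VectorAppender` API; `ree_decode` and `ree_get` are hypothetical helper names). It assumes the run-end encoded layout from the Arrow columnar format, where `run_ends` holds strictly increasing cumulative logical lengths. Under that assumption, two adjacent runs with the same value decode to the same logical sequence as a single merged run.

```python
from bisect import bisect_right


def ree_decode(values, run_ends):
    """Expand a run-end encoded pair into the full logical sequence."""
    out = []
    start = 0
    for value, end in zip(values, run_ends):
        # Each run covers logical indices [start, end).
        out.extend([value] * (end - start))
        start = end
    return out


def ree_get(values, run_ends, i):
    """Return logical element i without decoding the whole sequence."""
    length = run_ends[-1] if run_ends else 0
    if not 0 <= i < length:
        raise IndexError(i)
    # The run holding i is the first one whose run end is strictly greater than i.
    return values[bisect_right(run_ends, i)]


# Hypothetical data: the first two runs share the value 1.
split = ree_decode([1, 1, 2], [2, 5, 6])
# The same logical data with the adjacent equal runs merged.
merged = ree_decode([1, 2], [5, 6])
```

Both encodings describe the same logical data, so an appender that keeps adjacent equal runs is simpler, while one that merges them produces a more compact result; either choice reads back identically.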

Re: [PR] feat(c/driver/bigquery): add Google BigQuery support [arrow-adbc]

2025-05-22 Thread via GitHub
Benjamin-Philip commented on PR #1717: URL: https://github.com/apache/arrow-adbc/pull/1717#issuecomment-2901418231 Thanks. Will do. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

Re: [PR] arrow-select: add support for optimized concatenation of struct arrays [arrow-rs]

2025-05-22 Thread via GitHub
Dandandan commented on code in PR #7517: URL: https://github.com/apache/arrow-rs/pull/7517#discussion_r2102703599 ## arrow-select/src/concat.rs: ## @@ -231,6 +231,45 @@ fn concat_bytes(arrays: &[&dyn Array]) -> Result Result { +let mut len = 0; +let mut has_nulls = fals

Re: [PR] GH-43041: [C++][Python] Read/write Parquet BYTE_ARRAY as Large/View types directly [arrow]

2025-05-22 Thread via GitHub
wgtmac commented on code in PR #46532: URL: https://github.com/apache/arrow/pull/46532#discussion_r2102703797 ## cpp/src/parquet/arrow/schema_internal.cc: ## @@ -117,34 +118,67 @@ Result> MakeArrowTimestamp(const LogicalType& logical Result> FromByteArray( const LogicalTy

Re: [PR] arrow-select: add support for optimized concatenation of struct arrays [arrow-rs]

2025-05-22 Thread via GitHub
Dandandan commented on PR #7517: URL: https://github.com/apache/arrow-rs/pull/7517#issuecomment-2901437495 Thanks @asubiotto!

Re: [I] Add efficient concatenation of StructArrays [arrow-rs]

2025-05-22 Thread via GitHub
Dandandan closed issue #7516: Add efficient concatenation of StructArrays URL: https://github.com/apache/arrow-rs/issues/7516

Re: [PR] arrow-select: add support for optimized concatenation of struct arrays [arrow-rs]

2025-05-22 Thread via GitHub
Dandandan merged PR #7517: URL: https://github.com/apache/arrow-rs/pull/7517

Re: [PR] GH-46522: [C++][FlightRPC] Add Arrow Flight SQL ODBC driver [arrow]

2025-05-22 Thread via GitHub
jmao-denver commented on PR #40939: URL: https://github.com/apache/arrow/pull/40939#issuecomment-2901452970 Congratulations on the merge! Got an error when trying to build it on a Mac M2. ```shell ~/git/arrow/cpp/build main *1 ?1 ❯ cmake .. --preset ninja-debug-maximal ```

[I] [Question] Would contribution of logging wrapper and a "readahead" wrapper for ObjectStore be wanted? [arrow-rs-object-store]

2025-05-22 Thread via GitHub
m09526 opened a new issue, #380: URL: https://github.com/apache/arrow-rs-object-store/issues/380 **Which part is this question about** We have developed two `ObjectStore` wrapper implementations in a similar style to `LimitStore`, `PrefixStore`, etc. which we would like to contribute

Re: [PR] GH-46538: [CI][Packaging][AlmaLinux8] Ensure pip3 [arrow]

2025-05-22 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #46539: URL: https://github.com/apache/arrow/pull/46539#issuecomment-2901651352 After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit bbd9c7e8c6b7f46184e891900716911f9a20a12d. There were no

Re: [PR] GH-43132: [CI] Fix pre-commit Rat check [arrow]

2025-05-22 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #46541: URL: https://github.com/apache/arrow/pull/46541#issuecomment-2901660280 After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit aa5a484703f5f7bb6c0b45543e6d684e0b7e7beb. There were no

Re: [PR] feat(parquet/pqarrow): parallelize SeekToRow [arrow-go]

2025-05-22 Thread via GitHub
zeroshade merged PR #380: URL: https://github.com/apache/arrow-go/pull/380

Re: [PR] GH-52: Make RangeEqualsVisitor of RunEndEncodedVector more efficient [arrow-java]

2025-05-22 Thread via GitHub
ViggoC commented on PR #761: URL: https://github.com/apache/arrow-java/pull/761#issuecomment-2901073782 @lidavidm Thank you. I applied Spotless, but most of the checks failed; it seems they were terminated for some reason.

Re: [PR] GH-43041: [C++][Python] Read/write Parquet BYTE_ARRAY as Large/View types directly [arrow]

2025-05-22 Thread via GitHub
pitrou commented on code in PR #46532: URL: https://github.com/apache/arrow/pull/46532#discussion_r2102449344 ## cpp/src/parquet/arrow/schema_internal.cc: ## @@ -117,34 +118,67 @@ Result> MakeArrowTimestamp(const LogicalType& logical Result> FromByteArray( const LogicalTy

Re: [PR] GH-25025: [C++] Move non core compute kernels into separate shared library [arrow]

2025-05-22 Thread via GitHub
raulcd commented on code in PR #46261: URL: https://github.com/apache/arrow/pull/46261#discussion_r2102544207 ## c_glib/arrow-glib/compute.cpp: ## @@ -37,6 +37,9 @@ #include #include +// Initialize the compute library and register compute kernels. +auto compute_init_status

Re: [I] [Python] usability improvements for a "minimal" pyarrow [arrow]

2025-05-22 Thread via GitHub
WillAyd commented on issue #38536: URL: https://github.com/apache/arrow/issues/38536#issuecomment-2901077741 In lieu of adding more pyarrow- variants to pypi, another option would be to leverage the work done for adding meson-python support. If that goes live, we could document how users ca

Re: [I] [Java] Invocation of close method from ArrowFlightConnection throws a SQLException when using catalog as parameter [arrow-java]

2025-05-22 Thread via GitHub
lidavidm commented on issue #66: URL: https://github.com/apache/arrow-java/issues/66#issuecomment-2901177021 PRs are welcome.

Re: [PR] feat(c/driver/bigquery): add Google BigQuery support [arrow-adbc]

2025-05-22 Thread via GitHub
lidavidm commented on PR #1717: URL: https://github.com/apache/arrow-adbc/pull/1717#issuecomment-2901189152 @Benjamin-Philip you may want to post on the dev@ mailing list (see https://arrow.apache.org/community/) where the different maintainers can chime in. IMO, an Erlang implementation wo

Re: [PR] GH-45653: [Python] Scalar subclasses should implement Python protocols [arrow]

2025-05-22 Thread via GitHub
pitrou commented on code in PR #45818: URL: https://github.com/apache/arrow/pull/45818#discussion_r2102421005 ## python/pyarrow/scalar.pxi: ## @@ -238,6 +241,9 @@ cdef class UInt8Scalar(Scalar): cdef CUInt8Scalar* sp = self.wrapped.get() return sp.value if sp.

Re: [PR] GH-52: Make RangeEqualsVisitor of RunEndEncodedVector more efficient [arrow-java]

2025-05-22 Thread via GitHub
lidavidm commented on PR #761: URL: https://github.com/apache/arrow-java/pull/761#issuecomment-2901222142 Retrying - I think GHA had an outage

[PR] GH-46556: [GLib] Add GArrowUuidDataType [arrow]

2025-05-22 Thread via GitHub
hiroyuki-sato opened a new pull request, #46558: URL: https://github.com/apache/arrow/pull/46558 ### Rationale for this change GLib should be able to use `arrow::extension::UuidType`. ### What changes are included in this PR? Add `GArrowUuidDataType` ### Are these c

Re: [PR] GH-52: Make RangeEqualsVisitor of RunEndEncodedVector more efficient [arrow-java]

2025-05-22 Thread via GitHub
lidavidm merged PR #761: URL: https://github.com/apache/arrow-java/pull/761

Re: [PR] GH-46520: [Docs] Fix variety of warnings and errors in the docs build [arrow]

2025-05-22 Thread via GitHub
amoeba commented on PR #46521: URL: https://github.com/apache/arrow/pull/46521#issuecomment-2901481452 @AlenkaF @thisisnic would either of you like to review?

Re: [PR] GH-46520: [Docs] Fix variety of warnings and errors in the docs build [arrow]

2025-05-22 Thread via GitHub
AlenkaF commented on PR #46521: URL: https://github.com/apache/arrow/pull/46521#issuecomment-2901511743 Thanks so much for working on this, much appreciated! I plan to review but cannot today. Will try tomorrow 🤞

[PR] feat(csharp/src/Apache.Arrow.Adbc): OpenTelemetry tracing baseline [arrow-adbc]

2025-05-22 Thread via GitHub
birschick-bq opened a new pull request, #2847: URL: https://github.com/apache/arrow-adbc/pull/2847 (no comment)

Re: [PR] GH-25025: [C++] Move non core compute kernels into separate shared library [arrow]

2025-05-22 Thread via GitHub
raulcd commented on code in PR #46261: URL: https://github.com/apache/arrow/pull/46261#discussion_r2102906188 ## cpp/src/arrow/compute/codegen_internal.h: ## @@ -0,0 +1,31 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreement

Re: [PR] API for reading Variant data and metadata [arrow-rs]

2025-05-22 Thread via GitHub
mkarbo commented on code in PR #7535: URL: https://github.com/apache/arrow-rs/pull/7535#discussion_r2102975752 ## parquet-variant/src/variant.rs: ## @@ -0,0 +1,321 @@ +use std::{borrow::Cow, ops::Index}; + +use crate::decoder::{self, get_variant_type}; +use arrow_schema::ArrowEr

Re: [PR] API for reading Variant data and metadata [arrow-rs]

2025-05-22 Thread via GitHub
mkarbo commented on code in PR #7535: URL: https://github.com/apache/arrow-rs/pull/7535#discussion_r2102988142 ## parquet-variant/src/decoder.rs: ## @@ -0,0 +1,199 @@ +// NOTE: Largely based on the implementation of @PinkCrow007 in https://github.com/apache/arrow-rs/pull/7452 +

Re: [PR] API for reading Variant data and metadata [arrow-rs]

2025-05-22 Thread via GitHub
mkarbo commented on code in PR #7535: URL: https://github.com/apache/arrow-rs/pull/7535#discussion_r2102987595 ## parquet-variant/src/decoder.rs: ## @@ -0,0 +1,199 @@ +// NOTE: Largely based on the implementation of @PinkCrow007 in https://github.com/apache/arrow-rs/pull/7452 +

Re: [PR] API for reading Variant data and metadata [arrow-rs]

2025-05-22 Thread via GitHub
mkarbo commented on code in PR #7535: URL: https://github.com/apache/arrow-rs/pull/7535#discussion_r2102992366 ## parquet-variant/src/decoder.rs: ## @@ -0,0 +1,199 @@ +// NOTE: Largely based on the implementation of @PinkCrow007 in https://github.com/apache/arrow-rs/pull/7452 +

Re: [PR] Move Selection logic into ReadPlan builder [arrow-rs]

2025-05-22 Thread via GitHub
alamb commented on code in PR #7537: URL: https://github.com/apache/arrow-rs/pull/7537#discussion_r2102821625 ## parquet/src/arrow/arrow_reader/mod.rs: ## @@ -808,54 +809,45 @@ impl ParquetRecordBatchReader { /// Returns `Result>` rather than `Option>` to /// simplify

Re: [PR] GH-45653: [Python] Scalar subclasses should implement Python protocols [arrow]

2025-05-22 Thread via GitHub
thisisnic commented on code in PR #45818: URL: https://github.com/apache/arrow/pull/45818#discussion_r2102992832 ## python/pyarrow/scalar.pxi: ## @@ -238,6 +241,9 @@ cdef class UInt8Scalar(Scalar): cdef CUInt8Scalar* sp = self.wrapped.get() return sp.value if

Re: [PR] API for reading Variant data and metadata [arrow-rs]

2025-05-22 Thread via GitHub
mkarbo commented on code in PR #7535: URL: https://github.com/apache/arrow-rs/pull/7535#discussion_r2103016796 ## parquet-variant/src/variant.rs: ## @@ -0,0 +1,321 @@ +use std::{borrow::Cow, ops::Index}; + +use crate::decoder::{self, get_variant_type}; +use arrow_schema::ArrowEr

Re: [PR] API for reading Variant data and metadata [arrow-rs]

2025-05-22 Thread via GitHub
mkarbo commented on code in PR #7535: URL: https://github.com/apache/arrow-rs/pull/7535#discussion_r2103019756 ## parquet-variant/src/variant.rs: ## @@ -0,0 +1,321 @@ +use std::{borrow::Cow, ops::Index}; + +use crate::decoder::{self, get_variant_type}; +use arrow_schema::ArrowEr

Re: [I] Building ADBC and flightsql driver on Windows [arrow-adbc]

2025-05-22 Thread via GitHub
WillAyd commented on issue #2846: URL: https://github.com/apache/arrow-adbc/issues/2846#issuecomment-2901972313 That looks like a configuration issue, but not a compile time one. I see you have this in your log: ``` Build Options: -Dflightsql=enabled -Dtests=disabled ```

Re: [PR] GH-46338: [C++] Add compile step for Meson in cpp_build.sh [arrow]

2025-05-22 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #46339: URL: https://github.com/apache/arrow/pull/46339#issuecomment-2901801574 After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 197afc02026d6ded3c45f25dcee15a94294cc5ca. There were no

Re: [I] Building ADBC and flightsql driver on Windows [arrow-adbc]

2025-05-22 Thread via GitHub
IIFE commented on issue #2846: URL: https://github.com/apache/arrow-adbc/issues/2846#issuecomment-2901884813 > I don't have a windows machine to test this out on, but maybe the Meson configuration works here? It looks like that is generating the expected extension: > > [arrow-adbc/c

Re: [PR] API for reading Variant data and metadata [arrow-rs]

2025-05-22 Thread via GitHub
mkarbo commented on code in PR #7535: URL: https://github.com/apache/arrow-rs/pull/7535#discussion_r2102981820 ## parquet-variant/src/decoder.rs: ## @@ -0,0 +1,199 @@ +// NOTE: Largely based on the implementation of @PinkCrow007 in https://github.com/apache/arrow-rs/pull/7452 +

Re: [PR] API for reading Variant data and metadata [arrow-rs]

2025-05-22 Thread via GitHub
mkarbo commented on code in PR #7535: URL: https://github.com/apache/arrow-rs/pull/7535#discussion_r2102984307 ## parquet-variant/src/decoder.rs: ## @@ -0,0 +1,199 @@ +// NOTE: Largely based on the implementation of @PinkCrow007 in https://github.com/apache/arrow-rs/pull/7452 +

Re: [I] Building ADBC and flightsql driver on Windows [arrow-adbc]

2025-05-22 Thread via GitHub
WillAyd commented on issue #2846: URL: https://github.com/apache/arrow-adbc/issues/2846#issuecomment-2901980467 Ugh...OK. Sorry that configuration is not well tested on Windows... For one more minor tweak, you can try changing this line: https://github.com/apache/arrow-adbc/blo

Re: [I] [R] read_parquet shows misleading message about url scheme when reading from s3 failed [arrow]

2025-05-22 Thread via GitHub
mdsumner commented on issue #20418: URL: https://github.com/apache/arrow/issues/20418#issuecomment-2901980519 I was seeing a similar failing test on CRAN for {sooty} on arm64 earlier today: ``` Version: 0.5.0 Check: examples Result: ERROR Running examples in ‘sooty-Ex.R’ f

Re: [PR] GH-46189 [C#] Use pooled buffers in ArrowStreamWriter [arrow]

2025-05-22 Thread via GitHub
CurtHagenlocher commented on code in PR #46190: URL: https://github.com/apache/arrow/pull/46190#discussion_r2103039330 ## csharp/src/Apache.Arrow/Ipc/ICompressionCodec.cs: ## @@ -44,5 +44,21 @@ void Compress(ReadOnlyMemory source, Stream destination) #else ; #endif +

Re: [I] Building ADBC and flightsql driver on Windows [arrow-adbc]

2025-05-22 Thread via GitHub
IIFE commented on issue #2846: URL: https://github.com/apache/arrow-adbc/issues/2846#issuecomment-2901974194 > That looks like a configuration issue, but not a compile time one. I see you have this in your log: > > ``` > Build Options: -Dflightsql=enabled -Dtests=disabled > ```

Re: [I] Building ADBC and flightsql driver on Windows [arrow-adbc]

2025-05-22 Thread via GitHub
IIFE commented on issue #2846: URL: https://github.com/apache/arrow-adbc/issues/2846#issuecomment-2901968940 @WillAyd @lidavidm This preset allows the Ninja build to complete, however, it only produces a `.dll` but no `.lib` for the import lib, which means I can't link my dll/exe ag

[PR] fix(c): Ignore dl dependency on Windows with Meson [arrow-adbc]

2025-05-22 Thread via GitHub
WillAyd opened a new pull request, #2848: URL: https://github.com/apache/arrow-adbc/pull/2848 (no comment)

[PR] Add `rand-many-types` data in Parquet and DuckDB formats [arrow-experiments]

2025-05-22 Thread via GitHub
ianmcook opened a new pull request, #47: URL: https://github.com/apache/arrow-experiments/pull/47 This adds copies of the data in `data/rand-many-types` in Parquet file format and DuckDB database file format. The code used to generate these files is included. -- This is an automated mess

Re: [PR] GH-46435: [Parquet][C++] Fix uninitialized value in writer test [arrow]

2025-05-22 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #46533: URL: https://github.com/apache/arrow/pull/46533#issuecomment-2902246103 After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 05ad12ccd17a623aa5a085d1377e1970883fdc5c. There were no

Re: [PR] API for reading Variant data and metadata [arrow-rs]

2025-05-22 Thread via GitHub
alamb commented on code in PR #7535: URL: https://github.com/apache/arrow-rs/pull/7535#discussion_r2103231456 ## parquet-variant/src/variant.rs: ## @@ -182,83 +183,127 @@ impl<'m> VariantMetadata<'m> { } /// Get the key-name by index -pub fn get_by(&self, index:

Re: [PR] Fix `filter_record_batch` panics with empty struct array [arrow-rs]

2025-05-22 Thread via GitHub
alamb merged PR #7539: URL: https://github.com/apache/arrow-rs/pull/7539

Re: [I] Question on Using Arrow with Go and DuckDB for Python Interop [arrow-go]

2025-05-22 Thread via GitHub
aotarola commented on issue #384: URL: https://github.com/apache/arrow-go/issues/384#issuecomment-2902277289 @zeroshade Got it! I'll investigate the ADBC path further, since it seems an interesting way of doing it. Regarding the `go-duckdb` implementation, I did this really small example,

Re: [PR] fix: TCP reset retry again and recursively iterates sources [arrow-rs-object-store]

2025-05-22 Thread via GitHub
Jun-Yuan closed pull request #378: fix: TCP reset retry again and recursively iterates sources URL: https://github.com/apache/arrow-rs-object-store/pull/378

Re: [PR] API for reading Variant data and metadata [arrow-rs]

2025-05-22 Thread via GitHub
scovich commented on code in PR #7535: URL: https://github.com/apache/arrow-rs/pull/7535#discussion_r2103236238 ## parquet-variant/src/decoder.rs: ## @@ -0,0 +1,199 @@ +// NOTE: Largely based on the implementation of @PinkCrow007 in https://github.com/apache/arrow-rs/pull/7452

Re: [PR] API for reading Variant data and metadata [arrow-rs]

2025-05-22 Thread via GitHub
alamb commented on code in PR #7535: URL: https://github.com/apache/arrow-rs/pull/7535#discussion_r2103238782 ## parquet-variant/src/decoder.rs: ## @@ -0,0 +1,199 @@ +// NOTE: Largely based on the implementation of @PinkCrow007 in https://github.com/apache/arrow-rs/pull/7452 +/

Re: [PR] GH-46189 [C#] Use pooled buffers in ArrowStreamWriter [arrow]

2025-05-22 Thread via GitHub
m-v-w commented on PR #46190: URL: https://github.com/apache/arrow/pull/46190#issuecomment-2902369166 Hi @adamreeve , @CurtHagenlocher , thanks for considering my PR. The issue I encountered with `ArrowStreamWriter` is it allocates large buffers on compression and array slicin

Re: [PR] GH-46189 [C#] Use pooled buffers in ArrowStreamWriter [arrow]

2025-05-22 Thread via GitHub
m-v-w commented on code in PR #46190: URL: https://github.com/apache/arrow/pull/46190#discussion_r2103287005 ## csharp/src/Apache.Arrow/Ipc/ICompressionCodec.cs: ## @@ -44,5 +44,21 @@ void Compress(ReadOnlyMemory source, Stream destination) #else ; #endif + +

Re: [PR] GH-24833 Implement IPC RecordBatch body buffer compression [arrow-js]

2025-05-22 Thread via GitHub
trxcllnt commented on code in PR #14: URL: https://github.com/apache/arrow-js/pull/14#discussion_r2103290165 ## src/ipc/reader.ts: ## @@ -369,9 +389,51 @@ abstract class RecordBatchReaderImpl implements RecordB new Vector(data)) : new Vector(data)).memo

Re: [I] Building ADBC and flightsql driver on Windows [arrow-adbc]

2025-05-22 Thread via GitHub
IIFE commented on issue #2846: URL: https://github.com/apache/arrow-adbc/issues/2846#issuecomment-2902382549 @WillAyd I've managed to get a little bit further with Visual Studio 2022 by using this preset: ``` { "name": "debug-msvc2022", "displayName": "debug

Re: [I] Building ADBC and flightsql driver on Windows [arrow-adbc]

2025-05-22 Thread via GitHub
IIFE commented on issue #2846: URL: https://github.com/apache/arrow-adbc/issues/2846#issuecomment-2902385049 **P.S:** I'm mainly familiar with Windows dev using Visual Studio, so cmake is a bit foreign to me :)

Re: [PR] Move Selection logic into ReadPlan builder [arrow-rs]

2025-05-22 Thread via GitHub
alamb commented on PR #7537: URL: https://github.com/apache/arrow-rs/pull/7537#issuecomment-2902385734 I am pretty happy with how this currently looks, but before I mark it for review I want to make a proof of concept that I can actually improve performance with it

[PR] GH-46529: [C++] Convert static inline type trait functions to constexpr [arrow]

2025-05-22 Thread via GitHub
david1437 opened a new pull request, #46559: URL: https://github.com/apache/arrow/pull/46559 ### Rationale for this change As C++ versions have advanced, certain patterns that used to be optimal have been replaced with newer, more powerful forms. In this scenario static inline function

Re: [I] Building ADBC and flightsql driver on Windows [arrow-adbc]

2025-05-22 Thread via GitHub
WillAyd commented on issue #2846: URL: https://github.com/apache/arrow-adbc/issues/2846#issuecomment-2901089026 Hmm that's strange - can you share the error message the linker is providing?

Re: [I] [CI][Dev] Use pre-commit for iwyu [arrow]

2025-05-22 Thread via GitHub
WillAyd commented on issue #46543: URL: https://github.com/apache/arrow/issues/46543#issuecomment-2901097556 I have a similar experience as @pitrou more recently where the tool seems to provide strange feedback. In some cases, iwyu ends up creating a loop where it constantly tells me to add

Re: [PR] fix: TCP reset retry again and recursively iterates sources [arrow-rs-object-store]

2025-05-22 Thread via GitHub
Jun-Yuan commented on PR #378: URL: https://github.com/apache/arrow-rs-object-store/pull/378#issuecomment-2901159348 > Perhaps you could come up with a reproducer, I don't follow why this would change the behaviour. If hyper is able to classify the error we use the more precise classificat

Re: [PR] GH-46556: [GLib] Add GArrowUuidDataType [arrow]

2025-05-22 Thread via GitHub
github-actions[bot] commented on PR #46558: URL: https://github.com/apache/arrow/pull/46558#issuecomment-2901163286 :warning: GitHub issue #46556 **has been automatically assigned in GitHub** to PR creator.

Re: [I] Building ADBC and flightsql driver on Windows [arrow-adbc]

2025-05-22 Thread via GitHub
WillAyd commented on issue #2846: URL: https://github.com/apache/arrow-adbc/issues/2846#issuecomment-2901174457 I don't have a windows machine to test this out on, but maybe the Meson configuration works here? It looks like that is generating the expected extension: https://github.c

Re: [PR] GH-52: Make RangeEqualsVisitor of RunEndEncodedVector more efficient [arrow-java]

2025-05-22 Thread via GitHub
lidavidm commented on PR #761: URL: https://github.com/apache/arrow-java/pull/761#issuecomment-2901254584 And I think those JNI failures are unrelated...I'll have to find time and take a look.

Re: [I] [C++] During casting, `allow_int_overflow` option ignored when `allow_float_truncate` set to True [arrow]

2025-05-22 Thread via GitHub
thisisnic commented on issue #46557: URL: https://github.com/apache/arrow/issues/46557#issuecomment-2901388937 This appears to be a C++ issue as the same results are shown in R: ``` > library(arrow) > x <- scalar(1e308) > arrow:::cast(x, int32(), allow_float_truncate = TRUE, allow

[PR] fix(c/validation): Use disabler pattern for validation_dep in Meson [arrow-adbc]

2025-05-22 Thread via GitHub
WillAyd opened a new pull request, #2849: URL: https://github.com/apache/arrow-adbc/pull/2849 (no comment)

Re: [I] Building ADBC and flightsql driver on Windows [arrow-adbc]

2025-05-22 Thread via GitHub
IIFE commented on issue #2846: URL: https://github.com/apache/arrow-adbc/issues/2846#issuecomment-2902035404 > Ugh...OK. Sorry that configuration is not well tested on Windows... > > For one more minor tweak, you can try changing this line: > > [arrow-adbc/c/driver_manager/meso

Re: [PR] fix(c): Ignore dl dependency on Windows with Meson [arrow-adbc]

2025-05-22 Thread via GitHub
WillAyd commented on PR #2848: URL: https://github.com/apache/arrow-adbc/pull/2848#issuecomment-290207 This was discovered by user @iife in https://github.com/apache/arrow-adbc/issues/2846#issuecomment-2901974194 We don't test our Meson configuration on Windows in CI (maybe we sho

Re: [PR] fix(c/validation): Use disabler pattern for validation_dep in Meson [arrow-adbc]

2025-05-22 Thread via GitHub
WillAyd commented on PR #2849: URL: https://github.com/apache/arrow-adbc/pull/2849#issuecomment-2902019763 This fixes the issue described here: https://github.com/apache/arrow-adbc/issues/2846#issuecomment-2901972313 Our CI tests every option being set, but when tests were disab

Re: [I] Question on Using Arrow with Go and DuckDB for Python Interop [arrow-go]

2025-05-22 Thread via GitHub
aotarola commented on issue #384: URL: https://github.com/apache/arrow-go/issues/384#issuecomment-2902135974 @zeroshade I was able to pull this out via the https://github.com/marcboeker/go-duckdb lib. I'm curious why you suggest using drivermgr instead? it seems like a direct way of doing w

Re: [PR] API for reading Variant data and metadata [arrow-rs]

2025-05-22 Thread via GitHub
alamb commented on code in PR #7535: URL: https://github.com/apache/arrow-rs/pull/7535#discussion_r2103135519 ## parquet-variant/src/decoder.rs: ## @@ -0,0 +1,149 @@ +// NOTE: Largely based on the implementation of @PinkCrow007 in https://github.com/apache/arrow-rs/pull/7452 +/

[PR] fix the reset retry [arrow-rs-object-store]

2025-05-22 Thread via GitHub
Jun-Yuan opened a new pull request, #381: URL: https://github.com/apache/arrow-rs-object-store/pull/381 # Which issue does this PR close? Closes #. # Rationale for this change # What changes are included in this PR? # Are there any user-fac

Re: [I] filter_record_batch panics with empty struct array. [arrow-rs]

2025-05-22 Thread via GitHub
alamb closed issue #7538: filter_record_batch panics with empty struct array. URL: https://github.com/apache/arrow-rs/issues/7538

Re: [PR] fix the reset retry [arrow-rs-object-store]

2025-05-22 Thread via GitHub
Jun-Yuan closed pull request #381: fix the reset retry URL: https://github.com/apache/arrow-rs-object-store/pull/381

Re: [I] Question on Using Arrow with Go and DuckDB for Python Interop [arrow-go]

2025-05-22 Thread via GitHub
zeroshade commented on issue #384: URL: https://github.com/apache/arrow-go/issues/384#issuecomment-2902167805 > it seems like a direct way of doing what I want without the extra abstraction provided by go-duckdb, but is there any other reason to use it? Mostly just that it's more dire

Re: [I] Building ADBC and flightsql driver on Windows [arrow-adbc]

2025-05-22 Thread via GitHub
WillAyd commented on issue #2846: URL: https://github.com/apache/arrow-adbc/issues/2846#issuecomment-2902181285 Well looks like things are borked there too...I am not an expert in Windows symbol visibility but we _might_ need to ensure our configurations (both CMake and Meson) are handling

Re: [PR] Fix `filter_record_batch` panics with empty struct array [arrow-rs]

2025-05-22 Thread via GitHub
alamb commented on PR #7539: URL: https://github.com/apache/arrow-rs/pull/7539#issuecomment-2902179350 Thanks again @thorfour -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

Re: [I] [C++] Make type_traits function constexpr [arrow]

2025-05-22 Thread via GitHub
david1437 commented on issue #46529: URL: https://github.com/apache/arrow/issues/46529#issuecomment-2902392213 @pitrou PR ready -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comme

Re: [PR] Add `rand-many-types` data in Parquet and DuckDB formats [arrow-experiments]

2025-05-22 Thread via GitHub
ianmcook commented on PR #47: URL: https://github.com/apache/arrow-experiments/pull/47#issuecomment-2902390032 I've been creating some examples (elsewhere) to demonstrate interop between various libraries and engines (including DuckDB) via Arrow IPC, ADBC, and Parquet. Having these files on

Re: [PR] GH-24833 Implement IPC RecordBatch body buffer compression [arrow-js]

2025-05-22 Thread via GitHub
Djjanks commented on code in PR #14: URL: https://github.com/apache/arrow-js/pull/14#discussion_r2103307604 ## src/ipc/reader.ts: ## @@ -369,9 +389,51 @@ abstract class RecordBatchReaderImpl implements RecordB new Vector(data)) : new Vector(data)).memoi

Re: [PR] Move Selection logic into ReadPlan builder [arrow-rs]

2025-05-22 Thread via GitHub
alamb commented on PR #7537: URL: https://github.com/apache/arrow-rs/pull/7537#issuecomment-2902444038 🤖: Benchmark completed Details ``` group alamb_row_selection_plan main -

Re: [PR] GH-46508: [C++] Upgrade OpenTelemetry cpp to avoid build error on recent Clang [arrow]

2025-05-22 Thread via GitHub
zanmato1984 commented on code in PR #46509: URL: https://github.com/apache/arrow/pull/46509#discussion_r2103328816 ## cpp/cmake_modules/ThirdpartyToolchain.cmake: ## @@ -4845,6 +4845,15 @@ macro(build_opentelemetry) version) set(OPENTELEMETRY_BUILD_BYPRODUCTS) set(O

Re: [PR] Move Selection logic into ReadPlan builder [arrow-rs]

2025-05-22 Thread via GitHub
alamb commented on PR #7537: URL: https://github.com/apache/arrow-rs/pull/7537#issuecomment-2902388172 🤖 `./gh_compare_arrow.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_arrow.sh) Running Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubuntu SMP

Re: [I] Add native support to write out `UnionArray` in JSON writer [arrow-rs]

2025-05-22 Thread via GitHub
kumarlokesh commented on issue #7302: URL: https://github.com/apache/arrow-rs/issues/7302#issuecomment-2902452032 take -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

Re: [PR] API for reading Variant data and metadata [arrow-rs]

2025-05-22 Thread via GitHub
mkarbo commented on code in PR #7535: URL: https://github.com/apache/arrow-rs/pull/7535#discussion_r2103383668 ## parquet-variant/src/variant.rs: ## @@ -0,0 +1,415 @@ +use crate::decoder::{ +self, get_basic_type, get_primitive_type, VariantBasicType, VariantPrimitiveType, +

Re: [PR] Add `rand-many-types` data in Parquet and DuckDB formats [arrow-experiments]

2025-05-22 Thread via GitHub
ianmcook commented on PR #47: URL: https://github.com/apache/arrow-experiments/pull/47#issuecomment-2902547358 @amoeba sounds great. Do you mind pushing a commit here? I just added you as a collaborator on my fork. -- This is an automated message from the Apache Git Service. To respond to

Re: [PR] API for reading Variant data and metadata [arrow-rs]

2025-05-22 Thread via GitHub
mkarbo commented on code in PR #7535: URL: https://github.com/apache/arrow-rs/pull/7535#discussion_r2103381365 ## parquet-variant/src/variant.rs: ## @@ -182,83 +183,127 @@ impl<'m> VariantMetadata<'m> { } /// Get the key-name by index -pub fn get_by(&self, index:

Re: [I] [Python] usability improvements for a "minimal" pyarrow [arrow]

2025-05-22 Thread via GitHub
h-vetinari commented on issue #38536: URL: https://github.com/apache/arrow/issues/38536#issuecomment-2902551395 > [...] adding more pyarrow- variants to pypi [or] document how users can set applicable options when building from an sdist. Those two things are almost entirely orthogonal

Re: [PR] API for reading Variant data and metadata [arrow-rs]

2025-05-22 Thread via GitHub
mkarbo commented on code in PR #7535: URL: https://github.com/apache/arrow-rs/pull/7535#discussion_r2103398459 ## parquet-variant/src/decoder.rs: ## @@ -0,0 +1,199 @@ +// NOTE: Largely based on the implementation of @PinkCrow007 in https://github.com/apache/arrow-rs/pull/7452 +

Re: [PR] API for reading Variant data and metadata [arrow-rs]

2025-05-22 Thread via GitHub
mkarbo commented on code in PR #7535: URL: https://github.com/apache/arrow-rs/pull/7535#discussion_r2103399834 ## parquet-variant/src/decoder.rs: ## Review Comment: Based on the implementation of @PinkCrow007 in https://github.com/apache/arrow-rs/pull/7452 and the feedback

Re: [I] [Python] usability improvements for a "minimal" pyarrow [arrow]

2025-05-22 Thread via GitHub
WillAyd commented on issue #38536: URL: https://github.com/apache/arrow/issues/38536#issuecomment-2902581319 Fair point - I think it's in any case a question of what percentage of users care about a smaller wheel in the first place. In the general case I would agree with the ratio, but in th

Re: [PR] Add `rand-many-types` data in Parquet and DuckDB formats [arrow-experiments]

2025-05-22 Thread via GitHub
amoeba commented on PR #47: URL: https://github.com/apache/arrow-experiments/pull/47#issuecomment-2902602589 I get a weird error when I push, maybe due to LFS? ``` error: Authentication error: Authentication required: You must have push access to verify locks error: failed to pu

Re: [PR] Add `rand-many-types` data in Parquet and DuckDB formats [arrow-experiments]

2025-05-22 Thread via GitHub
amoeba commented on PR #47: URL: https://github.com/apache/arrow-experiments/pull/47#issuecomment-2902613912 Got it to work after I accepted the invite. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to
