Re: [PR] GH-46087: [FlightSQL] Allow returning column remarks in FlightSQL's CommandGetTables [arrow-java]

2025-04-23 Thread via GitHub
github-actions[bot] commented on PR #727: URL: https://github.com/apache/arrow-java/pull/727#issuecomment-2826547818 Thank you for opening a pull request! Please label the PR with one or more of: - bug-fix - chore - dependencies - documentation - enhancement

Re: [PR] GH-46087: [FlightSQL] Allow returning column remarks in FlightSQL's CommandGetTables [arrow]

2025-04-23 Thread via GitHub
mateuszrzeszutek commented on PR #46110: URL: https://github.com/apache/arrow/pull/46110#issuecomment-2826548507 @lidavidm here's the draft implementation for Java https://github.com/apache/arrow-java/pull/727 I'll prepare a PR for Go as well -- This is an automated message from the Ap

Re: [PR] GH-45664: [C++] Allow LargeString,LargeBinary,FixedSizeBinary,StringView and BinaryView for RecordBatch::MakeStatisticsArray() [arrow]

2025-04-23 Thread via GitHub
andishgar commented on code in PR #46031: URL: https://github.com/apache/arrow/pull/46031#discussion_r2057626230 ## cpp/src/arrow/record_batch_test.cc: ## @@ -1215,6 +1244,21 @@ Result> MakeStatisticsArray( std::move(stati

Re: [PR] GH-45664: [C++] Allow LargeString,LargeBinary,FixedSizeBinary,StringView and BinaryView for RecordBatch::MakeStatisticsArray() [arrow]

2025-04-23 Thread via GitHub
kou commented on code in PR #46031: URL: https://github.com/apache/arrow/pull/46031#discussion_r2057611991 ## cpp/src/arrow/record_batch_test.cc: ## @@ -1215,6 +1244,21 @@ Result> MakeStatisticsArray( std::move(statistics_

Re: [I] [C++][Parquet] Thoughts about classes in key_toolkit.h [arrow]

2025-04-23 Thread via GitHub
kapoisu commented on issue #46217: URL: https://github.com/apache/arrow/issues/46217#issuecomment-2826469372 Thanks for the clarification Adam! I'll submit a PR to fix the timestamp then. In addition, I noticed that when I built parquet with PARQUET_REQUIRE_ENCRYPTION=ON, the OpenSSL

[PR] feat(csharp/src/Drivers/Apache): Performance improvement - Replace TSocketTransport with TBufferedTransport [arrow-adbc]

2025-04-23 Thread via GitHub
sudhiremmadi opened a new pull request, #2742: URL: https://github.com/apache/arrow-adbc/pull/2742 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscri

Re: [PR] GH-46196: [C++] Remove ARROW_USE_PRECOMPILED_HEADERS and related logic [arrow]

2025-04-23 Thread via GitHub
singh1203 commented on PR #46200: URL: https://github.com/apache/arrow/pull/46200#issuecomment-2826464544 @pitrou I appreciate you applying and providing the commit on my behalf, especially for the Python codebase. I felt that since the issue is for C++, we ought to stick to that.

Re: [PR] GH-45664: [C++] Allow LargeString,LargeBinary,FixedSizeBinary,StringView and BinaryView for RecordBatch::MakeStatisticsArray() [arrow]

2025-04-23 Thread via GitHub
andishgar commented on code in PR #46031: URL: https://github.com/apache/arrow/pull/46031#discussion_r2057586695 ## cpp/src/arrow/record_batch_test.cc: ## @@ -1215,6 +1244,21 @@ Result> MakeStatisticsArray( std::move(stati

Re: [PR] GH-463: Improve TZ support for JDBC driver [arrow-java]

2025-04-23 Thread via GitHub
aiguofer commented on code in PR #464: URL: https://github.com/apache/arrow-java/pull/464#discussion_r2057539864 ## flight/flight-sql-jdbc-core/src/main/java/org/apache/arrow/driver/jdbc/accessor/impl/calendar/ArrowFlightJdbcTimeStampVectorAccessor.java: ## @@ -68,11 +78,57 @@ p

Re: [I] Drivers are unusable when statically linked with CMake [arrow-adbc]

2025-04-23 Thread via GitHub
lidavidm commented on issue #2562: URL: https://github.com/apache/arrow-adbc/issues/2562#issuecomment-2826386493 See https://github.com/apache/arrow-adbc/pull/2738 there are a few caveats to keep in mind (as noted in the doc page added there)

Re: [I] rust: split workspaces so that dependencies of adbc_core aren't tied to what the datafusion driver requires [arrow-adbc]

2025-04-23 Thread via GitHub
paleolimbot commented on issue #2739: URL: https://github.com/apache/arrow-adbc/issues/2739#issuecomment-2826283337 For what it's worth, I asked at the DataFusion community call and it seems this type of dependency weirdness between Arrow, Data Fusion, and other tooling happens everywhere.

Re: [PR] GH-45522: [Parquet][C++] Parquet GEOMETRY and GEOGRAPHY logical type implementations [arrow]

2025-04-23 Thread via GitHub
paleolimbot commented on PR #45459: URL: https://github.com/apache/arrow/pull/45459#issuecomment-2826346276 I think this is ready for another round when time allows! A few notes: - @jorisvandenbossche mentioned in a recent GeoParquet sync that it might be nice if the GEOMETRY type was

Re: [PR] GH-463: Improve TZ support for JDBC driver [arrow-java]

2025-04-23 Thread via GitHub
aiguofer commented on PR #464: URL: https://github.com/apache/arrow-java/pull/464#issuecomment-2826346557 @lidavidm sorry been super busy, but thanks for the feedback! I'll fix that typo and throw an exception if trying to fetch objects with offset info when the underlying vector does not.

Re: [PR] feat(c): Use C++ visibility support in Meson configuration [arrow-adbc]

2025-04-23 Thread via GitHub
lidavidm commented on code in PR #2740: URL: https://github.com/apache/arrow-adbc/pull/2740#discussion_r2057438901 ## c/include/arrow-adbc/adbc.h: ## @@ -280,7 +284,7 @@ typedef uint8_t AdbcStatusCode; /// ADBC_ERROR_VENDOR_CODE_PRIVATE_DATA. Clients are required to initialize

Re: [PR] feat(c): Use C++ visibility support in Meson configuration [arrow-adbc]

2025-04-23 Thread via GitHub
paleolimbot commented on code in PR #2740: URL: https://github.com/apache/arrow-adbc/pull/2740#discussion_r2057418800 ## c/include/arrow-adbc/adbc.h: ## @@ -280,7 +284,7 @@ typedef uint8_t AdbcStatusCode; /// ADBC_ERROR_VENDOR_CODE_PRIVATE_DATA. Clients are required to initial

Re: [PR] ci: verify source using OS-installed dependencies [arrow-adbc]

2025-04-23 Thread via GitHub
lidavidm merged PR #2718: URL: https://github.com/apache/arrow-adbc/pull/2718

Re: [PR] GH-45908: [C++][Docs] Expose basic {Array,...}FromJSON helpers as public APIs [arrow]

2025-04-23 Thread via GitHub
kou commented on PR #46180: URL: https://github.com/apache/arrow/pull/46180#issuecomment-2826024870 If we think that JSON means JSONL too, I think that we don't need to distinguish `*FromJSON` and `arrow/json/`. If we think that JSON doesn't mean JSONL, I think that renaming `arrow/j

Re: [I] [C++] Test Failure With ARROW_BUILD_TESTS=ON and ARROW_COMPUTE=OFF [arrow]

2025-04-23 Thread via GitHub
wgtmac commented on issue #46157: URL: https://github.com/apache/arrow/issues/46157#issuecomment-2825969960 I ran into the same issue.

Re: [PR] feat(csharp/src/Drivers/Databricks): Add option to enable using direct results for statements [arrow-adbc]

2025-04-23 Thread via GitHub
jadewang-db commented on code in PR #2737: URL: https://github.com/apache/arrow-adbc/pull/2737#discussion_r2056580154 ## csharp/src/Drivers/Databricks/DatabricksReader.cs: ## @@ -39,6 +39,17 @@ public DatabricksReader(DatabricksStatement statement, Schema schema, bool isLz4

Re: [PR] [EXP] GH-44792: [C++] Require C++20 [arrow]

2025-04-23 Thread via GitHub
kou commented on code in PR #45445: URL: https://github.com/apache/arrow/pull/45445#discussion_r2057206611 ## dev/release/verify-yum.sh: ## @@ -220,7 +220,7 @@ pushd build/minimal_build ${cmake_command} . make -j$(nproc) ./arrow-example -c++ -o arrow-example example.cc $(pkg-

Re: [PR] GH-45664: [C++] Allow LargeString,LargeBinary,FixedSizeBinary,StringView and BinaryView for RecordBatch::MakeStatisticsArray() [arrow]

2025-04-23 Thread via GitHub
kou commented on code in PR #46031: URL: https://github.com/apache/arrow/pull/46031#discussion_r2057193334 ## cpp/src/arrow/record_batch_test.cc: ## @@ -1423,34 +1467,148 @@ TEST_F(TestRecordBatch, MakeStatisticsArrayMaxApproximate) { AssertArraysEqual(*expected_statistics_a

Re: [PR] GH-46215: [C++][Docs] Add README for Meson subprojects directory [arrow]

2025-04-23 Thread via GitHub
kou commented on code in PR #46216: URL: https://github.com/apache/arrow/pull/46216#discussion_r2057185457 ## cpp/subprojects/README.md: ## @@ -0,0 +1,81 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE f

Re: [PR] ci: verify source using OS-installed dependencies [arrow-adbc]

2025-04-23 Thread via GitHub
kou commented on code in PR #2718: URL: https://github.com/apache/arrow-adbc/pull/2718#discussion_r2057182150 ## dev/release/verify-release-candidate.sh: ## @@ -246,38 +246,42 @@ install_dotnet() { return 0 fi - show_info "Ensuring that .NET is installed..." - - if d

Re: [I] [Python] Dataset.to_batches accumulates memory usage and leaks [arrow]

2025-04-23 Thread via GitHub
wingkitlee0 commented on issue #39808: URL: https://github.com/apache/arrow/issues/39808#issuecomment-2825922411 Came across this issue recently and I can still see this https://github.com/apache/arrow/issues/39808#issuecomment-2163183635 Previously I tried `pre_buffer=False` and `use

Re: [PR] GH-45908: [C++][Docs] Expose basic {Array,...}FromJSON helpers as public APIs [arrow]

2025-04-23 Thread via GitHub
amoeba commented on PR #46180: URL: https://github.com/apache/arrow/pull/46180#issuecomment-2825902742 I'm not sure the benefit of putting these helpers in the public API outweighs the downsides of renaming `arrow/json` for users and the work required for us. @pitrou's point about avoiding

Re: [PR] fix(ci): Prevent flaky ASAN failures with Dremio [arrow-adbc]

2025-04-23 Thread via GitHub
WillAyd commented on PR #2604: URL: https://github.com/apache/arrow-adbc/pull/2604#issuecomment-2825015231 After some more research, the patch that was applied to clang had already been in place with gcc since 2013. So I think I erred in assuming a newer version would change this. Need to g

Re: [I] Add end-user logging and tracing for drivers [arrow-adbc]

2025-04-23 Thread via GitHub
lidavidm commented on issue #2210: URL: https://github.com/apache/arrow-adbc/issues/2210#issuecomment-2825850914 Would you configure the drivers, or would you just set the env vars? My impression of how OTel worked is that you just enable OTel and let it pick up the ambient configuration fr

Re: [PR] GH-45028: [C++][Compute] Allow cast to reorder struct fields [arrow]

2025-04-23 Thread via GitHub

Re: [I] c/driver/sqlite: Column type is always int64 with empty table [arrow-adbc]

2025-04-23 Thread via GitHub
lidavidm commented on issue #581: URL: https://github.com/apache/arrow-adbc/issues/581#issuecomment-2825773353 > I think ultimately, what we want is [#1514](https://github.com/apache/arrow-adbc/issues/1514)

Re: [PR] feat(c): Use C++ visibility support in Meson configuration [arrow-adbc]

2025-04-23 Thread via GitHub
lidavidm commented on code in PR #2740: URL: https://github.com/apache/arrow-adbc/pull/2740#discussion_r2057046862 ## c/include/arrow-adbc/adbc.h: ## @@ -280,7 +284,7 @@ typedef uint8_t AdbcStatusCode; /// ADBC_ERROR_VENDOR_CODE_PRIVATE_DATA. Clients are required to initialize

Re: [PR] fix(rust/core): remove the Mutex around the FFI driver object [arrow-adbc]

2025-04-23 Thread via GitHub
lidavidm commented on PR #2736: URL: https://github.com/apache/arrow-adbc/pull/2736#issuecomment-2825771829 There is only one copy. (The others are just synchronized via pre-commit)

Re: [PR] GH-46196: [C++] Remove ARROW_USE_PRECOMPILED_HEADERS and related logic [arrow]

2025-04-23 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #46200: URL: https://github.com/apache/arrow/pull/46200#issuecomment-2825598681 After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit f8e0f30ccf6599e8bfa990d586d8e48a4fd18894. There were no

Re: [I] [C++][Parquet] Thoughts about classes in key_toolkit.h [arrow]

2025-04-23 Thread via GitHub
adamreeve commented on issue #46217: URL: https://github.com/apache/arrow/issues/46217#issuecomment-2825621423 `KeyToolkit` is technically public so this would be a breaking change, but I don't think this was intentional. Eg. for key rotation users should be using the `CryptoFactory`, so I

Re: [PR] MINOR: [Dev] Add apt/yum build directories to `.gitignore` [arrow]

2025-04-23 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #46018: URL: https://github.com/apache/arrow/pull/46018#issuecomment-2825601996 After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 520ae44272d491bbb52eb3c9b84864ed7088f11a. There were no

Re: [PR] feat(csharp/src/Drivers/BigQuery): Add support for AAD/Entra authentication [arrow-adbc]

2025-04-23 Thread via GitHub
CurtHagenlocher merged PR #2655: URL: https://github.com/apache/arrow-adbc/pull/2655

Re: [PR] GH-45908: [C++][Docs] Expose basic {Array,...}FromJSON helpers as public APIs [arrow]

2025-04-23 Thread via GitHub
kou commented on PR #46180: URL: https://github.com/apache/arrow/pull/46180#issuecomment-2825507083 > The most correct solution seems to be renaming to `arrow/jsonl/reader.h` maybe with a namespace alias that we can put through a deprecation cycle I like this.

Re: [PR] GH-45957: [Python] Expose `allow_delayed_open` on S3FileSystem [arrow]

2025-04-23 Thread via GitHub
gmcrocetti commented on code in PR #46078: URL: https://github.com/apache/arrow/pull/46078#discussion_r2056846027 ## python/pyarrow/tests/test_fs.py: ## @@ -1839,6 +1846,20 @@ def test_s3_real_aws_region_selection(): 's3://x-arrow-nonexistent-bucket?region=us-east-3')

[PR] Fix out of bounds crash in RleValueDecoder [arrow-rs]

2025-04-23 Thread via GitHub
apilloud opened a new pull request, #7441: URL: https://github.com/apache/arrow-rs/pull/7441 # Which issue does this PR close? This is in the same thread as #6737, not sure if it needs its own issue. # Rationale for this change Allows bad Parquet files to produce an erro

Re: [PR] feat(csharp/src/Drivers/Databricks): Add option to enable using direct results for statements [arrow-adbc]

2025-04-23 Thread via GitHub
alexguo-db commented on PR #2737: URL: https://github.com/apache/arrow-adbc/pull/2737#issuecomment-2825441566 @CurtHagenlocher I believe the feedback is addressed now, but please wait for https://github.com/apache/arrow-adbc/pull/2678 before merging

[PR] pub use FooterTail [arrow-rs]

2025-04-23 Thread via GitHub
masonh22 opened a new pull request, #7440: URL: https://github.com/apache/arrow-rs/pull/7440 # Which issue does this PR close? Closes #7438. # Rationale for this change This makes it so that the result of `ParquetMetaDataReader::decode_footer_tail` is usable

Re: [PR] MINOR: [C++][Dev] Remove obsolete clang-tidy option [arrow]

2025-04-23 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #46213: URL: https://github.com/apache/arrow/pull/46213#issuecomment-2825422955 After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit cdc3e5a6524eb41e06ebcddf97c1566d87d2871e. There were no

[PR] Support writing encrypted Parquet files with plaintext footers [arrow-rs]

2025-04-23 Thread via GitHub
rok opened a new pull request, #7439: URL: https://github.com/apache/arrow-rs/pull/7439 Closes #7320. # Rationale for this change The Parquet format allows encrypting some or all column data while keeping footers in plaintext for compatibility with readers that don't support

[I] Move parquet::file::metadata::reader::FooterTail to parquet::file::metadata so that it is public [arrow-rs]

2025-04-23 Thread via GitHub
masonh22 opened a new issue, #7438: URL: https://github.com/apache/arrow-rs/issues/7438 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** `ParquetMetaDataReader::decode_footer` was deprecated in favor of `ParquetMetaDataReader::

Re: [I] Move parquet::file::metadata::reader::FooterTail to parquet::file::metadata so that it is public [arrow-rs]

2025-04-23 Thread via GitHub
masonh22 commented on issue #7438: URL: https://github.com/apache/arrow-rs/issues/7438#issuecomment-2825393583 take

Re: [PR] [EXP] GH-44792: [C++] Require C++20 [arrow]

2025-04-23 Thread via GitHub
pitrou commented on PR #45445: URL: https://github.com/apache/arrow/pull/45445#issuecomment-2825009857 @github-actions crossbow submit *wheel*cp313*

[PR] feat(c): Use C++ visibility support in Meson configuration [arrow-adbc]

2025-04-23 Thread via GitHub
WillAyd opened a new pull request, #2740: URL: https://github.com/apache/arrow-adbc/pull/2740 (no comment)

Re: [PR] Feat/deterministic metadata encoding [arrow-rs]

2025-04-23 Thread via GitHub
timsaucer commented on PR #7437: URL: https://github.com/apache/arrow-rs/pull/7437#issuecomment-2825315010 I currently based this off 54.3.1, but I will update it to `main` after we have completed internal testing.

[PR] Feat/deterministic metadata encoding [arrow-rs]

2025-04-23 Thread via GitHub
timsaucer opened a new pull request, #7437: URL: https://github.com/apache/arrow-rs/pull/7437 # Which issue does this PR close? None, but I can open one if necessary. # Rationale for this change The ordering of metadata is not consistent since it uses a HashMap. It can

Re: [PR] chore(rust): make Arrow version selection more flexible [arrow-adbc]

2025-04-23 Thread via GitHub
felipecrv commented on PR #2525: URL: https://github.com/apache/arrow-adbc/pull/2525#issuecomment-2825297326 > > Why can't we split workspaces in this repo? > > We can also do that. -> https://github.com/apache/arrow-adbc/issues/2739

Re: [PR] GH-45028: [C++][Compute] Allow cast to reorder struct fields [arrow]

2025-04-23 Thread via GitHub
Tom-Newton commented on PR #45246: URL: https://github.com/apache/arrow/pull/45246#issuecomment-2825181220 Sorry for the direct ping, but @lidavidm please could you review. This is a development on https://github.com/apache/arrow/pull/44587 which you reviewed for me previously.

Re: [PR] [EXP] GH-44792: [C++] Require C++20 [arrow]

2025-04-23 Thread via GitHub
pitrou commented on PR #45445: URL: https://github.com/apache/arrow/pull/45445#issuecomment-2825128283 @assignUser See R Crossbow build results just above.

Re: [PR] GH-45957: [Python] Expose `allow_delayed_open` on S3FileSystem [arrow]

2025-04-23 Thread via GitHub
pitrou commented on PR #46078: URL: https://github.com/apache/arrow/pull/46078#issuecomment-2824924423 > @pitrou @AlenkaF do you believe `allow_delayed_open` should be accepted as a query parameter when using it as a uri ? If it's not already the case, then yes.

Re: [PR] [EXP] GH-44792: [C++] Require C++20 [arrow]

2025-04-23 Thread via GitHub
github-actions[bot] commented on PR #45445: URL: https://github.com/apache/arrow/pull/45445#issuecomment-2825017619 Revision: 41b3f1dceb5caa99962122eb8952523efed1711d Submitted crossbow builds: [ursacomputing/crossbow @ actions-f357963111](https://github.com/ursacomputing/crossbow/bra

Re: [PR] feat(csharp/src/Drivers/Databricks): Add option to enable using direct results for statements [arrow-adbc]

2025-04-23 Thread via GitHub
jadewang-db commented on code in PR #2737: URL: https://github.com/apache/arrow-adbc/pull/2737#discussion_r2056579305 ## csharp/src/Drivers/Databricks/CloudFetch/CloudFetchReader.cs: ## @@ -117,6 +117,24 @@ public CloudFetchReader(DatabricksStatement statement, Schema schema, b

Re: [PR] Speedup take_bytes (-35% -69%) by precalculating capacity [arrow-rs]

2025-04-23 Thread via GitHub
Dandandan commented on PR #7422: URL: https://github.com/apache/arrow-rs/pull/7422#issuecomment-2825068017 > Is it fair to say that in general you've done a few refactors recently that replace `MutableBuffer` with `Vec` or collecting directly into the target `Buffer` type? Is there a partic

Re: [PR] [EXP] GH-44792: [C++] Require C++20 [arrow]

2025-04-23 Thread via GitHub
pitrou commented on PR #45445: URL: https://github.com/apache/arrow/pull/45445#issuecomment-2824940252 I've just rebased to fix conflicts

Re: [PR] feat(csharp/src/Drivers/Databricks): Add option to enable using direct results for statements [arrow-adbc]

2025-04-23 Thread via GitHub
jadewang-db commented on code in PR #2737: URL: https://github.com/apache/arrow-adbc/pull/2737#discussion_r2056600497 ## csharp/src/Drivers/Databricks/DatabricksReader.cs: ## @@ -39,6 +39,17 @@ public DatabricksReader(DatabricksStatement statement, Schema schema, bool isLz4

Re: [PR] [EXP] GH-44792: [C++] Require C++20 [arrow]

2025-04-23 Thread via GitHub
github-actions[bot] commented on PR #45445: URL: https://github.com/apache/arrow/pull/45445#issuecomment-2825017282 Revision: 41b3f1dceb5caa99962122eb8952523efed1711d Submitted crossbow builds: [ursacomputing/crossbow @ actions-737fa9ef86](https://github.com/ursacomputing/crossbow/bra

Re: [PR] [EXP] GH-44792: [C++] Require C++20 [arrow]

2025-04-23 Thread via GitHub
pitrou commented on PR #45445: URL: https://github.com/apache/arrow/pull/45445#issuecomment-2825010072 @github-actions crossbow submit -g r

Re: [PR] GH-46209: [Documentation][C++][Compute] Internal documentation for row table [arrow]

2025-04-23 Thread via GitHub
zanmato1984 commented on PR #46210: URL: https://github.com/apache/arrow/pull/46210#issuecomment-2825002638 > Can we wrap lines to 80-90 characters to make reading easier? Thanks for looking! Lines wrapped.

Re: [PR] GH-46209: [Documentation][C++][Compute] Internal documentation for row table [arrow]

2025-04-23 Thread via GitHub
zanmato1984 commented on code in PR #46210: URL: https://github.com/apache/arrow/pull/46210#discussion_r2056546534 ## cpp/src/arrow/compute/row/doc/row_table.md: ## @@ -0,0 +1,84 @@ + + +# Row Table + +## Overview + +The row table in Arrow represents data stored in row-major for

Re: [PR] feat(csharp/src/Drivers/BigQuery): Add support for AAD/Entra authentication [arrow-adbc]

2025-04-23 Thread via GitHub
CurtHagenlocher commented on code in PR #2655: URL: https://github.com/apache/arrow-adbc/pull/2655#discussion_r2056429112 ## csharp/src/Drivers/BigQuery/RetryManager.cs: ## @@ -0,0 +1,90 @@ + +/* +* Licensed to the Apache Software Foundation (ASF) under one or more +* contribut

Re: [PR] [EXP] GH-44792: [C++] Require C++20 [arrow]

2025-04-23 Thread via GitHub
github-actions[bot] commented on PR #45445: URL: https://github.com/apache/arrow/pull/45445#issuecomment-2824948644 Revision: 41b3f1dceb5caa99962122eb8952523efed1711d Submitted crossbow builds: [ursacomputing/crossbow @ actions-799ea7f430](https://github.com/ursacomputing/crossbow/bra

Re: [PR] Add a JSON reader option to ignore type conflicts [arrow-rs]

2025-04-23 Thread via GitHub
Blizzara commented on PR #7276: URL: https://github.com/apache/arrow-rs/pull/7276#issuecomment-2824947455 Hey @tustvold, have you had a chance to look at this? :) It would be very useful for our use-case as well 🤞

Re: [PR] [EXP] GH-44792: [C++] Require C++20 [arrow]

2025-04-23 Thread via GitHub
pitrou commented on PR #45445: URL: https://github.com/apache/arrow/pull/45445#issuecomment-2824939594 @github-actions crossbow submit -g cpp -g python

Re: [PR] Check for additional IO errors that should be retried [arrow-rs-object-store]

2025-04-23 Thread via GitHub
hachikuji commented on code in PR #319: URL: https://github.com/apache/arrow-rs-object-store/pull/319#discussion_r2056498053 ## src/client/connection.rs: ## @@ -108,12 +108,15 @@ impl HttpError { } else if e.is_timeout() { kind = HttpErrorK

Re: [PR] WIP: [Release] Verify release-20.0.0-rc1 [arrow]

2025-04-23 Thread via GitHub
assignUser closed pull request #46152: WIP: [Release] Verify release-20.0.0-rc1 URL: https://github.com/apache/arrow/pull/46152

Re: [I] [Release][C++] verify-rc-source-cpp-macos-amd64 failed to build googlemock [arrow]

2025-04-23 Thread via GitHub
assignUser commented on issue #46195: URL: https://github.com/apache/arrow/issues/46195#issuecomment-2824895284 Fixed by #45986

Re: [PR] fix(rust/core): remove the Mutex around the FFI driver object [arrow-adbc]

2025-04-23 Thread via GitHub
mbrobbel commented on code in PR #2736: URL: https://github.com/apache/arrow-adbc/pull/2736#discussion_r2056443832 ## rust/core/src/driver_manager.rs: ## @@ -164,7 +164,7 @@ struct ManagedDriverInner { /// Implementation of [Driver]. #[derive(Clone)] pub struct ManagedDriver

[PR] fix(csharp/src/Apache.Arrow.Adbc/Extensions): fix Time conversion [arrow-adbc]

2025-04-23 Thread via GitHub
davidhcoe opened a new pull request, #2741: URL: https://github.com/apache/arrow-adbc/pull/2741 - Resolves https://github.com/apache/arrow-adbc/issues/2731 - Resolves a typo for Decimal64 - Some code clean up for the projects that are involved in the fix

Re: [PR] feat(c): Use C++ visibility support in Meson configuration [arrow-adbc]

2025-04-23 Thread via GitHub
WillAyd commented on code in PR #2740: URL: https://github.com/apache/arrow-adbc/pull/2740#discussion_r2056367782 ## go/adbc/drivermgr/arrow-adbc/adbc.h: ## @@ -152,15 +152,19 @@ struct ArrowArrayStream { // Storage class macros for Windows // Allow overriding/aliasing with ap

Re: [I] arrow dataset: how to use date.year and date.month as partitioning [arrow]

2025-04-23 Thread via GitHub
AlexisBRENON commented on issue #14619: URL: https://github.com/apache/arrow/issues/14619#issuecomment-2824815924 I ended up with this kind of solution: ```python import pyarrow as pa from pyarrow import compute as pc, dataset as ds, fs, parquet as pq partition_timestamp

Re: [PR] GH-45957: [Python] Expose `allow_delayed_open` on S3FileSystem [arrow]

2025-04-23 Thread via GitHub
gmcrocetti commented on PR #46078: URL: https://github.com/apache/arrow/pull/46078#issuecomment-2824825285 @pitrou @AlenkaF do you believe `allow_delayed_open` should be accepted as a query parameter when using it as a uri ? ```python FileSystem.from_uri("s3://path-to-bucket?allow_del

Re: [I] [C++][Parquet] Thoughts about classes in key_toolkit.h [arrow]

2025-04-23 Thread via GitHub
wgtmac commented on issue #46217: URL: https://github.com/apache/arrow/issues/46217#issuecomment-2824783597 cc @adamreeve

Re: [PR] feat(c): Use C++ visibility support in Meson configuration [arrow-adbc]

2025-04-23 Thread via GitHub
WillAyd commented on code in PR #2740: URL: https://github.com/apache/arrow-adbc/pull/2740#discussion_r2056369907 ## c/include/arrow-adbc/adbc.h: ## @@ -280,7 +284,7 @@ typedef uint8_t AdbcStatusCode; /// ADBC_ERROR_VENDOR_CODE_PRIVATE_DATA. Clients are required to initialize

Re: [PR] GH-45908: [C++][Docs] Expose basic {Array,...}FromJSON helpers as public APIs [arrow]

2025-04-23 Thread via GitHub
bkietz commented on PR #46180: URL: https://github.com/apache/arrow/pull/46180#issuecomment-282494 > I like `from_json_single` and `from_json_value` `from_json_value` seems fine, and I'll toss `from_json_document` into the mix as well > developers will not understand the di

[I] StructArray::try_new validation incorrectly returns an error when `logical_nulls()` returns Some() && null_count == 0 [arrow-rs]

2025-04-23 Thread via GitHub
phillipleblanc opened a new issue, #7435: URL: https://github.com/apache/arrow-rs/issues/7435 **Describe the bug** This logic in StructArray::try_new validates that if one of the child arrays in the StructArray has a null value that isn't properly masked by the parent, then it will be re

Re: [PR] GH-45957: [Python] Expose `allow_delayed_open` on S3FileSystem [arrow]

2025-04-23 Thread via GitHub
gmcrocetti commented on code in PR #46078: URL: https://github.com/apache/arrow/pull/46078#discussion_r2056315466 ## python/pyarrow/_s3fs.pyx: ## @@ -234,6 +234,8 @@ cdef class S3FileSystem(FileSystem): S3FileSystem(proxy_options={'scheme': 'http', 'host': 'localhos

Re: [PR] feat(csharp/src/Drivers/Databricks): Add option to enable using direct results for statements [arrow-adbc]

2025-04-23 Thread via GitHub
CurtHagenlocher commented on code in PR #2737: URL: https://github.com/apache/arrow-adbc/pull/2737#discussion_r2056167393 ## csharp/src/Drivers/Apache/Hive2/HiveServer2Statement.cs: ## @@ -104,12 +106,19 @@ private async Task ExecuteQueryAsyncInternal(CancellationToken canc

Re: [PR] fix(rust/core): remove the Mutex around the FFI driver object [arrow-adbc]

2025-04-23 Thread via GitHub
felipecrv commented on PR #2736: URL: https://github.com/apache/arrow-adbc/pull/2736#issuecomment-2824677371 > Should we specify that access to the private fields of the driver > > https://github.com/apache/arrow-adbc/blob/c6e1ab5d05fae6143b7fcd22896d0bc7ac14f9d3/c/include/arrow-adbc/

Re: [PR] fix(rust/core): remove the Mutex around the FFI driver object [arrow-adbc]

2025-04-23 Thread via GitHub
felipecrv commented on code in PR #2736: URL: https://github.com/apache/arrow-adbc/pull/2736#discussion_r2056301602 ## rust/core/src/driver_manager.rs: ## @@ -164,7 +164,7 @@ struct ManagedDriverInner { /// Implementation of [Driver]. #[derive(Clone)] pub struct ManagedDriver

[PR] Fix validation logic in `StructArray::try_new` to account for array.logical_nulls() returning Some() and null_count == 0 [arrow-rs]

2025-04-23 Thread via GitHub
phillipleblanc opened a new pull request, #7436: URL: https://github.com/apache/arrow-rs/pull/7436 # Which issue does this PR close? - Closes #7435 # Rationale for this change Fixes the validation logic in StructArray::try_new to not error if there aren't any unmasked n

Re: [PR] MINOR: [Dev] Add apt/yum build directories to `.gitignore` [arrow]

2025-04-23 Thread via GitHub
pitrou merged PR #46018: URL: https://github.com/apache/arrow/pull/46018

Re: [I] [C++] Remove option ARROW_USE_PRECOMPILED_HEADERS [arrow]

2025-04-23 Thread via GitHub
pitrou commented on issue #46196: URL: https://github.com/apache/arrow/issues/46196#issuecomment-2824649002 Issue resolved by pull request 46200 https://github.com/apache/arrow/pull/46200

Re: [PR] GH-46196: [C++] Remove ARROW_USE_PRECOMPILED_HEADERS and related logic [arrow]

2025-04-23 Thread via GitHub
pitrou merged PR #46200: URL: https://github.com/apache/arrow/pull/46200

Re: [PR] GH-46215: [C++][Docs] Add README for Meson subprojects directory [arrow]

2025-04-23 Thread via GitHub
pitrou commented on code in PR #46216: URL: https://github.com/apache/arrow/pull/46216#discussion_r2056281632 ## cpp/subprojects/README.md: ## @@ -0,0 +1,64 @@ +# Meson Subprojects Review Comment: I think you need to add a license header above. ## cpp/subprojects/

Re: [PR] GH-46215: [C++][Docs] Add README for Meson subprojects directory [arrow]

2025-04-23 Thread via GitHub
github-actions[bot] commented on PR #46216: URL: https://github.com/apache/arrow/pull/46216#issuecomment-2824602737 :warning: GitHub issue #46215 **has been automatically assigned in GitHub** to PR creator.

Re: [PR] GH-46215: [C++][Docs] Add README for Meson subprojects directory [arrow]

2025-04-23 Thread via GitHub
WillAyd commented on PR #46216: URL: https://github.com/apache/arrow/pull/46216#issuecomment-2824601980 @pitrou

[PR] GH-46215: [C++][Docs] Add README for Meson subprojects directory [arrow]

2025-04-23 Thread via GitHub
WillAyd opened a new pull request, #46216: URL: https://github.com/apache/arrow/pull/46216 ### Rationale for this change This clarifies what the subprojects directory does, for developers not familiar with Meson ### What changes are included in this PR? This adds a READM

Re: [PR] GH-45664: [C++] Allow LargeString,LargeBinary,FixedSizeBinary,StringView and BinaryView for RecordBatch::MakeStatisticsArray() [arrow]

2025-04-23 Thread via GitHub
andishgar commented on code in PR #46031: URL: https://github.com/apache/arrow/pull/46031#discussion_r2055545670 ## cpp/src/arrow/record_batch_test.cc: ## @@ -1423,35 +1452,132 @@ TEST_F(TestRecordBatch, MakeStatisticsArrayMaxApproximate) { AssertArraysEqual(*expected_statis

Re: [PR] GH-45957: [Python] Expose `allow_delayed_open` on S3FileSystem [arrow]

2025-04-23 Thread via GitHub
pitrou commented on code in PR #46078: URL: https://github.com/apache/arrow/pull/46078#discussion_r2056213810 ## python/pyarrow/tests/test_fs.py: ## @@ -1230,6 +1230,10 @@ def test_s3_options(pickle_module): assert pickle_module.loads(pickle_module.dumps(fs2)) == fs2 a

Re: [PR] GH-45957: [Python] Expose `allow_delayed_open` on S3FileSystem [arrow]

2025-04-23 Thread via GitHub
pitrou commented on code in PR #46078: URL: https://github.com/apache/arrow/pull/46078#discussion_r2056212394 ## python/pyarrow/_s3fs.pyx: ## @@ -234,6 +234,8 @@ cdef class S3FileSystem(FileSystem): S3FileSystem(proxy_options={'scheme': 'http', 'host': 'localhost',

Re: [PR] GH-38903: [R][Docs] Improve documentation of col_types [arrow]

2025-04-23 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #46145: URL: https://github.com/apache/arrow/pull/46145#issuecomment-2824526116 After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 486670a7266cf6f49d0b7cc0209359332b27572a. There were no

Re: [PR] GH-45028: [C++][Compute] Allow cast to reorder struct fields [arrow]

2025-04-23 Thread via GitHub
Tom-Newton commented on PR #45246: URL: https://github.com/apache/arrow/pull/45246#issuecomment-2823722806 I forgot about this, but I would like to try to get it merged. I'll try to fix the merge conflicts from https://github.com/apache/arrow/pull/43782 in the next couple of days. Then I mi
