Re: [I] [SqlServer - ACERO_MAIN] Run c++ sql-server issue [arrow]

2024-08-07 Thread via GitHub
kou commented on issue #43611: URL: https://github.com/apache/arrow/issues/43611#issuecomment-2275060615 What is `acero_main - sql server`? Could you share more information? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] GH-43536: [Python] Declare support for free-threading in Cython [arrow]

2024-08-07 Thread via GitHub
kou commented on code in PR #43606: URL: https://github.com/apache/arrow/pull/43606#discussion_r1708730276 ## cpp/cmake_modules/UseCython.cmake: ## @@ -184,4 +184,24 @@ function(cython_add_module _name pyx_target_name generated_files) add_dependencies(${_name} ${pyx_target_n

Re: [PR] GH-38255: [Java] Implement Flight SQL Bulk Ingestion [arrow]

2024-08-07 Thread via GitHub
eramitmittal commented on PR #43551: URL: https://github.com/apache/arrow/pull/43551#issuecomment-2275035249 Thanks for looking into the integration failures. They passed on my local machine.

Re: [I] [Archery] "archery docker" should preferable call `docker compose`, not `docker-compose` [arrow]

2024-08-07 Thread via GitHub
kou commented on issue #43608: URL: https://github.com/apache/arrow/issues/43608#issuecomment-2275011212 Issue resolved by pull request 43586 https://github.com/apache/arrow/pull/43586

Re: [PR] GH-43608: [CI][Archery] Prefer `docker compose` over `docker-compose` [arrow]

2024-08-07 Thread via GitHub
kou merged PR #43586: URL: https://github.com/apache/arrow/pull/43586
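The preference named in this PR title can be illustrated with a small, hedged sketch. It is not Archery's actual code: the function name `pick_compose_command` and its injectable `which`/`run` parameters are assumptions made for illustration. It probes for the `docker compose` plugin first and falls back to the standalone `docker-compose` binary only when the plugin is unavailable.

```python
import shutil
import subprocess


def pick_compose_command(which=shutil.which, run=subprocess.run):
    """Return the argv prefix for Compose, preferring the `docker compose` plugin.

    Hypothetical helper, not taken from Archery. `which` and `run` are
    injectable so the selection logic can be exercised without Docker.
    """
    docker = which("docker")
    if docker is not None:
        try:
            # `docker compose version` exits with 0 only if the plugin is installed.
            probe = run([docker, "compose", "version"], capture_output=True)
            if probe.returncode == 0:
                return [docker, "compose"]
        except OSError:
            # The docker binary could not be executed; try the legacy tool.
            pass
    legacy = which("docker-compose")
    if legacy is not None:
        return [legacy]
    raise RuntimeError("neither `docker compose` nor `docker-compose` was found")
```

A caller would then append its own arguments to the returned prefix, for example `pick_compose_command() + ["build", "some-service"]`, where the service name is a placeholder.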

Re: [PR] ci: work around download-artifact bug [arrow-adbc]

2024-08-07 Thread via GitHub
lidavidm merged PR #2067: URL: https://github.com/apache/arrow-adbc/pull/2067

Re: [PR] GH-38255: [Java] Implement Flight SQL Bulk Ingestion [arrow]

2024-08-07 Thread via GitHub
lidavidm commented on PR #43551: URL: https://github.com/apache/arrow/pull/43551#issuecomment-2274854913 It appears this fails the actual integration tests, though

Re: [I] [Java] Handle `offset` field from `ArrowArray` when `BufferImportTypeVisitor` imports offset buffer [arrow]

2024-08-07 Thread via GitHub
vibhatha commented on issue #42156: URL: https://github.com/apache/arrow/issues/42156#issuecomment-2274802932 @lidavidm @bkietz I have been working on a solution for this issue in the Java side, but I later realized that we don't have any C data interface tests to evaluate something like sl

Re: [PR] GH-43502: [Java] Fix Java JNI / AMD64 manylinux2014 Java JNI test not test dataset module [arrow]

2024-08-07 Thread via GitHub
vibhatha commented on PR #43503: URL: https://github.com/apache/arrow/pull/43503#issuecomment-2274799383 > I think the previous java-jars successful run proves it's fine. However the Java JNI CI job is failing. Did we want to fix the TestCsvFragment options in this PR? Wasn't the obje

Re: [I] Streaming LIST data over ADBC [arrow-adbc]

2024-08-07 Thread via GitHub
paleolimbot commented on issue #2066: URL: https://github.com/apache/arrow-adbc/issues/2066#issuecomment-2274793528 I find installing the development ADBC Python packages rather difficult; however, I do have a build set up and ran your example (thanks!) against the postgres driver at main a

Re: [PR] feat: Support for Binaryview and StringView types [arrow-nanoarrow]

2024-08-07 Thread via GitHub
paleolimbot commented on PR #367: URL: https://github.com/apache/arrow-nanoarrow/pull/367#issuecomment-2274753382 I'll take a stab at this in the next few days to see what is required beyond this PR. You probably just need a "build by buffer" and "consume by buffer" level of support (as opp

Re: [PR] feat: Add IPC stream writing [arrow-nanoarrow]

2024-08-07 Thread via GitHub
paleolimbot commented on code in PR #571: URL: https://github.com/apache/arrow-nanoarrow/pull/571#discussion_r1708350957 ## src/nanoarrow/common/inline_buffer.h: ## @@ -451,8 +451,8 @@ static inline void ArrowBitClear(uint8_t* bits, int64_t i) { } static inline void ArrowBit

Re: [I] API for encoding/decoding ParquetMetadata with more control [arrow-rs]

2024-08-07 Thread via GitHub
adriangb commented on issue #6002: URL: https://github.com/apache/arrow-rs/issues/6002#issuecomment-2274687194 To alleviate concerns about the API design, could we keep that private? That is, we'd have: 1. `MetadataLoader`: the existing public async API for loading metadata. 2. `Parque

Re: [PR] GH-43454: [C++][Python] Add Opaque canonical extension type [arrow]

2024-08-07 Thread via GitHub
lidavidm commented on code in PR #43458: URL: https://github.com/apache/arrow/pull/43458#discussion_r1708234668 ## cpp/src/arrow/extension/opaque_test.cc: ## @@ -0,0 +1,189 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreemen

Re: [PR] Add time dictionary coercions [arrow-rs]

2024-08-07 Thread via GitHub
adriangb commented on PR #6208: URL: https://github.com/apache/arrow-rs/pull/6208#issuecomment-2274588332 Sorry I'm not sure I follow. Are you worried about the verbosity of handling all of these cases?

[PR] minor: Suggest take on interleave docs [arrow-rs]

2024-08-07 Thread via GitHub
gstvg opened a new pull request, #6210: URL: https://github.com/apache/arrow-rs/pull/6210 # Which issue does this PR close? None # Rationale for this change Interleave docs suggests itself instead of take # What changes are included in this PR? Suggest take

Re: [I] Update Flight Implementation Status [arrow-rs]

2024-08-07 Thread via GitHub
Michael-J-Ward commented on issue #4337: URL: https://github.com/apache/arrow-rs/issues/4337#issuecomment-2274514436 triage: Should this be completed now that the arrow docs have been updated? https://github.com/apache/arrow/pull/39959

Re: [PR] feat(object_store): add `PermissionDenied` variant to top-level error [arrow-rs]

2024-08-07 Thread via GitHub
kyle-mccarthy commented on code in PR #6194: URL: https://github.com/apache/arrow-rs/pull/6194#discussion_r1708082719 ## object_store/src/client/retry.rs: ## @@ -86,6 +86,12 @@ impl Error { path, source: Box::new(self), }, +

Re: [PR] feat: Support for Binaryview and StringView types [arrow-nanoarrow]

2024-08-07 Thread via GitHub
JayjeetAtGithub commented on PR #367: URL: https://github.com/apache/arrow-nanoarrow/pull/367#issuecomment-2274481221 Hi @jorisvandenbossche @paleolimbot We are trying to support interop between `arrow::StringViewArray` / `arrow::BinaryViewArray` and `cudf::column` in `libcudf`. But since,

Re: [PR] Add time dictionary coercions [arrow-rs]

2024-08-07 Thread via GitHub
tustvold commented on PR #6208: URL: https://github.com/apache/arrow-rs/pull/6208#issuecomment-2274468149 I wonder if we want to reduce the codegen by coercing to the corresponding primitive array type and then coercing this to a dictionary

Re: [I] Make it easy to write parquet to object_store -- Implement `AsyncFileWriter` for a type that implements `obj_store::MultipartUpload` for `AsyncArrowWriter` [arrow-rs]

2024-08-07 Thread via GitHub
tustvold closed issue #6200: Make it easy to write parquet to object_store -- Implement `AsyncFileWriter` for a type that implements `obj_store::MultipartUpload` for `AsyncArrowWriter` URL: https://github.com/apache/arrow-rs/issues/6200

Re: [I] Make it easy to write parquet to object_store -- Implement `AsyncFileWriter` for a type that implements `obj_store::MultipartUpload` for `AsyncArrowWriter` [arrow-rs]

2024-08-07 Thread via GitHub
tustvold commented on issue #6200: URL: https://github.com/apache/arrow-rs/issues/6200#issuecomment-2274461346 I believe this is a duplicate of https://github.com/apache/arrow-rs/issues/5766

Re: [PR] GH-43487: [Python] Sanitize Python reference handling in UDF implementation [arrow]

2024-08-07 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #43557: URL: https://github.com/apache/arrow/pull/43557#issuecomment-2274412136 After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 1f2479908323daff3b08d1d585517239cae637d2. There were no

Re: [PR] feat: Add IPC stream writing [arrow-nanoarrow]

2024-08-07 Thread via GitHub
bkietz commented on PR #571: URL: https://github.com/apache/arrow-nanoarrow/pull/571#issuecomment-2274377726 [Valgrind](https://github.com/apache/arrow-nanoarrow/actions/runs/10291048514/job/28482448351?pr=571#step:11:1049) is complaining about an invalid read. I can reproduce this locally

Re: [I] Reproducible segfaults closing CSVWriter [arrow]

2024-08-07 Thread via GitHub
jpfeuffer commented on issue #43604: URL: https://github.com/apache/arrow/issues/43604#issuecomment-2274286765 By the way, even if I do not limit the chunks, the written file is always corrupt starting from a few thousand lines in, with lines being mixed together etc. Is something w

Re: [I] Reproducible segfaults closing CSVWriter [arrow]

2024-08-07 Thread via GitHub
jpfeuffer commented on issue #43604: URL: https://github.com/apache/arrow/issues/43604#issuecomment-2274282871 Unfortunately nothing small enough to upload. And since it is not working I cannot produce a subset easily. Data is from here. Warning 15GB. https://ftp.enamine.net/downl

Re: [PR] GH-43427: [C++][Parquet] Deprecate ColumnChunk::file_offset field and no longer write Metadata at end of Chunk [arrow]

2024-08-07 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #43428: URL: https://github.com/apache/arrow/pull/43428#issuecomment-2274281909 After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 9b584547c768fb09b2e33b4ad8797cf45c3b3b97. There were no

Re: [PR] feat: Add IPC stream writing [arrow-nanoarrow]

2024-08-07 Thread via GitHub
bkietz commented on code in PR #571: URL: https://github.com/apache/arrow-nanoarrow/pull/571#discussion_r1707778173 ## src/nanoarrow/nanoarrow_ipc.h: ## @@ -491,8 +501,59 @@ ArrowErrorCode ArrowIpcOutputStreamInitBuffer(struct ArrowIpcOutputStream* strea /// close_on_release a

Re: [PR] feat: Add IPC stream writing [arrow-nanoarrow]

2024-08-07 Thread via GitHub
bkietz commented on code in PR #571: URL: https://github.com/apache/arrow-nanoarrow/pull/571#discussion_r1707769703 ## src/nanoarrow/common/inline_types.h: ## @@ -314,7 +314,7 @@ static inline void ArrowErrorSetString(struct ArrowError* error, const char* src #define NANOARROW

Re: [PR] GH-43519: [Python] Set up CI for Python 3.13 [arrow]

2024-08-07 Thread via GitHub
pitrou commented on PR #43539: URL: https://github.com/apache/arrow/pull/43539#issuecomment-2274197678 Ok, while the wheels build successfully, the wheel tests fail due to an unavailable Docker image: ``` DEBUG:archery:Executing `['docker', 'build', '--build-arg', 'BUILDKIT_INLINE_CAC

Re: [PR] feat: Add IPC stream writing [arrow-nanoarrow]

2024-08-07 Thread via GitHub
paleolimbot commented on code in PR #571: URL: https://github.com/apache/arrow-nanoarrow/pull/571#discussion_r1707654816 ## src/nanoarrow/common/inline_types.h: ## @@ -314,7 +314,7 @@ static inline void ArrowErrorSetString(struct ArrowError* error, const char* src #define NANO

[PR] fix: Correctly handle take on dense union of a single selected type [arrow-rs]

2024-08-07 Thread via GitHub
gstvg opened a new pull request, #6209: URL: https://github.com/apache/arrow-rs/pull/6209 # Which issue does this PR close? Closes #6206. # What changes are included in this PR? At #5873, I naively called `filter_primitive` instead of `filter` to avoid arcing and downcas

Re: [PR] GH-43519: [Python] Set up CI for Python 3.13 [arrow]

2024-08-07 Thread via GitHub
github-actions[bot] commented on PR #43539: URL: https://github.com/apache/arrow/pull/43539#issuecomment-2274052340 Revision: f33cf7dac9d656bad26d00b89b84d1e5a65adfe9 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1e4ed8b442](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-43519: [Python] Set up CI for Python 3.13 [arrow]

2024-08-07 Thread via GitHub
pitrou commented on PR #43539: URL: https://github.com/apache/arrow/pull/43539#issuecomment-2274045652 @github-actions crossbow submit -g wheel -g python

Re: [I] [Go] Fatal Error in pqarrow.writeDenseArrow : invalid pointer [arrow]

2024-08-07 Thread via GitHub
zeroshade commented on issue #43596: URL: https://github.com/apache/arrow/issues/43596#issuecomment-2274001663 How reproducible is this? If it's reliably reproducible then we should be able to debug it

Re: [PR] GH-43502: [Java] Fix Java JNI / AMD64 manylinux2014 Java JNI test not test dataset module [arrow]

2024-08-07 Thread via GitHub
danepitkin commented on PR #43503: URL: https://github.com/apache/arrow/pull/43503#issuecomment-2273982550 I think the previous java-jars successful run proves it's fine. However the Java JNI CI job is failing. Did we want to fix the TestCsvFragment options in this PR?

Re: [I] Using a take kernel on a dense union can result in reaching "unreachable" code [arrow-rs]

2024-08-07 Thread via GitHub
gstvg commented on issue #6206: URL: https://github.com/apache/arrow-rs/issues/6206#issuecomment-2273962554 Thanks for your report @mhilton, I will investigate

Re: [PR] feat(object_store): add `PermissionDenied` variant to top-level error [arrow-rs]

2024-08-07 Thread via GitHub
kyle-mccarthy commented on code in PR #6194: URL: https://github.com/apache/arrow-rs/pull/6194#discussion_r1707485784 ## object_store/src/lib.rs: ## @@ -1274,6 +1274,16 @@ pub enum Error { #[snafu(display("Operation not yet implemented."))] NotImplemented, +#[sna

Re: [PR] GH-43532: [Python] Remove usage of deprecated pkg_resources in setup.py [arrow]

2024-08-07 Thread via GitHub
tlm365 commented on code in PR #43602: URL: https://github.com/apache/arrow/pull/43602#discussion_r1707477004 ## dev/release/02-source-test.rb: ## @@ -84,6 +84,7 @@ def test_csharp_git_commit_information def test_python_version source Dir.chdir("#{@tag_name_no_rc}/p

Re: [PR] GH-43519: [Python] Set up CI for Python 3.13 [arrow]

2024-08-07 Thread via GitHub
lysnikolaou commented on PR #43539: URL: https://github.com/apache/arrow/pull/43539#issuecomment-2273927338 > You seem to be basing this PR on a slightly old version of git main, can you merge or rebase and then push again? Done.

[PR] Add time dictionary coercions [arrow-rs]

2024-08-07 Thread via GitHub
adriangb opened a new pull request, #6208: URL: https://github.com/apache/arrow-rs/pull/6208 (no comment)

Re: [PR] GH-43588: [Python] Allow tuple for rename columns [arrow]

2024-08-07 Thread via GitHub
0x26res commented on code in PR #43609: URL: https://github.com/apache/arrow/pull/43609#discussion_r1707421423 ## cpp/submodules/parquet-testing: ## Review Comment: I've rolled it back. I don't know why it always gets in a bad state. Something to do with my set up.

Re: [PR] GH-43519: [Python] Set up CI for Python 3.13 [arrow]

2024-08-07 Thread via GitHub
pitrou commented on PR #43539: URL: https://github.com/apache/arrow/pull/43539#issuecomment-2273872141 @lysnikolaou You seem to be basing this PR on a slightly old version of git main, can you merge or rebase and then push again?

Re: [PR] GH-43588: [Python] Allow tuple for rename columns [arrow]

2024-08-07 Thread via GitHub
mapleFU commented on code in PR #43609: URL: https://github.com/apache/arrow/pull/43609#discussion_r1707399810 ## cpp/submodules/parquet-testing: ## Review Comment: Why is this updated?

Re: [PR] GH-43588: [Python] Allow tuple for rename columns [arrow]

2024-08-07 Thread via GitHub
github-actions[bot] commented on PR #43609: URL: https://github.com/apache/arrow/pull/43609#issuecomment-2273865087 :warning: GitHub issue #43588 **has been automatically assigned in GitHub** to PR creator.

[PR] GH-43588: [Python] Allow tuple for rename columns [arrow]

2024-08-07 Thread via GitHub
0x26res opened a new pull request, #43609: URL: https://github.com/apache/arrow/pull/43609 ### Rationale for this change Backward compatibility issue. ### What changes are included in this PR? Allow tuple (as well as list) in `Table.rename_columns` ### Are thes
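The change described here, accepting a tuple as well as a list, can be sketched in plain Python. This is not pyarrow's implementation: `normalize_column_names` and its `num_columns` parameter are hypothetical names chosen for illustration, and the real `Table.rename_columns` may validate its input differently.

```python
from collections.abc import Sequence


def normalize_column_names(names, num_columns):
    """Hypothetical helper: accept a list or tuple of names and return a list.

    A bare string is rejected explicitly because it is itself a sequence
    and would otherwise be split into single characters.
    """
    if isinstance(names, (str, bytes)) or not isinstance(names, Sequence):
        raise TypeError("names must be a list or tuple of strings")
    names = list(names)
    if len(names) != num_columns:
        raise ValueError(f"expected {num_columns} names, got {len(names)}")
    return names
```

Checking against `Sequence` rather than `list` is what lets both `["a", "b"]` and `("a", "b")` pass through the same code path.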

[PR] feat(c/driver/postgresql): Support queries that bind parameters and return a result [arrow-adbc]

2024-08-07 Thread via GitHub
paleolimbot opened a new pull request, #2065: URL: https://github.com/apache/arrow-adbc/pull/2065 Work in progress! Some possible steps: - Separate the `BindStream` (since we'll need it in the result array reader or an extension of the result array reader) - Make the `BindStream` m

Re: [PR] GH-43532: [Python] Remove usage of deprecated pkg_resources in setup.py [arrow]

2024-08-07 Thread via GitHub
jorisvandenbossche commented on code in PR #43602: URL: https://github.com/apache/arrow/pull/43602#discussion_r1707382155 ## dev/release/02-source-test.rb: ## @@ -84,6 +84,7 @@ def test_csharp_git_commit_information def test_python_version source Dir.chdir("#{@tag_n

Re: [PR] GH-43536: [Python] Declare support for free-threading in Cython [arrow]

2024-08-07 Thread via GitHub
pitrou commented on code in PR #43606: URL: https://github.com/apache/arrow/pull/43606#discussion_r1707380788 ## python/CMakeLists.txt: ## @@ -855,6 +856,10 @@ set(CYTHON_FLAGS "${CYTHON_FLAGS}" "--warning-errors") # undocumented Cython feature. set(CYTHON_FLAGS "${CYTHON_FLAG

Re: [PR] GH-43608: [CI][Archery] Prefer `docker compose` over `docker-compose` [arrow]

2024-08-07 Thread via GitHub
pitrou commented on PR #43586: URL: https://github.com/apache/arrow/pull/43586#issuecomment-2273849251 The remaining CI failures look unrelated to these changes.

Re: [PR] GH-43536: [Python] Declare support for free-threading in Cython [arrow]

2024-08-07 Thread via GitHub
lysnikolaou commented on code in PR #43606: URL: https://github.com/apache/arrow/pull/43606#discussion_r1707379512 ## python/CMakeLists.txt: ## @@ -855,6 +856,10 @@ set(CYTHON_FLAGS "${CYTHON_FLAGS}" "--warning-errors") # undocumented Cython feature. set(CYTHON_FLAGS "${CYTHON

Re: [PR] GH-43536: [Python] Declare support for free-threading in Cython [arrow]

2024-08-07 Thread via GitHub
pitrou commented on code in PR #43606: URL: https://github.com/apache/arrow/pull/43606#discussion_r1707375497 ## python/CMakeLists.txt: ## @@ -855,6 +856,10 @@ set(CYTHON_FLAGS "${CYTHON_FLAGS}" "--warning-errors") # undocumented Cython feature. set(CYTHON_FLAGS "${CYTHON_FLAG

Re: [PR] GH-17682: [C++][Python] Bool8 Extension Type Implementation [arrow]

2024-08-07 Thread via GitHub
pitrou commented on code in PR #43488: URL: https://github.com/apache/arrow/pull/43488#discussion_r1707362173 ## cpp/src/arrow/extension/bool8.cc: ## @@ -0,0 +1,58 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See

Re: [PR] GH-41719: [C++][Parquet] Cannot read encrypted parquet datasets via _metadata file [arrow]

2024-08-07 Thread via GitHub
rok commented on code in PR #41821: URL: https://github.com/apache/arrow/pull/41821#discussion_r1707371998 ## cpp/src/parquet/file_writer.cc: ## @@ -567,6 +567,44 @@ void WriteEncryptedFileMetadata(const FileMetaData& file_metadata, } } +void WriteEncryptedMetadataFile( +

Re: [PR] GH-17682: [C++][Python] Bool8 Extension Type Implementation [arrow]

2024-08-07 Thread via GitHub
pitrou commented on PR #43488: URL: https://github.com/apache/arrow/pull/43488#issuecomment-2273840962 @joellubi We'll need to update the "Canonical Extension types" table at the end of https://arrow.apache.org/docs/status.html#data-types

Re: [PR] GH-43598: [C++][Parquet] Parquet Metadata Printer supports print sort-columns [arrow]

2024-08-07 Thread via GitHub
wgtmac commented on PR #43599: URL: https://github.com/apache/arrow/pull/43599#issuecomment-2273835814 Should we break a new line for each item in `SortColumns`? It currently looks a little bit lengthy to me.

Re: [PR] GH-41719: [C++][Parquet] Cannot read encrypted parquet datasets via _metadata file [arrow]

2024-08-07 Thread via GitHub
rok commented on code in PR #41821: URL: https://github.com/apache/arrow/pull/41821#discussion_r1707350630 ## cpp/src/parquet/file_writer.cc: ## @@ -567,6 +567,44 @@ void WriteEncryptedFileMetadata(const FileMetaData& file_metadata, } } +void WriteEncryptedMetadataFile( +

Re: [PR] GH-43536: [Python] Do not use borrowed references APIs under free-threaded CPython [arrow]

2024-08-07 Thread via GitHub
pitrou commented on code in PR #43540: URL: https://github.com/apache/arrow/pull/43540#discussion_r1707355299 ## python/pyarrow/src/arrow/python/platform.h: ## @@ -24,7 +24,9 @@ // to mean Py_ssize_t (defining this to suppress deprecation warning) #define PY_SSIZE_T_CLEAN -#

Re: [PR] GH-43536: [Python] Do not use borrowed references APIs under free-threaded CPython [arrow]

2024-08-07 Thread via GitHub
pitrou commented on code in PR #43540: URL: https://github.com/apache/arrow/pull/43540#discussion_r1707356156 ## python/pyarrow/src/arrow/python/vendored/pythoncapi_compat.h: ## @@ -0,0 +1,1519 @@ +// Header file providing new C API functions to old Python versions. +// +// File

Re: [PR] GH-43532: [Python] Remove usage of deprecated pkg_resources in setup.py [arrow]

2024-08-07 Thread via GitHub
pitrou commented on code in PR #43602: URL: https://github.com/apache/arrow/pull/43602#discussion_r1707343803 ## dev/release/02-source-test.rb: ## @@ -84,6 +84,7 @@ def test_csharp_git_commit_information def test_python_version source Dir.chdir("#{@tag_name_no_rc}/p

Re: [PR] GH-43519: [Python] Set up CI for Python 3.13 [arrow]

2024-08-07 Thread via GitHub
github-actions[bot] commented on PR #43539: URL: https://github.com/apache/arrow/pull/43539#issuecomment-2273814598 Revision: 1d0335462093ed0cf103b42cb15f67571dead37c Submitted crossbow builds: [ursacomputing/crossbow @ actions-cf57ab8608](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-43532: [Python] Remove usage of deprecated pkg_resources in setup.py [arrow]

2024-08-07 Thread via GitHub
pitrou commented on code in PR #43602: URL: https://github.com/apache/arrow/pull/43602#discussion_r1707340035 ## python/setup.py: ## @@ -18,6 +18,7 @@ # under the License. import contextlib +import numpy Review Comment: It would be nice if NumPy was not necessary simply

Re: [PR] GH-43536: [Python] Declare support for free-threading in Cython [arrow]

2024-08-07 Thread via GitHub
lysnikolaou commented on code in PR #43606: URL: https://github.com/apache/arrow/pull/43606#discussion_r1707332850 ## python/CMakeLists.txt: ## @@ -855,6 +856,10 @@ set(CYTHON_FLAGS "${CYTHON_FLAGS}" "--warning-errors") # undocumented Cython feature. set(CYTHON_FLAGS "${CYTHON

Re: [PR] GH-43519: [Python] Set up CI for Python 3.13 [arrow]

2024-08-07 Thread via GitHub
pitrou commented on PR #43539: URL: https://github.com/apache/arrow/pull/43539#issuecomment-2273806951 @github-actions crossbow submit -g wheel -g python

Re: [PR] GH-43608: [CI][Archery] Prefer `docker compose` over `docker-compose` [arrow]

2024-08-07 Thread via GitHub
github-actions[bot] commented on PR #43586: URL: https://github.com/apache/arrow/pull/43586#issuecomment-2273791088 :warning: GitHub issue #43608 **has been automatically assigned in GitHub** to PR creator.

Re: [PR] GH-43605: [Go][Parquet] Recover from panic in file reader [arrow]

2024-08-07 Thread via GitHub
mapleFU commented on PR #43607: URL: https://github.com/apache/arrow/pull/43607#issuecomment-2273789279 https://github.com/apache/parquet-testing/pull/48 has some corrupt Parquet files that can be used, but it might take a few days to check and merge that. Maybe it's easier to make

Re: [PR] GH-43605: [Go][Parquet] Recover from panic in file reader [arrow]

2024-08-07 Thread via GitHub
don4get commented on PR #43607: URL: https://github.com/apache/arrow/pull/43607#issuecomment-2273784101 > Do we know why it was causing the panic in the first place? Can you add the corrupted file to https://github.com/apache/arrow-testing and then use that for a test? Sorry but I ca

Re: [PR] GH-43519: [Python] Set up CI for Python 3.13 [arrow]

2024-08-07 Thread via GitHub
github-actions[bot] commented on PR #43539: URL: https://github.com/apache/arrow/pull/43539#issuecomment-2273780244 Revision: 704078819080761fe2809277819f21d9560500fb Submitted crossbow builds: [ursacomputing/crossbow @ actions-ac5fc4d60e](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-43519: [Python] Set up CI for Python 3.13 [arrow]

2024-08-07 Thread via GitHub
jorisvandenbossche commented on PR #43539: URL: https://github.com/apache/arrow/pull/43539#issuecomment-2273775439 @github-actions crossbow submit wheel-manylinux-2-28-cp313-amd64

Re: [PR] GH-43519: [Python] Set up CI for Python 3.13 [arrow]

2024-08-07 Thread via GitHub
pitrou commented on code in PR #43539: URL: https://github.com/apache/arrow/pull/43539#discussion_r1707290196 ## ci/scripts/install_python.sh: ## @@ -46,7 +47,7 @@ full_version=${versions[$2]} if [ $platform = "macOS" ]; then echo "Downloading Python installer..." -i

Re: [PR] GH-43519: [Python] Set up CI for Python 3.13 [arrow]

2024-08-07 Thread via GitHub
pitrou commented on code in PR #43539: URL: https://github.com/apache/arrow/pull/43539#discussion_r1707291497 ## ci/docker/python-wheel-manylinux.dockerfile: ## @@ -103,7 +103,11 @@ RUN vcpkg install \ # Configure Python for applications running in the bash shell of this Docke

Re: [PR] GH-43388: [Python] Give precedence to pycapsule interface in pa.schema(..) [arrow]

2024-08-07 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #43486: URL: https://github.com/apache/arrow/pull/43486#issuecomment-2273765733 After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit b852b720d43547136e39fbcd9773ba11fe625909. There were no

Re: [PR] GH-43532: [Python] Remove usage of deprecated pkg_resources in setup.py [arrow]

2024-08-07 Thread via GitHub
github-actions[bot] commented on PR #43602: URL: https://github.com/apache/arrow/pull/43602#issuecomment-2273756350 :warning: GitHub issue #43532 **has been automatically assigned in GitHub** to PR creator.

Re: [PR] GH-43536: [Python] Declare support for free-threading in Cython [arrow]

2024-08-07 Thread via GitHub
lysnikolaou commented on PR #43606: URL: https://github.com/apache/arrow/pull/43606#issuecomment-2273752798 > That said, I think it would be better to first add a CI build with a nogil Python. That makes sense. I'm already working on this and, at the same time, waiting for #43539 to

Re: [PR] GH-43536: [Python] Declare support for free-threading in Cython [arrow]

2024-08-07 Thread via GitHub
lysnikolaou commented on code in PR #43606: URL: https://github.com/apache/arrow/pull/43606#discussion_r1707270719 ## cpp/cmake_modules/UseCython.cmake: ## @@ -184,4 +184,24 @@ function(cython_add_module _name pyx_target_name generated_files) add_dependencies(${_name} ${pyx_

Re: [PR] GH-40592: [C++][Parquet] Implement SizeStatistics [arrow]

2024-08-07 Thread via GitHub
wgtmac commented on PR #40594: URL: https://github.com/apache/arrow/pull/40594#issuecomment-2273746348 @emkornfield @mapleFU Thanks for the feedback! I haven't addressed all comments from @pitrou yet. Will let you know once ready for review again.

Re: [PR] GH-43536: [Python] Declare support for free-threading in Cython [arrow]

2024-08-07 Thread via GitHub
pitrou commented on code in PR #43606: URL: https://github.com/apache/arrow/pull/43606#discussion_r1707267265 ## python/CMakeLists.txt: ## @@ -855,6 +856,10 @@ set(CYTHON_FLAGS "${CYTHON_FLAGS}" "--warning-errors") # undocumented Cython feature. set(CYTHON_FLAGS "${CYTHON_FLAG

Re: [PR] GH-43536: [Python] Declare support for free-threading in Cython [arrow]

2024-08-07 Thread via GitHub
pitrou commented on PR #43606: URL: https://github.com/apache/arrow/pull/43606#issuecomment-2273745693 @lysnikolaou Thanks for the PR. That said, I think it would be better to first add a CI build with a nogil Python.

Re: [PR] GH-43536: [Python] Declare support for free-threading in Cython [arrow]

2024-08-07 Thread via GitHub
lysnikolaou commented on code in PR #43606: URL: https://github.com/apache/arrow/pull/43606#discussion_r1707262903 ## python/CMakeLists.txt: ## @@ -855,6 +856,10 @@ set(CYTHON_FLAGS "${CYTHON_FLAGS}" "--warning-errors") # undocumented Cython feature. set(CYTHON_FLAGS "${CYTHON

Re: [PR] GH-43536: [Python] Declare support for free-threading in Cython [arrow]

2024-08-07 Thread via GitHub
pitrou commented on code in PR #43606: URL: https://github.com/apache/arrow/pull/43606#discussion_r1707260352 ## cpp/cmake_modules/UseCython.cmake: ## @@ -184,4 +184,24 @@ function(cython_add_module _name pyx_target_name generated_files) add_dependencies(${_name} ${pyx_targe

Re: [PR] Support `StringView` and `BinaryView` in CDataInterface [arrow-rs]

2024-08-07 Thread via GitHub
alamb commented on PR #6171: URL: https://github.com/apache/arrow-rs/pull/6171#issuecomment-2273737965 This change will be included in https://github.com/apache/arrow-rs/issues/6016

Re: [PR] GH-43536: [Python] Declare support for free-threading in Cython [arrow]

2024-08-07 Thread via GitHub
pitrou commented on code in PR #43606: URL: https://github.com/apache/arrow/pull/43606#discussion_r1707259437 ## python/CMakeLists.txt: ## @@ -855,6 +856,10 @@ set(CYTHON_FLAGS "${CYTHON_FLAGS}" "--warning-errors") # undocumented Cython feature. set(CYTHON_FLAGS "${CYTHON_FLAG

Re: [PR] GH-43605: [Go][Parquet] Recover from panic in file reader [arrow]

2024-08-07 Thread via GitHub
zeroshade commented on PR #43607: URL: https://github.com/apache/arrow/pull/43607#issuecomment-2273735308 Do we know why it was causing the panic in the first place? Can you add the corrupted file to https://github.com/apache/arrow-testing and then use that for a test?

Re: [I] Reproducible segfaults closing CSVWriter [arrow]

2024-08-07 Thread via GitHub
jorisvandenbossche commented on issue #43604: URL: https://github.com/apache/arrow/issues/43604#issuecomment-2273734113 @jpfeuffer thanks for the report! Could you also share a (snippet of) the csv file (or a similar one with dummy data) to see if we can reproduce this?

Re: [I] [Python] Table.rename_columns should accept tuple [arrow]

2024-08-07 Thread via GitHub
jorisvandenbossche commented on issue #43588: URL: https://github.com/apache/arrow/issues/43588#issuecomment-2273736328 If this worked before, let's indeed restore that. PR certainly welcome!

Re: [I] [Python] AttributeError: module 'pyarrow.compute' has no attribute 'equal' [arrow]

2024-08-07 Thread via GitHub
zeroRains commented on issue #43580: URL: https://github.com/apache/arrow/issues/43580#issuecomment-2273729659 > Do you have something like a Dockerfile where we could reproduce? I am not entirely sure why your cmake is unable to find `utf8proc`, from what I can see on the logs you are usin

Re: [PR] GH-41719: [C++][Parquet] Cannot read encrypted parquet datasets via _metadata file [arrow]

2024-08-07 Thread via GitHub
wgtmac commented on code in PR #41821: URL: https://github.com/apache/arrow/pull/41821#discussion_r1707247671 ## cpp/src/parquet/file_writer.cc: ## @@ -567,6 +567,44 @@ void WriteEncryptedFileMetadata(const FileMetaData& file_metadata, } } +void WriteEncryptedMetadataFile

Re: [PR] GH-43519: [Python] Set up CI for Python 3.13 [arrow]

2024-08-07 Thread via GitHub
jorisvandenbossche commented on code in PR #43539: URL: https://github.com/apache/arrow/pull/43539#discussion_r1707241258 ## dev/tasks/python-wheels/github.linux.yml: ## @@ -37,6 +37,11 @@ jobs: ARCHERY_USE_DOCKER_CLI: 0 {% endif %} PYTHON: "{{ python_versio

Re: [PR] DRAFT: [CI] Use `docker compose` instead of `docker-compose` [arrow]

2024-08-07 Thread via GitHub
github-actions[bot] commented on PR #43586: URL: https://github.com/apache/arrow/pull/43586#issuecomment-2273706568 Revision: 02cdd0ac270811fa3540b2dadabddc23eb49a121 Submitted crossbow builds: [ursacomputing/crossbow @ actions-e169e51a80](https://github.com/ursacomputing/crossbow/bra

Re: [PR] DRAFT: [CI] Use `docker compose` instead of `docker-compose` [arrow]

2024-08-07 Thread via GitHub
pitrou commented on PR #43586: URL: https://github.com/apache/arrow/pull/43586#issuecomment-2273701541 @github-actions crossbow submit *hdfs*

Re: [PR] DRAFT: [CI] Use `docker compose` instead of `docker-compose` [arrow]

2024-08-07 Thread via GitHub
github-actions[bot] commented on PR #43586: URL: https://github.com/apache/arrow/pull/43586#issuecomment-2273698990 Revision: 2ac225a0b9d074f831df53fc195e29482453fab2 Submitted crossbow builds: [ursacomputing/crossbow @ actions-2515f0589f](https://github.com/ursacomputing/crossbow/bra

Re: [PR] DRAFT: [CI] Use `docker compose` instead of `docker-compose` [arrow]

2024-08-07 Thread via GitHub
github-actions[bot] commented on PR #43586: URL: https://github.com/apache/arrow/pull/43586#issuecomment-2273697478 Revision: 2ac225a0b9d074f831df53fc195e29482453fab2 Submitted crossbow builds: [ursacomputing/crossbow @ actions-bd221565a1](https://github.com/ursacomputing/crossbow/bra

Re: [PR] DRAFT: [CI] Use `docker compose` instead of `docker-compose` [arrow]

2024-08-07 Thread via GitHub
pitrou commented on PR #43586: URL: https://github.com/apache/arrow/pull/43586#issuecomment-2273686851 @github-actions crossbow submit -g nightly-tests

Re: [PR] DRAFT: [CI] Use `docker compose` instead of `docker-compose` [arrow]

2024-08-07 Thread via GitHub
pitrou commented on PR #43586: URL: https://github.com/apache/arrow/pull/43586#issuecomment-2273687051 @github-actions crossbow submit -g nightly-packaging

Re: [PR] DRAFT: [CI] Use `docker compose` instead of `docker-compose` [arrow]

2024-08-07 Thread via GitHub
pitrou commented on PR #43586: URL: https://github.com/apache/arrow/pull/43586#issuecomment-2273681156 @kou How about we deprecate ARCHERY_USE_DOCKER_CLI in another issue/PR? This doesn't seem necessary right now.

Re: [PR] DRAFT: [CI] Use `docker compose` instead of `docker-compose` [arrow]

2024-08-07 Thread via GitHub
pitrou commented on code in PR #43586: URL: https://github.com/apache/arrow/pull/43586#discussion_r1707194652 ## dev/archery/archery/docker/cli.py: ## @@ -47,18 +47,24 @@ def _execute(self, *args, **kwargs): help="Specify Arrow source directory.") @click.option('

Re: [PR] feat: Add IPC stream writing [arrow-nanoarrow]

2024-08-07 Thread via GitHub
bkietz commented on code in PR #571: URL: https://github.com/apache/arrow-nanoarrow/pull/571#discussion_r1707171256 ## src/nanoarrow/ipc/writer.c: ## @@ -150,3 +155,160 @@ ArrowErrorCode ArrowIpcOutputStreamInitFile(struct ArrowIpcOutputStream* stream, stream->private_data =
