Re: [I] [C++] Real null count for DictionaryArray [arrow]
bkietz closed issue #38457: [C++] Real null count for DictionaryArray URL: https://github.com/apache/arrow/issues/38457 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[I] [R] altrep.cpp contains errors in printf syntax [arrow]
paleolimbot opened a new issue, #38893: URL: https://github.com/apache/arrow/issues/38893

### Describe the bug, including details regarding any error messages, version, and platform.

On CI and some nightlies and CRAN we get:

```
Version: 14.0.0
Check: whether package can be installed
Result: WARN
  Found the following significant warnings:
    altrep.cpp:155:55: warning: format specifies type 'int' but the argument has type 'R_xlen_t' (aka 'long') [-Wformat]
    altrep.cpp:160:15: warning: format specifies type 'int' but the argument has type 'int64_t' (aka 'long') [-Wformat]
    altrep.cpp:160:44: warning: format specifies type 'int' but the argument has type 'int64_t' (aka 'long') [-Wformat]
    altrep.cpp:822:16: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
  See ‘/data/gannet/ripley/R/packages/tests-clang/arrow.Rcheck/00install.out’ for details.
  * used C++ compiler: ‘clang version 17.0.5’
Flavor: r-devel-linux-x86_64-fedora-clang
```

The code pathways here are infrequently taken but do need to be fixed.

### Component(s)

R
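As a rough illustration of the two warning classes in the log above (this is not the actual altrep.cpp code), the sketch below shows one common way to address them in C++: printing 64-bit integers with the `PRId64` macro from `<cinttypes>`, and passing variable text through a literal `"%s"` format instead of using it as the format string. The helper names `FormatRange` and `FormatText` are hypothetical and exist only for this example.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a half-open range of 64-bit indices.
// A value of a platform-dependent type such as R_xlen_t could be passed
// after an explicit static_cast<std::int64_t>(...) (an assumption here).
std::string FormatRange(std::int64_t start, std::int64_t length) {
  char buf[64];
  // PRId64 expands to the conversion specifier that matches int64_t
  // on the current platform, which avoids the -Wformat mismatch.
  std::snprintf(buf, sizeof(buf), "[%" PRId64 ", %" PRId64 ")", start,
                start + length);
  return std::string(buf);
}

// Hypothetical helper: copies arbitrary text through a literal format.
// Using "%s" keeps any '%' characters in `text` from being interpreted
// as conversion specifiers, which is what -Wformat-security guards against.
std::string FormatText(const std::string& text) {
  char buf[128];
  std::snprintf(buf, sizeof(buf), "%s", text.c_str());
  return std::string(buf);
}
```

Under these assumptions, `FormatRange(5, 10)` produces `[5, 15)` and `FormatText("100% done")` returns the text unchanged.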
[I] go/adbc/driver/snowflake: improve bulk ingestion speed [arrow-adbc]
lidavidm opened a new issue, #1327: URL: https://github.com/apache/arrow-adbc/issues/1327 https://lists.apache.org/thread/9m33spjv3x9sd3r3wwnwhgm5m27k5wgq Because we use giant sets of bind parameters, things are very slow. We should instead try [`PUT`...`COPY INTO`](https://docs.snowflake.com/en/user-guide/data-load-local-file-system).
Re: [I] [Go] IPC LZ4 Decompressor does not return bytes to pool on Close [arrow]
zeroshade closed issue #38728: [Go] IPC LZ4 Decompressor does not return bytes to pool on Close URL: https://github.com/apache/arrow/issues/38728
Re: [I] Reading parquet file behavior change from 13.0.0 to 14.0.0 [arrow]
pitrou closed issue #38577: Reading parquet file behavior change from 13.0.0 to 14.0.0 URL: https://github.com/apache/arrow/issues/38577
Re: [I] [C++] Parquet reading performance regressions [arrow]
pitrou closed issue #38432: [C++] Parquet reading performance regressions URL: https://github.com/apache/arrow/issues/38432
Re: [PR] GH-38738: [C++] Add fuzz regression file [arrow-testing]
pitrou merged PR #98: URL: https://github.com/apache/arrow-testing/pull/98
Re: [I] [Go] Enable GC checks [arrow]
pitrou closed issue #38824: [Go] Enable GC checks URL: https://github.com/apache/arrow/issues/38824
Re: [I] [C++] Check for valid variadic buffer counts [arrow]
bkietz closed issue #38738: [C++] Check for valid variadic buffer counts URL: https://github.com/apache/arrow/issues/38738
Re: [I] [Python] Random crash when running PyArrow from several threads [arrow]
ovcharenko closed issue #34097: [Python] Random crash when running PyArrow from several threads URL: https://github.com/apache/arrow/issues/34097
[I] Extension type not preserved on reading from the serialized schema [arrow]
candiduslynx opened a new issue, #38891: URL: https://github.com/apache/arrow/issues/38891 ### Describe the bug, including details regarding any error messages, version, and platform. I have an extension type `json` that is defined as an extension over binary. When I serialize the schema I see that the field is correctly serialized as `Extension(json, BINARY)`. When reading the serialized schema back, the field type for the `json` column becomes `BINARY`. I suppose it's because https://github.com/apache/arrow/blame/main/java/format/src/main/java/org/apache/arrow/flatbuf/Type.java doesn't support reading extension types properly. ### Component(s) Java
[I] GCC complains about redundant move [arrow]
MrJia1997 opened a new issue, #38889: URL: https://github.com/apache/arrow/issues/38889 ### Describe the bug, including details regarding any error messages, version, and platform. OS: Ubuntu 22.04 GCC: 11.4.0 GCC complains about redundant move when `-Wall`, `-Wextra`, and `-Werror` are enabled, on the following line for example. Does anybody have the same issue? https://github.com/apache/arrow/blob/eb5de184a7e5d02f98526332ace54250417bd232/cpp/src/arrow/array/builder_base.h#L334 ### Component(s) C++
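For readers unfamiliar with this warning class: GCC's `-Wredundant-move` covers cases where `std::move` in a return statement is unnecessary because the operand is already treated as an rvalue, such as returning a by-value parameter. The sketch below is a generic illustration using the hypothetical names `Decorate` and `Holder`; it is not the code at the linked line of builder_base.h, and whether that line matches this pattern is not confirmed here.

```cpp
#include <string>
#include <utility>

// Hypothetical example, unrelated to Arrow's builder classes.
std::string Decorate(std::string name) {
  name += "_col";
  // Writing `return std::move(name);` would also compile, but the move is
  // redundant: a by-value parameter named in a return statement is already
  // treated as an rvalue, so GCC may flag the explicit std::move.
  return name;
}

// An explicit std::move is still meaningful when returning something that
// is not a local variable or parameter, such as a data member.
struct Holder {
  std::string value;
  std::string Release() { return std::move(value); }
};
```

With `-Werror` enabled, dropping the redundant `std::move` is usually the smallest change that silences the diagnostic without altering behavior.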
[I] arrow::read_parquet col_select warning deprecated [arrow]
mikemdr1 opened a new issue, #38890: URL: https://github.com/apache/arrow/issues/38890 ### Describe the enhancement requested When you pass an external vector to the 'col_select' parameter, you get the following message (from the tidyselect library) ![2023-11-27 02_50_31-RStudio](https://github.com/apache/arrow/assets/36677796/a5cf91da-a7fc-44d0-8f9f-2f62d30a2de7) This is because I use a character vector as is ![2023-11-27 02_50_56-RStudio](https://github.com/apache/arrow/assets/36677796/9feb545f-6ced-4fca-ac3d-00c72f2a4537) The warning disappears when you wrap the character vector in 'tidyselect::all_of'. ![2023-11-27 03_00_26-RStudio](https://github.com/apache/arrow/assets/36677796/d5172f47-1d58-415a-bd5e-8810cb6e13ea) It would be useful if this were the default behavior ### Component(s) R
[I] [R] Include and LIB flags are missing on macOS [arrow]
assignUser opened a new issue, #38902: URL: https://github.com/apache/arrow/issues/38902 ### Describe the bug, including details regarding any error messages, version, and platform. So far this has only happened on CRAN; see the log at https://www.r-project.org/nosvn/R.check/r-release-macos-x86_64/arrow-00check.html I was unable to reproduce it locally; it is likely an issue with pkg-config (or without it?) ### Component(s) R
Re: [I] [R] Update NEWS.md for 14.0.0.1 [arrow]
thisisnic closed issue #38864: [R] Update NEWS.md for 14.0.0.1 URL: https://github.com/apache/arrow/issues/38864
[I] [R] [Docs] Improve documentation of `col_types` [arrow]
assignUser opened a new issue, #38903: URL: https://github.com/apache/arrow/issues/38903 ### Describe the enhancement requested In a recent [SO](https://stackoverflow.com/questions/77557377/how-to-convert-int-to-double-when-using-arrow-to-read-in-multiple-csvs-with-open) question about using partial schemas in `open_dataset` (which is possible using `col_types`), even a seasoned arrow user did not know about the proper solution. The docs for open_dataset hide a lot of the more specialized options behind a `...`, and it's not obvious how to find those, as the linked dataset factory page also doesn't show all the possibilities. Some are explained in the specialized wrapper functions like https://arrow.apache.org/docs/r/reference/open_delim_dataset.html or https://arrow.apache.org/docs/r/reference/csv_convert_options.html but even there col_types is not described in a way that makes it obvious that it is to be used to pass in partial schemas. At a minimum, the doc strings for `col_types` should make the intended use case clear; ideally we should link to the detailed descriptions from `open_dataset` or find another way to document the possible options more visibly. ### Component(s) Documentation, R
[I] [R] Update R News for 14.0.2 [arrow]
assignUser opened a new issue, #38904: URL: https://github.com/apache/arrow/issues/38904 ### Describe the enhancement requested Update/consolidate news for 14.0.2 ### Component(s) R
Re: [I] Spelling errors identified by check-spelling [arrow]
domoritz closed issue #38900: Spelling errors identified by check-spelling URL: https://github.com/apache/arrow/issues/38900
Re: [I] Spelling errors identified by check-spelling [arrow]
assignUser closed issue #38900: Spelling errors identified by check-spelling URL: https://github.com/apache/arrow/issues/38900
[I] [R] Fix failing windows build on CI [arrow]
paleolimbot opened a new issue, #38906: URL: https://github.com/apache/arrow/issues/38906

### Describe the bug, including details regarding any error messages, version, and platform.

Follow-up on https://github.com/apache/arrow/pull/38894#issuecomment-1828488227 : The Windows R-devel build is failing on CI because one of our dependencies (cpp11) has an invalid format string in a rarely-used function in its header ( https://github.com/r-lib/cpp11/pull/345 ) and we convert all warnings to errors (apparently, according to the output). To fix this we can:

- Not test r-devel on Windows for every commit (and ensure we do so on nightly, which I'm pretty sure we do, but worth checking before removing this job)
- Find where we convert warnings to errors and stop doing that
- Check to see if some other flag is causing this to occur (maybe `-O0`?)

I don't think we can or should wait for the PR linked above to merge upstream since this is causing CI to fail for all R PRs.

cc @assignUser

### Component(s)

R
Re: [I] [R] altrep.cpp contains errors in printf syntax [arrow]
assignUser closed issue #38893: [R] altrep.cpp contains errors in printf syntax URL: https://github.com/apache/arrow/issues/38893
[I] Spelling errors identified by check-spelling [arrow]
jsoref opened a new issue, #38905: URL: https://github.com/apache/arrow/issues/38905 ### Describe the bug, including details regarding any error messages, version, and platform. The [check-spelling action](https://github.com/marketplace/actions/check-spelling) enabled me to identify some misspellings... The misspellings have been reported at https://github.com/jsoref/apache-arrow/actions/runs/7008141484/attempts/1#summary-19063787889 ### Component(s) C#, C++, C++ - Gandiva, FlightRPC, GLib, Go, Java, Parquet, Python, R, Ruby
[I] Spelling errors identified by check-spelling [arrow]
jsoref opened a new issue, #38900: URL: https://github.com/apache/arrow/issues/38900 ### Describe the bug, including details regarding any error messages, version, and platform. The [check-spelling action](https://github.com/marketplace/actions/check-spelling) enabled me to identify some misspellings... The misspellings have been reported at https://github.com/jsoref/apache-arrow/actions/runs/7008141484/attempts/1#summary-19063787889 ### Component(s) C#, C++, C++ - Gandiva, FlightRPC, GLib, Go, Java, Parquet, Python, R, Ruby
Re: [I] Extension type not preserved on reading from the serialized schema [arrow]
candiduslynx closed issue #38891: Extension type not preserved on reading from the serialized schema URL: https://github.com/apache/arrow/issues/38891
Re: [I] [C++] GCC complains about redundant move [arrow]
MrJia1997 closed issue #38889: [C++] GCC complains about redundant move URL: https://github.com/apache/arrow/issues/38889
[I] [C++] Stop using xsimd in public header [arrow]
kou opened a new issue, #38907: URL: https://github.com/apache/arrow/issues/38907 ### Describe the enhancement requested `cpp/src/arrow/util/bpacking_simd*_generated.h` are installed and they include `xsimd/xsimd.hpp`. ### Component(s) C++
Re: [I] [Go] Data race of calling `GetToTimeFunc` of fixed timestamp data type [arrow]
bkietz closed issue #38795: [Go] Data race of calling `GetToTimeFunc` of fixed timestamp data type URL: https://github.com/apache/arrow/issues/38795
Re: [I] [C++][JSON] kMaxParserNumRows Value Increase/Removal [arrow]
bkietz closed issue #28994: [C++][JSON] kMaxParserNumRows Value Increase/Removal URL: https://github.com/apache/arrow/issues/28994
Re: [I] [Go] Adding Avro OCF reader [arrow]
zeroshade closed issue #36760: [Go] Adding Avro OCF reader URL: https://github.com/apache/arrow/issues/36760
[PR] GH-38738: [C++] Add fuzz regression file [arrow-testing]
bkietz opened a new pull request, #98: URL: https://github.com/apache/arrow-testing/pull/98 (no comment)
Re: [PR] GH-38738: [C++] Add fuzz regression file [arrow-testing]
bkietz commented on PR #98: URL: https://github.com/apache/arrow-testing/pull/98#issuecomment-1828141498 I have only added one of the two regression cases since they both trigger the same bug
Re: [I] adbc_driver_manager.OperationalError: UNKNOWN: [Snowflake] arrow/ipc: unknown error while reading: cannot allocate memory [arrow-adbc]
zeroshade closed issue #1283: adbc_driver_manager.OperationalError: UNKNOWN: [Snowflake] arrow/ipc: unknown error while reading: cannot allocate memory URL: https://github.com/apache/arrow-adbc/issues/1283
[I] [Packaging] Drop support for Ubuntu 23.04 [arrow]
kou opened a new issue, #38909: URL: https://github.com/apache/arrow/issues/38909 ### Describe the enhancement requested It will reach EOL in 2024-01 and our next major release will not happen this year. ### Component(s) Packaging
[I] [Java][FlightSQL] In authentication, how does flight server get the IP/Hostname of ADBC client [arrow]
xinyiZzz opened a new issue, #38911: URL: https://github.com/apache/arrow/issues/38911

### Describe the usage question you have. Please include as many useful details as possible.

I currently implement `CallHeaderAuthenticator` in Java and override `AuthResult authenticate(CallHeaders incomingHeaders)`.

```
xxx implements CallHeaderAuthenticator {
    @Override
    public AuthResult authenticate(CallHeaders incomingHeaders) {
    }
}
```

Then, `BasicCallHeaderAuthenticator` can decode the username and password from `incomingHeaders`. I implement `BasicCallHeaderAuthenticator.CredentialValidator` and override `AuthResult validate(String username, String password)` to complete authentication of the username and password.

```
xxx implements BasicCallHeaderAuthenticator.CredentialValidator {
    @Override
    public AuthResult validate(String username, String password) {
    }
}
```

But in my database (Apache Doris), the client's IP also takes part in authentication, so is there a way to get the ADBC client's IP in the Flight server? Thanks for your help.

### Component(s)

FlightRPC
[I] [Python] How to add one level of nesting to flat table? [arrow]
sergun opened a new issue, #38912: URL: https://github.com/apache/arrow/issues/38912 ### Describe the usage question you have. Please include as many useful details as possible. I have a flat pa.Table: ``` table = pa.table({"a": [1, 2, 3], "b": [3, 4, 5]}) ``` How can I create a new table from this one by adding one level of nesting? I want a new table with only one column "c" of type struct with two fields "a" and "b", keeping the data from the original table. ### Component(s) Python