Re: [I] csharp/src/Drivers/Apache: Rename this project and add NuGet creation [arrow-adbc]

2024-04-17 Thread via GitHub
CurtHagenlocher commented on issue #1726: URL: https://github.com/apache/arrow-adbc/issues/1726#issuecomment-2061287753 Splitting them is probably the right thing to do in a number of respects, but I wouldn't want to duplicate the code and for that we'd still need a single shared

Re: [I] DECIMAL regex in csv reader does not accept positive exponent specifier [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5648: URL: https://github.com/apache/arrow-rs/issues/5648#issuecomment-2061310335 `label_issue.py` automatically added labels {'arrow'} from #5649 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [I] Serialize `FixedSizeBinary` as HEX with JSON Writer [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5620: URL: https://github.com/apache/arrow-rs/issues/5620#issuecomment-2061309995 `label_issue.py` automatically added labels {'arrow'} from #5622

Re: [I] Release arrow-rs / parquet version (`51.0.0` or `50.1.0`) [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5453: URL: https://github.com/apache/arrow-rs/issues/5453#issuecomment-2061308240 `label_issue.py` automatically added labels {'parquet'} from #5293

Re: [I] Unnecessary ownership makes it harder to use `RecordBatch::schema` and `Schema::try_merge` than it needs to be [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5342: URL: https://github.com/apache/arrow-rs/issues/5342#issuecomment-2061307885 `label_issue.py` automatically added labels {'parquet'} from #5448

Re: [I] In Object Store, return version & etag on multipart put. [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5443: URL: https://github.com/apache/arrow-rs/issues/5443#issuecomment-2061308165 `label_issue.py` automatically added labels {'object-store'} from #5500

Re: [I] Panic when displaying debug the results via log::info in the browser. [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5599: URL: https://github.com/apache/arrow-rs/issues/5599#issuecomment-2061309795 `label_issue.py` automatically added labels {'arrow'} from #5603

Re: [I] Cannot access example Flight SQL Server from dbeaver [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5540: URL: https://github.com/apache/arrow-rs/issues/5540#issuecomment-2061309044 `label_issue.py` automatically added labels {'arrow-flight'} from #5543

Re: [PR] GH-41256: [Format][Docs] Add a canonical extension type specification for a generic text data format (e.g. JSON) [arrow]

2024-04-17 Thread via GitHub
pitrou commented on PR #41257: URL: https://github.com/apache/arrow/pull/41257#issuecomment-2061320269 I'm sorry, but this doesn't seem to make sense to me. We already have a "generic text data format": it's the STRING type. What we're talking about is a JSON extension type.

Re: [I] Cannot access example Flight SQL Server from dbeaver [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5540: URL: https://github.com/apache/arrow-rs/issues/5540#issuecomment-2061309005 `label_issue.py` automatically added labels {'arrow'} from #5543

Re: [I] `parquet / Build wasm32 (pull_request)` CI check failing on main [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5565: URL: https://github.com/apache/arrow-rs/issues/5565#issuecomment-2061309453 `label_issue.py` automatically added labels {'parquet'} from #5567

Re: [I] Inconsistent Multipart Nomenclature [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5526: URL: https://github.com/apache/arrow-rs/issues/5526#issuecomment-2061308935 `label_issue.py` automatically added labels {'object-store'} from #5500

[PR] Prepare object_store 0.10.0 [arrow-datafusion]

2024-04-17 Thread via GitHub
tustvold opened a new pull request, #10116: URL: https://github.com/apache/arrow-datafusion/pull/10116 ## Which issue does this PR close? Closes #. ## Rationale for this change Updates to object_store 0.10.0 which contains some breaking changes ##

Re: [PR] GH-41112: [C++] Clean up unused parameter warnings [arrow]

2024-04-17 Thread via GitHub
pitrou commented on PR #4: URL: https://github.com/apache/arrow/pull/4#issuecomment-2061400572 @felipecrv This happens in the Arrow headers, so changing our own build flags wouldn't change anything here, AFAICT. That said, if other people think these warnings do need to be

Re: [PR] GH-37929: [Python] begin moving static settings to pyproject.toml [arrow]

2024-04-17 Thread via GitHub
pitrou commented on PR #41041: URL: https://github.com/apache/arrow/pull/41041#issuecomment-2061428897 Note that we're not married to setuptools_scm. If we find out that something else would work better for us, then we can switch to it. Found this comparison using a quick search:

Re: [PR] feat(rust): add public abstract API and dummy driver implementation [arrow-adbc]

2024-04-17 Thread via GitHub
alexandreyc commented on code in PR #1725: URL: https://github.com/apache/arrow-adbc/pull/1725#discussion_r1569030653 ## rust2/core/src/lib.rs: ## @@ -0,0 +1,520 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See

Re: [PR] GH-38255: [Go][C++] Implement Flight SQL Bulk Ingestion [arrow]

2024-04-17 Thread via GitHub
zeroshade merged PR #38385: URL: https://github.com/apache/arrow/pull/38385

Re: [PR] GH-41256: [Format][Docs] Add a canonical extension type specification for a generic text data format (e.g. JSON) [arrow]

2024-04-17 Thread via GitHub
pitrou commented on PR #41257: URL: https://github.com/apache/arrow/pull/41257#issuecomment-2061524818 Ok, can we reboot this into an actual JSON extension type?

Re: [I] c: define async version of ArrowArrayStream [arrow-adbc]

2024-04-17 Thread via GitHub
zeroshade commented on issue #811: URL: https://github.com/apache/arrow-adbc/issues/811#issuecomment-2061532782 Thanks for the sketch @CurtHagenlocher I plan on starting to tackle this more formally in the next couple weeks.

Re: [PR] GH-41256: [Format][Docs] Add a canonical extension type specification for JSON [arrow]

2024-04-17 Thread via GitHub
rok commented on code in PR #41257: URL: https://github.com/apache/arrow/pull/41257#discussion_r1569069770 ## docs/source/format/CanonicalExtensions.rst: ## @@ -251,6 +251,25 @@ Variable shape tensor Values inside each **data** tensor element are stored in

Re: [PR] GH-41256: [Format][Docs] Add a canonical extension type specification for JSON [arrow]

2024-04-17 Thread via GitHub
rok commented on code in PR #41257: URL: https://github.com/apache/arrow/pull/41257#discussion_r1569070238 ## docs/source/format/CanonicalExtensions.rst: ## @@ -251,6 +251,25 @@ Variable shape tensor Values inside each **data** tensor element are stored in

Re: [PR] feat(rust): add public abstract API and dummy driver implementation [arrow-adbc]

2024-04-17 Thread via GitHub
mbrobbel commented on code in PR #1725: URL: https://github.com/apache/arrow-adbc/pull/1725#discussion_r1569070430 ## rust2/core/src/lib.rs: ## @@ -0,0 +1,520 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the

Re: [PR] GH-40959: [JS] Store Timestamps in 64 bits [arrow]

2024-04-17 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #40960: URL: https://github.com/apache/arrow/pull/40960#issuecomment-2061637006 After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit 18876b2f266eaa2b7172a18c1518f22489e612c8. There were 2

Re: [PR] GH-41258: [C#][Integration] Fix comparison of sliced validity buffers with non-zero offsets [arrow]

2024-04-17 Thread via GitHub
paleolimbot merged PR #41259: URL: https://github.com/apache/arrow/pull/41259

Re: [I] [C#][Integration] csharp integration tests are failing on main [arrow]

2024-04-17 Thread via GitHub
paleolimbot commented on issue #41258: URL: https://github.com/apache/arrow/issues/41258#issuecomment-2061667745 Issue resolved by pull request 41259 https://github.com/apache/arrow/pull/41259

Re: [PR] GH-40563: [Go] Unable to JSON marshal float64 arrays which contain a NaN value [arrow]

2024-04-17 Thread via GitHub
zeroshade commented on code in PR #41109: URL: https://github.com/apache/arrow/pull/41109#discussion_r1569124909 ## go/arrow/array/float16.go: ## @@ -87,10 +87,18 @@ func (a *Float16) GetOneForMarshal(i int) interface{} { func (a *Float16) MarshalJSON() ([]byte, error) {

Re: [PR] GH-37720: [Java][FlightSQL] Implement stateless prepared statements [arrow]

2024-04-17 Thread via GitHub
stevelorddremio commented on code in PR #41237: URL: https://github.com/apache/arrow/pull/41237#discussion_r1569130119 ## java/flight/flight-sql/src/test/java/org/apache/arrow/flight/sql/example/FlightSqlStatelessExample.java: ## @@ -0,0 +1,246 @@ +/* + * Licensed to the Apache

Re: [I] Snowflake 0.11 driver incorrectly wrapping table names in double quotes [arrow-adbc]

2024-04-17 Thread via GitHub
zeroshade commented on issue #1721: URL: https://github.com/apache/arrow-adbc/issues/1721#issuecomment-2061711718 @davlee1972 Thanks for the examples here, it's definitely a problem if we're breaking when someone passes quotes in as part of the string. I'll take a look and see what our

Re: [I] feat: adding HDFS support in the object_store crate [arrow-rs]

2024-04-17 Thread via GitHub
milenkovicm commented on issue #5638: URL: https://github.com/apache/arrow-rs/issues/5638#issuecomment-2061251934 I'm not sure what @alamb and @tustvold think; would it make sense to have your repo in datafusion-contrib, @Kimahriman?

Re: [PR] ARROW-17255: [C++][Parquet] Add JSON canonical extension type [arrow]

2024-04-17 Thread via GitHub
rok commented on PR #13901: URL: https://github.com/apache/arrow/pull/13901#issuecomment-2061271290 I've opened a more generic extension type proposal: https://github.com/apache/arrow/pull/41257/files.

Re: [PR] GH-40964: [CI][Archery] Archery linking should also check for undefined symbols Linux [arrow]

2024-04-17 Thread via GitHub
pitrou commented on PR #40520: URL: https://github.com/apache/arrow/pull/40520#issuecomment-2061262854 @vibhatha Could you please add unit tests for the various parsing functions here?

Re: [I] Create `ArrowReaderMetadata` from externalized metadata [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5582: URL: https://github.com/apache/arrow-rs/issues/5582#issuecomment-2061309659 `label_issue.py` automatically added labels {'parquet'} from #5583

Re: [I] Improve Retry Coverage [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5608: URL: https://github.com/apache/arrow-rs/issues/5608#issuecomment-2061309857 `label_issue.py` automatically added labels {'object-store'} from #5609

Re: [I] FixedSizeListArray::try_new Errors on Entirely Null Array With Size 0 [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5614: URL: https://github.com/apache/arrow-rs/issues/5614#issuecomment-2061309916 `label_issue.py` automatically added labels {'arrow'} from #5612

Re: [I] [object_store] minor bug: typos present in local variable [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5628: URL: https://github.com/apache/arrow-rs/issues/5628#issuecomment-2061310088 `label_issue.py` automatically added labels {'object-store'} from #5629

Re: [I] Add support for the "r+" datatype in the C Data interface / `RunArray` [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5631: URL: https://github.com/apache/arrow-rs/issues/5631#issuecomment-2061310175 `label_issue.py` automatically added labels {'arrow'} from #5632

Re: [PR] GH-39645: [Python] Fix read_table for encrypted parquet [arrow]

2024-04-17 Thread via GitHub
tolleybot commented on PR #39438: URL: https://github.com/apache/arrow/pull/39438#issuecomment-2061298553 Do we think it's possible to get this onto the Arrow 16 roadmap? @wgtmac @pitrou

Re: [I] `parquet / Build wasm32 (pull_request)` CI check failing on main [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5565: URL: https://github.com/apache/arrow-rs/issues/5565#issuecomment-2061309400 `label_issue.py` automatically added labels {'arrow'} from #5525

Re: [I] Documentation fix: example in parquet/src/column/mod.rs is incorrect [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5560: URL: https://github.com/apache/arrow-rs/issues/5560#issuecomment-2061309312 `label_issue.py` automatically added labels {'parquet'} from #5561

[I] Split up arrow_cast::cast [arrow-rs]

2024-04-17 Thread via GitHub
tustvold opened a new issue, #5125: URL: https://github.com/apache/arrow-rs/issues/5125 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** The cast kernel is currently implemented as a single large module. This has gotten

Re: [PR] GH-41256: [Format][Docs] Add a canonical extension type specification for a generic text data format (e.g. JSON) [arrow]

2024-04-17 Thread via GitHub
pitrou commented on PR #41257: URL: https://github.com/apache/arrow/pull/41257#issuecomment-2061337916 By the way, the notion of a "generic" extension type meant to represent arbitrary third-party data is similar to the ["other / unsupported data

[PR] Prepare object_store 0.10.0 [arrow-rs]

2024-04-17 Thread via GitHub
tustvold opened a new pull request, #5658: URL: https://github.com/apache/arrow-rs/pull/5658 # Which issue does this PR close? Closes #5647 # Rationale for this change # What changes are included in this PR? # Are there any user-facing

Re: [PR] feat(rust): add public abstract API and dummy driver implementation [arrow-adbc]

2024-04-17 Thread via GitHub
alexandreyc commented on code in PR #1725: URL: https://github.com/apache/arrow-adbc/pull/1725#discussion_r1569003762 ## rust2/core/src/lib.rs: ## @@ -0,0 +1,520 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See

Re: [PR] perf: Better bit packing-unpacking algorithms [arrow-nanoarrow]

2024-04-17 Thread via GitHub
mapleFU commented on PR #326: URL: https://github.com/apache/arrow-nanoarrow/pull/326#issuecomment-2061546855 https://github.com/apache/arrow/issues/40845 I'm investigating improving unpacking in Arrow; do you have any advice here?

Re: [PR] feat: Add manual test to calculate spark builtin functions coverage [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
comphead commented on code in PR #263: URL: https://github.com/apache/arrow-datafusion-comet/pull/263#discussion_r1569054389 ## doc/spark_coverage_agg.txt: ## @@ -0,0 +1,9 @@ ++---+--+---+ +|result |reason

Re: [PR] fix(docs): Fix typo in documentation for `ArrowSchemaSetTypeUnion()` [arrow-nanoarrow]

2024-04-17 Thread via GitHub
paleolimbot merged PR #432: URL: https://github.com/apache/arrow-nanoarrow/pull/432

Re: [PR] GH-41179: [Docs] Documentation for Dissociated IPC Protocol [arrow]

2024-04-17 Thread via GitHub
pitrou commented on code in PR #41180: URL: https://github.com/apache/arrow/pull/41180#discussion_r1569076394 ## docs/source/format/DissociatedIPC.rst: ## @@ -0,0 +1,335 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license

Re: [PR] GH-41179: [Docs] Documentation for Dissociated IPC Protocol [arrow]

2024-04-17 Thread via GitHub
pitrou commented on code in PR #41180: URL: https://github.com/apache/arrow/pull/41180#discussion_r1569075868 ## docs/source/format/DissociatedIPC.rst: ## @@ -0,0 +1,335 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license

Re: [PR] GH-41179: [Docs] Documentation for Dissociated IPC Protocol [arrow]

2024-04-17 Thread via GitHub
pitrou commented on code in PR #41180: URL: https://github.com/apache/arrow/pull/41180#discussion_r1569073986 ## docs/source/format/DissociatedIPC.rst: ## @@ -0,0 +1,335 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license

[PR] feat: `DataFrame` supports unnesting multiple columns [arrow-datafusion]

2024-04-17 Thread via GitHub
jonahgao opened a new pull request, #10118: URL: https://github.com/apache/arrow-datafusion/pull/10118 ## Which issue does this PR close? N/A ## Rationale for this change #10044 has enabled SQL to support unnesting multiple columns, this PR adds the same functionality to

Re: [PR] feat: Add manual test to calculate spark builtin functions coverage [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
comphead commented on code in PR #263: URL: https://github.com/apache/arrow-datafusion-comet/pull/263#discussion_r1569085132 ## doc/spark_coverage.txt: ## @@ -0,0 +1,421 @@

Re: [PR] GH-41186: [C++][Parquet][Doc] Denote PARQUET:field_id in parquet.rst [arrow]

2024-04-17 Thread via GitHub
pitrou commented on code in PR #41187: URL: https://github.com/apache/arrow/pull/41187#discussion_r1569084382 ## docs/source/cpp/parquet.rst: ## @@ -571,6 +571,19 @@ More specifically, Parquet C++ supports: * EncryptionWithFooterKey and EncryptionWithColumnKey modes. *

Re: [PR] GH-41186: [C++][Parquet][Doc] Denote PARQUET:field_id in parquet.rst [arrow]

2024-04-17 Thread via GitHub
pitrou commented on code in PR #41187: URL: https://github.com/apache/arrow/pull/41187#discussion_r1569084922 ## docs/source/cpp/parquet.rst: ## @@ -571,6 +571,19 @@ More specifically, Parquet C++ supports: * EncryptionWithFooterKey and EncryptionWithColumnKey modes. *

Re: [PR] perf: Better bit packing-unpacking algorithms [arrow-nanoarrow]

2024-04-17 Thread via GitHub
WillAyd commented on PR #326: URL: https://github.com/apache/arrow-nanoarrow/pull/326#issuecomment-2061681249 Hey @mapleFU - that's great. I didn't read through everything you posted in that issue but the research is impressive, and certainly beyond what I was able to accomplish here

Re: [PR] GH-37720: [Java][FlightSQL] Implement stateless prepared statements [arrow]

2024-04-17 Thread via GitHub
stevelorddremio commented on code in PR #41237: URL: https://github.com/apache/arrow/pull/41237#discussion_r1569126617 ## java/flight/flight-sql/src/test/java/org/apache/arrow/flight/sql/example/FlightSqlStatelessExample.java: ## @@ -0,0 +1,246 @@ +/* + * Licensed to the Apache

Re: [PR] GH-40964: [CI][Archery] Archery linking should also check for undefined symbols Linux [arrow]

2024-04-17 Thread via GitHub
pitrou commented on code in PR #40520: URL: https://github.com/apache/arrow/pull/40520#discussion_r1568836864 ## dev/archery/archery/linking.py: ## @@ -61,9 +63,83 @@ def list_dependency_names(self): names.append(name) return names +def

Re: [PR] feat(rust): add public abstract API and dummy driver implementation [arrow-adbc]

2024-04-17 Thread via GitHub
mbrobbel commented on code in PR #1725: URL: https://github.com/apache/arrow-adbc/pull/1725#discussion_r1568857747 ## rust2/core/src/lib.rs: ## @@ -0,0 +1,520 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the

Re: [PR] feat(rust): add public abstract API and dummy driver implementation [arrow-adbc]

2024-04-17 Thread via GitHub
mbrobbel commented on code in PR #1725: URL: https://github.com/apache/arrow-adbc/pull/1725#discussion_r1568861423 ## rust2/core/src/lib.rs: ## @@ -0,0 +1,520 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the

[PR] Refactor `UnwrapCastInComparison` to remove `Expr` clones [arrow-datafusion]

2024-04-17 Thread via GitHub
peter-toth opened a new pull request, #10115: URL: https://github.com/apache/arrow-datafusion/pull/10115 ## Which issue does this PR close? Part of https://github.com/apache/arrow-datafusion/issues/9637, follow-up to https://github.com/apache/arrow-datafusion/pull/10087. ##

Re: [PR] GH-41256: [Format][Docs] Add a canonical extension type specification for a generic text data format (e.g. JSON) [arrow]

2024-04-17 Thread via GitHub
pitrou commented on PR #41257: URL: https://github.com/apache/arrow/pull/41257#issuecomment-2061350504 @lidavidm See the PR submitted here.

Re: [PR] GH-41256: [Format][Docs] Add a canonical extension type specification for a generic text data format (e.g. JSON) [arrow]

2024-04-17 Thread via GitHub
rok commented on PR #41257: URL: https://github.com/apache/arrow/pull/41257#issuecomment-2061349391 Right, the `media_type` information would enable end users to interpret how data is encoded. Alternatively we could have one extension per encoding (JSON, YAML, etc) or say that JSON is

Re: [PR] GH-41256: [Format][Docs] Add a canonical extension type specification for a generic text data format (e.g. JSON) [arrow]

2024-04-17 Thread via GitHub
pitrou commented on PR #41257: URL: https://github.com/apache/arrow/pull/41257#issuecomment-2061355446 JSON is not to consider as a _file format_ here but as a _data type_. Several database engines such as PostgreSQL allow for first-class JSON columns and that's what we're trying to convey

[PR] Update proc-macro2 requirement from =1.0.80 to =1.0.81 [arrow-rs]

2024-04-17 Thread via GitHub
dependabot[bot] opened a new pull request, #5659: URL: https://github.com/apache/arrow-rs/pull/5659 Updates the requirements on [proc-macro2](https://github.com/dtolnay/proc-macro2) to permit the latest version. Release notes Sourced from

Re: [PR] GH-41256: [Format][Docs] Add a canonical extension type specification for a generic text data format (e.g. JSON) [arrow]

2024-04-17 Thread via GitHub
rok commented on PR #41257: URL: https://github.com/apache/arrow/pull/41257#issuecomment-2061386906 > JSON is not to consider as a _file format_ here but as a _data type_. Several database engines such as PostgreSQL allow for first-class JSON columns and that's what we're trying to convey

Re: [PR] WIP: [Release] Verify release-16.0.0-rc0 [arrow]

2024-04-17 Thread via GitHub
ursabot commented on PR #41235: URL: https://github.com/apache/arrow/pull/41235#issuecomment-2061455714 Benchmark runs are scheduled for commit 6a28035c2b49b432dc63f5ee7524d76b4ed2d762. Watch https://buildkite.com/apache-arrow and https://conbench.ursa.dev for updates. A comment will be

Re: [PR] WIP: [Release] Verify release-16.0.0-rc0 [arrow]

2024-04-17 Thread via GitHub
raulcd commented on PR #41235: URL: https://github.com/apache/arrow/pull/41235#issuecomment-2061455346 @ursabot please benchmark

Re: [PR] feat(rust): add public abstract API and dummy driver implementation [arrow-adbc]

2024-04-17 Thread via GitHub
lidavidm commented on code in PR #1725: URL: https://github.com/apache/arrow-adbc/pull/1725#discussion_r1569013032 ## rust2/core/src/lib.rs: ## @@ -0,0 +1,520 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the

Re: [PR] GH-41179: [Docs] Documentation for Dissociated IPC Protocol [arrow]

2024-04-17 Thread via GitHub
github-actions[bot] commented on PR #41180: URL: https://github.com/apache/arrow/pull/41180#issuecomment-2061522462 Revision: b1f03e1832488b3f2e75a9f397e12e0fb9171a93 Submitted crossbow builds: [ursacomputing/crossbow @

Re: [PR] GH-41256: [Format][Docs] Add a canonical extension type specification for JSON [arrow]

2024-04-17 Thread via GitHub
rok commented on PR #41257: URL: https://github.com/apache/arrow/pull/41257#issuecomment-2061561335 @pitrou this is now fully JSONified. :)

Re: [PR] feat: Add support for Utf8Type and TimeStamp in Parquet statistics [arrow-datafusion]

2024-04-17 Thread via GitHub
matthewmturner commented on PR #9129: URL: https://github.com/apache/arrow-datafusion/pull/9129#issuecomment-2061595462 @alamb @Weijun-H I have plans to pick up #8295 next week unless you both think that this can be completed before then (I haven't looked yet to see whether it makes sense

Re: [PR] GH-41179: [Docs] Documentation for Dissociated IPC Protocol [arrow]

2024-04-17 Thread via GitHub
pitrou commented on code in PR #41180: URL: https://github.com/apache/arrow/pull/41180#discussion_r1569077263 ## docs/source/format/DissociatedIPC.rst: ## @@ -0,0 +1,335 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license

Re: [PR] GH-41179: [Docs] Documentation for Dissociated IPC Protocol [arrow]

2024-04-17 Thread via GitHub
pitrou commented on code in PR #41180: URL: https://github.com/apache/arrow/pull/41180#discussion_r1569077805 ## docs/source/format/DissociatedIPC.rst: ## @@ -0,0 +1,335 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license

Re: [PR] feat: `DataFrame` supports unnesting multiple columns [arrow-datafusion]

2024-04-17 Thread via GitHub
jonahgao commented on code in PR #10118: URL: https://github.com/apache/arrow-datafusion/pull/10118#discussion_r1569078468 ## datafusion/sqllogictest/test_files/unnest.slt: ## @@ -383,5 +383,23 @@ select unnest(array_remove(column1, 3)) - 1 as c1, column3 from unnest_table;

Re: [PR] Refactor `UnwrapCastInComparison` to remove `Expr` clones [arrow-datafusion]

2024-04-17 Thread via GitHub
peter-toth commented on code in PR #10115: URL: https://github.com/apache/arrow-datafusion/pull/10115#discussion_r1569102800 ## datafusion/optimizer/src/unwrap_cast_in_comparison.rs: ## @@ -140,140 +139,132 @@ struct UnwrapCastExprRewriter { impl TreeNodeRewriter for

Re: [PR] Refactor `UnwrapCastInComparison` to remove `Expr` clones [arrow-datafusion]

2024-04-17 Thread via GitHub
peter-toth commented on code in PR #10115: URL: https://github.com/apache/arrow-datafusion/pull/10115#discussion_r1569103777 ## datafusion/optimizer/src/unwrap_cast_in_comparison.rs: ## @@ -140,140 +139,132 @@ struct UnwrapCastExprRewriter { impl TreeNodeRewriter for

Re: [PR] GH-41186: [C++][Parquet][Doc] Denote PARQUET:field_id in parquet.rst [arrow]

2024-04-17 Thread via GitHub
mapleFU commented on code in PR #41187: URL: https://github.com/apache/arrow/pull/41187#discussion_r1569103822 ## docs/source/cpp/parquet.rst: ## @@ -571,6 +571,19 @@ More specifically, Parquet C++ supports: * EncryptionWithFooterKey and EncryptionWithColumnKey modes. *

Re: [PR] feat: `DataFrame` supports unnesting multiple columns [arrow-datafusion]

2024-04-17 Thread via GitHub
jonahgao commented on code in PR #10118: URL: https://github.com/apache/arrow-datafusion/pull/10118#discussion_r1569104248 ## datafusion/core/tests/dataframe/mod.rs: ## @@ -1437,6 +1438,91 @@ async fn unnest_analyze_metrics() -> Result<()> { Ok(()) } + +#[tokio::test]

Re: [PR] feat(rust): add public abstract API and dummy driver implementation [arrow-adbc]

2024-04-17 Thread via GitHub
lidavidm commented on code in PR #1725: URL: https://github.com/apache/arrow-adbc/pull/1725#discussion_r1568835003 ## .github/workflows/rust.yml: ## @@ -38,7 +38,7 @@ permissions: defaults: run: -working-directory: rust +working-directory: rust2 Review Comment:

Re: [PR] GH-40964: [CI][Archery] Archery linking should also check for undefined symbols Linux [arrow]

2024-04-17 Thread via GitHub
pitrou commented on code in PR #40520: URL: https://github.com/apache/arrow/pull/40520#discussion_r1568835195 ## dev/archery/archery/linking.py: ## @@ -61,9 +63,83 @@ def list_dependency_names(self): names.append(name) return names +def

Re: [PR] feat(rust): add public abstract API and dummy driver implementation [arrow-adbc]

2024-04-17 Thread via GitHub
lidavidm commented on code in PR #1725: URL: https://github.com/apache/arrow-adbc/pull/1725#discussion_r1568835401 ## .github/workflows/rust.yml: ## @@ -38,7 +38,7 @@ permissions: defaults: run: -working-directory: rust +working-directory: rust2 Review Comment:

Re: [PR] GH-41256: [Format][Docs] Add a canonical extension type specification for a generic text data format (e.g. JSON) [arrow]

2024-04-17 Thread via GitHub
github-actions[bot] commented on PR #41257: URL: https://github.com/apache/arrow/pull/41257#issuecomment-2061267907 :warning: GitHub issue #41256 **has been automatically assigned in GitHub** to PR creator. -- This is an automated message from the Apache Git Service. To respond to the

Re: [I] csharp/src/Drivers/Apache: Rename this project and add NuGet creation [arrow-adbc]

2024-04-17 Thread via GitHub
davidhcoe commented on issue #1726: URL: https://github.com/apache/arrow-adbc/issues/1726#issuecomment-2061276601 I would actually prefer they each have their own names: - Apache.Arrow.Adbc.Drivers.Hive - Apache.Arrow.Adbc.Drivers.Impala - Apache.Arrow.Adbc.Drivers.Spark

Re: [I] IPC code writes data with insufficient alignment [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5553: URL: https://github.com/apache/arrow-rs/issues/5553#issuecomment-2061309191 `label_issue.py` automatically added labels {'arrow'} from #5554 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [I] IPC code writes data with insufficient alignment [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5553: URL: https://github.com/apache/arrow-rs/issues/5553#issuecomment-2061309234 `label_issue.py` automatically added labels {'arrow-flight'} from #5554 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [I] Add scientific notation decimal parsing in `parse_decimal` [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5549: URL: https://github.com/apache/arrow-rs/issues/5549#issuecomment-2061309111 `label_issue.py` automatically added labels {'arrow'} from #5611 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [I] Arrow Flight format support for `StringViewArray` and `BinaryViewArray` [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5507: URL: https://github.com/apache/arrow-rs/issues/5507#issuecomment-2061308616 `label_issue.py` automatically added labels {'arrow'} from #5481 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [I] `filter` kernel support for `StringViewArray` and `BinaryViewArray` [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5510: URL: https://github.com/apache/arrow-rs/issues/5510#issuecomment-2061308714 `label_issue.py` automatically added labels {'arrow'} from #5481 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [I] `take` kernel support for `StringViewArray` and `BinaryViewArray` [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5511: URL: https://github.com/apache/arrow-rs/issues/5511#issuecomment-2061308783 `label_issue.py` automatically added labels {'arrow'} from #5602 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [I] Local object store copy/rename with nonexistent `from` file loops forever instead of erroring [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5503: URL: https://github.com/apache/arrow-rs/issues/5503#issuecomment-2061308538 `label_issue.py` automatically added labels {'object-store'} from #5528 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [I] object_store: allow setting content-type per request [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5329: URL: https://github.com/apache/arrow-rs/issues/5329#issuecomment-2061307791 `label_issue.py` automatically added labels {'object-store'} from #5650 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [I] Add a mutable builder for `Offset`s [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5384: URL: https://github.com/apache/arrow-rs/issues/5384#issuecomment-2061308086 `label_issue.py` automatically added labels {'arrow'} from #5440 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] GH-41102: [Packaging][Release] Create unique git tags for release candidates (e.g. apache-arrow-{MAJOR}.{MINOR}.{PATCH}-rc{RC_NUM}) [arrow]

2024-04-17 Thread via GitHub
sgilmore10 commented on code in PR #41131: URL: https://github.com/apache/arrow/pull/41131#discussion_r1568893669 ## dev/release/02-source.sh: ## @@ -25,6 +25,7 @@ set -eu : ${SOURCE_UPLOAD:=${SOURCE_DEFAULT}} : ${SOURCE_PR:=${SOURCE_DEFAULT}} :

Re: [PR] Prepare object_store 0.10.0 [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on PR #5658: URL: https://github.com/apache/arrow-rs/pull/5658#issuecomment-2061371238 DF PR - https://github.com/apache/arrow-datafusion/pull/10116 I've also confirmed that the parquet crate compiles against this version of object_store without issue -- This is

[PR] improve monotonicity api [arrow-datafusion]

2024-04-17 Thread via GitHub
tinfoil-knight opened a new pull request, #10117: URL: https://github.com/apache/arrow-datafusion/pull/10117 ## Which issue does this PR close? Closes #9879. ## Rationale for this change The `Vec<Option<bool>>` type used to express the monotonicity of scalar

Re: [PR] GH-40997: [C++] Get null_bit_id according to are_cols_in_encoding_order in NullUpdateColumnToRow_avx2 [arrow]

2024-04-17 Thread via GitHub
zanmato1984 commented on code in PR #40998: URL: https://github.com/apache/arrow/pull/40998#discussion_r1568981056 ## cpp/src/arrow/compute/row/row_internal.cc: ## @@ -68,6 +68,10 @@ void RowTableMetadata::FromColumnMetadataVector( // For the varying-length column, the

Re: [PR] GH-34785: [C++][Parquet] Parquet Bloom Filter Writer Implementation [arrow]

2024-04-17 Thread via GitHub
wgtmac commented on code in PR #37400: URL: https://github.com/apache/arrow/pull/37400#discussion_r156984 ## cpp/src/parquet/bloom_filter_builder.cc: ## @@ -0,0 +1,155 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license

Re: [PR] feat(rust): add public abstract API and dummy driver implementation [arrow-adbc]

2024-04-17 Thread via GitHub
lidavidm commented on code in PR #1725: URL: https://github.com/apache/arrow-adbc/pull/1725#discussion_r1569011077 ## rust2/core/src/lib.rs: ## @@ -0,0 +1,520 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the

Re: [PR] feat(rust): add public abstract API and dummy driver implementation [arrow-adbc]

2024-04-17 Thread via GitHub
alexandreyc commented on code in PR #1725: URL: https://github.com/apache/arrow-adbc/pull/1725#discussion_r1569003762 ## rust2/core/src/lib.rs: ## @@ -0,0 +1,520 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See
