[GitHub] [arrow] kou commented on pull request #33751: WIP: [Release] Verify release-11.0.0-rc0

2023-01-18 Thread GitBox
kou commented on PR #33751: URL: https://github.com/apache/arrow/pull/33751#issuecomment-1396262181 I re-ran the failed source verification jobs on macOS arm64. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [arrow] kou merged pull request #33755: GH-33754: [CI] Install brewfile dependencies for verification task jobs on M1

2023-01-18 Thread GitBox
kou merged PR #33755: URL: https://github.com/apache/arrow/pull/33755

[GitHub] [arrow] emkornfield commented on a diff in pull request #17877: PARQUET-2225:[C++][Parquet] Allow reading dense with RecordReader

2023-01-18 Thread GitBox
emkornfield commented on code in PR #17877: URL: https://github.com/apache/arrow/pull/17877#discussion_r1080689054 ## cpp/src/parquet/column_reader.cc: ## @@ -1829,21 +1831,34 @@ class TypedRecordReader : public TypedColumnReaderImpl, int64_t null_count = 0; if

[GitHub] [arrow] emkornfield commented on a diff in pull request #17877: PARQUET-2225:[C++][Parquet] Allow reading dense with RecordReader

2023-01-18 Thread GitBox
emkornfield commented on code in PR #17877: URL: https://github.com/apache/arrow/pull/17877#discussion_r1080687189 ## cpp/src/parquet/column_reader.cc: ## @@ -1829,21 +1831,34 @@ class TypedRecordReader : public TypedColumnReaderImpl, int64_t null_count = 0; if

[GitHub] [arrow] emkornfield commented on a diff in pull request #17877: PARQUET-2225:[C++][Parquet] Allow reading dense with RecordReader

2023-01-18 Thread GitBox
emkornfield commented on code in PR #17877: URL: https://github.com/apache/arrow/pull/17877#discussion_r1080681488 ## cpp/src/parquet/column_reader.h: ## @@ -367,6 +368,7 @@ class PARQUET_EXPORT RecordReader { uint8_t* values() const { return values_->mutable_data(); }

[GitHub] [arrow] emkornfield commented on a diff in pull request #17877: PARQUET-2225:[C++][Parquet] Allow reading dense with RecordReader

2023-01-18 Thread GitBox
emkornfield commented on code in PR #17877: URL: https://github.com/apache/arrow/pull/17877#discussion_r1080679491 ## cpp/src/parquet/column_reader.h: ## @@ -311,13 +311,14 @@ class PARQUET_EXPORT RecordReader { static std::shared_ptr Make( Review Comment: please add

[GitHub] [arrow-datafusion-python] andygrove opened a new issue, #134: Add script for Python linting

2023-01-18 Thread GitBox
andygrove opened a new issue, #134: URL: https://github.com/apache/arrow-datafusion-python/issues/134 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** CI runs Python linters and fails the build if formatting is incorrect.

[GitHub] [arrow-julia] quinnj commented on issue #186: Support Arrow Flight RPC

2023-01-18 Thread GitBox
quinnj commented on issue #186: URL: https://github.com/apache/arrow-julia/issues/186#issuecomment-1396239024 Yes, @Drvi is my colleague and we worked on the ProtoBuf.jl rewrite together.

[GitHub] [arrow-datafusion-python] andygrove commented on pull request #115: Upgrade to DataFusion 16.0.0

2023-01-18 Thread GitBox
andygrove commented on PR #115: URL: https://github.com/apache/arrow-datafusion-python/pull/115#issuecomment-1396233523 @francis-du fyi

[GitHub] [arrow] minyoung commented on pull request #14989: ARROW-18438: [Go][Parquet] Panic in bitmap writer

2023-01-18 Thread GitBox
minyoung commented on PR #14989: URL: https://github.com/apache/arrow/pull/14989#issuecomment-1396232056 @zeroshade you are correct about the first test case (my bad about that). What about the second test case where everything is marked as nullable though?

[GitHub] [arrow] github-actions[bot] commented on pull request #33772: GH-15137: [C++][CI] ASAN error in streaming JSON reader tests

2023-01-18 Thread GitBox
github-actions[bot] commented on PR #33772: URL: https://github.com/apache/arrow/pull/33772#issuecomment-1396228962 :warning: GitHub issue #15137 **has been automatically assigned in GitHub** to PR creator.

[GitHub] [arrow] github-actions[bot] commented on pull request #33772: GH-15137: [C++][CI] ASAN error in streaming JSON reader tests

2023-01-18 Thread GitBox
github-actions[bot] commented on PR #33772: URL: https://github.com/apache/arrow/pull/33772#issuecomment-1396228918 * Closes: #15137

[GitHub] [arrow] benibus opened a new pull request, #33772: GH-15137: [C++][CI] ASAN error in streaming JSON reader tests

2023-01-18 Thread GitBox
benibus opened a new pull request, #33772: URL: https://github.com/apache/arrow/pull/33772 The input streams passed to the reader weren't properly taking ownership of their test strings, despite the stream (potentially) outliving the test's scope.

[GitHub] [arrow-datafusion-python] andygrove commented on pull request #115: Upgrade to DataFusion 16.0.0

2023-01-18 Thread GitBox
andygrove commented on PR #115: URL: https://github.com/apache/arrow-datafusion-python/pull/115#issuecomment-1396228300 @jdye64 I have now fixed all the regressions.

[GitHub] [arrow] ursabot commented on pull request #33718: GH-33717: [Go] Flight SQL Server handle StreamChunk errors

2023-01-18 Thread GitBox
ursabot commented on PR #33718: URL: https://github.com/apache/arrow/pull/33718#issuecomment-1396221935 Benchmark runs are scheduled for baseline = 98da8191242f22d3de225a08c387f5b150bb7a5c and contender = 85a111f8d5cef3a668c9cf8c47ccb943048e50f6. 85a111f8d5cef3a668c9cf8c47ccb943048e50f6

[GitHub] [arrow-datafusion] avantgardnerio commented on a diff in pull request #4977: Infer values for inserts

2023-01-18 Thread GitBox
avantgardnerio commented on code in PR #4977: URL: https://github.com/apache/arrow-datafusion/pull/4977#discussion_r1080658490 ## datafusion/sql/tests/integration_test.rs: ## @@ -3390,6 +3390,75 @@ Dml: op=[Update] table=[person]

[GitHub] [arrow] felipecrv commented on a diff in pull request #15083: GH-33566: [C++] Add support for nullary and n-ary aggregate functions

2023-01-18 Thread GitBox
felipecrv commented on code in PR #15083: URL: https://github.com/apache/arrow/pull/15083#discussion_r1080657470 ## python/pyarrow/_compute.pyx: ## @@ -2202,12 +2202,18 @@ def _group_by(args, keys, aggregations): _pack_compute_args(args, _args)

[GitHub] [arrow] felipecrv commented on a diff in pull request #15083: GH-33566: [C++] Add support for nullary and n-ary aggregate functions

2023-01-18 Thread GitBox
felipecrv commented on code in PR #15083: URL: https://github.com/apache/arrow/pull/15083#discussion_r1080656608 ## python/pyarrow/table.pxi: ## @@ -5358,36 +5358,45 @@ list[tuple(str, str, FunctionOptions)] values_sum: [[3,7,5]] keys:

[GitHub] [arrow-datafusion] avantgardnerio commented on a diff in pull request #4977: Infer values for inserts

2023-01-18 Thread GitBox
avantgardnerio commented on code in PR #4977: URL: https://github.com/apache/arrow-datafusion/pull/4977#discussion_r1080656176 ## datafusion/sql/tests/integration_test.rs: ## @@ -3390,6 +3390,45 @@ Dml: op=[Update] table=[person]

[GitHub] [arrow] felipecrv commented on a diff in pull request #15083: GH-33566: [C++] Add support for nullary and n-ary aggregate functions

2023-01-18 Thread GitBox
felipecrv commented on code in PR #15083: URL: https://github.com/apache/arrow/pull/15083#discussion_r1080654731 ## python/pyarrow/table.pxi: ## @@ -5334,7 +5334,7 @@ class TableGroupBy: Parameters -- aggregations : list[tuple(str, str)] or \

[GitHub] [arrow-rs] comphead commented on issue #3547: Arrow CSV writer should not fail when cannot cast the value

2023-01-18 Thread GitBox
comphead commented on issue #3547: URL: https://github.com/apache/arrow-rs/issues/3547#issuecomment-1396210364 @tustvold @alamb please assign this to me

[GitHub] [arrow-rs] alamb commented on pull request #3514: Use array_value_to_string in arrow-csv

2023-01-18 Thread GitBox
alamb commented on PR #3514: URL: https://github.com/apache/arrow-rs/pull/3514#issuecomment-1396204210 Thanks @JayjeetAtGithub -- I plan to take a look tomorrow

[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #4977: Infer values for updates

2023-01-18 Thread GitBox
andygrove commented on code in PR #4977: URL: https://github.com/apache/arrow-datafusion/pull/4977#discussion_r1080641201 ## datafusion/sql/tests/integration_test.rs: ## @@ -3390,6 +3390,45 @@ Dml: op=[Update] table=[person] prepare_stmt_replace_params_quick_test(plan,

[GitHub] [arrow] rok commented on pull request #33731: GH-15231: [C++][Benchmarking] Add new memory pool metrics and track in benchmarks

2023-01-18 Thread GitBox
rok commented on PR #33731: URL: https://github.com/apache/arrow/pull/33731#issuecomment-1396182972 > @rok you mean expose in the benchmark output? We have a limited number of fields there. If we'd want it we'd probably want a separate benchmark altogether. It would probably give

[GitHub] [arrow] otegami commented on issue #33749: [Ruby] Add Arrow::RecordBatch#each_raw_record

2023-01-18 Thread GitBox
otegami commented on issue #33749: URL: https://github.com/apache/arrow/issues/33749#issuecomment-1396180034 Thank you for reviewing it. I've just fixed it.

[GitHub] [arrow] otegami commented on issue #33750: [GLib] Add chunk_size property to GArrowTableBatchReader

2023-01-18 Thread GitBox
otegami commented on issue #33750: URL: https://github.com/apache/arrow/issues/33750#issuecomment-1396176291 Thank you for reviewing it. I've just fixed it.

[GitHub] [arrow-datafusion] melgenek commented on pull request #4834: (#4462) Postgres compatibility tests using sqllogictest

2023-01-18 Thread GitBox
melgenek commented on PR #4834: URL: https://github.com/apache/arrow-datafusion/pull/4834#issuecomment-1396153997 @alamb The PR is updated according to your previous review: - results are defined explicitly - Postgres runs only a subset of test files prefixed with `pg_compat_` -

[GitHub] [arrow-datafusion] avantgardnerio opened a new pull request, #4977: Infer values for updates

2023-01-18 Thread GitBox
avantgardnerio opened a new pull request, #4977: URL: https://github.com/apache/arrow-datafusion/pull/4977 # Which issue does this PR close? Closes #4976. # Rationale for this change Described in issue. # What changes are included in this PR? Type inference

[GitHub] [arrow-datafusion] melgenek commented on a diff in pull request #4834: (#4462) Postgres compatibility tests using sqllogictest

2023-01-18 Thread GitBox
melgenek commented on code in PR #4834: URL: https://github.com/apache/arrow-datafusion/pull/4834#discussion_r1080604136 ## datafusion/core/tests/sqllogictests/postgres/test_files/simple_except.slt: ## @@ -0,0 +1,27 @@ +# Licensed to the Apache Software Foundation (ASF) under

[GitHub] [arrow-datafusion] melgenek commented on a diff in pull request #4834: (#4462) Postgres compatibility tests using sqllogictest

2023-01-18 Thread GitBox
melgenek commented on code in PR #4834: URL: https://github.com/apache/arrow-datafusion/pull/4834#discussion_r1080603376 ## datafusion/core/tests/sqllogictests/postgres/test_files/self_join_with_alias.slt: ## @@ -0,0 +1,24 @@ +# Licensed to the Apache Software Foundation (ASF)

[GitHub] [arrow-datafusion] avantgardnerio opened a new issue, #4976: Infer prepared statement parameter types for insert queries with values clauses

2023-01-18 Thread GitBox
avantgardnerio opened a new issue, #4976: URL: https://github.com/apache/arrow-datafusion/issues/4976 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** As of recently, we added support to run DML statements, and to infer

[GitHub] [arrow-datafusion] melgenek commented on a diff in pull request #4834: (#4462) Postgres compatibility tests using sqllogictest

2023-01-18 Thread GitBox
melgenek commented on code in PR #4834: URL: https://github.com/apache/arrow-datafusion/pull/4834#discussion_r1080602268 ## datafusion/core/tests/sqllogictests/src/main.rs: ## @@ -15,112 +15,135 @@ // specific language governing permissions and limitations // under the

[GitHub] [arrow-datafusion] melgenek commented on a diff in pull request #4834: (#4462) Postgres compatibility tests using sqllogictest

2023-01-18 Thread GitBox
melgenek commented on code in PR #4834: URL: https://github.com/apache/arrow-datafusion/pull/4834#discussion_r1080601217 ## datafusion/core/tests/sqllogictests/postgres/test_files/simple_window_partition_order_aggregation.slt: ## @@ -0,0 +1,37 @@ +# Licensed to the Apache

[GitHub] [arrow-datafusion] melgenek commented on a diff in pull request #4834: (#4462) Postgres compatibility tests using sqllogictest

2023-01-18 Thread GitBox
melgenek commented on code in PR #4834: URL: https://github.com/apache/arrow-datafusion/pull/4834#discussion_r1080600633 ## datafusion/core/tests/sqllogictests/postgres/test_files/simple_aggregation.slt: ## @@ -0,0 +1,29 @@ +# Licensed to the Apache Software Foundation (ASF)

[GitHub] [arrow-datafusion] melgenek commented on a diff in pull request #4834: (#4462) Postgres compatibility tests using sqllogictest

2023-01-18 Thread GitBox
melgenek commented on code in PR #4834: URL: https://github.com/apache/arrow-datafusion/pull/4834#discussion_r1080579104 ## datafusion/core/Cargo.toml: ## @@ -104,17 +104,22 @@ xz2 = { version = "0.1", optional = true } [dev-dependencies] arrow = { version = "31.0.0",

[GitHub] [arrow-ballista] adriangb commented on issue #173: Add support for Python UDFs in distributed queries

2023-01-18 Thread GitBox
adriangb commented on issue #173: URL: https://github.com/apache/arrow-ballista/issues/173#issuecomment-1396138349 > Where would the HTTP server be hosted? Scheduler? Single Executor process? Multiple Executor processes? An entirely new process? I'd think it'd be very similar to an

[GitHub] [arrow-datafusion-python] jdye64 commented on pull request #115: Upgrade to DataFusion 16.0.0

2023-01-18 Thread GitBox
jdye64 commented on PR #115: URL: https://github.com/apache/arrow-datafusion-python/pull/115#issuecomment-1396127683 Actually I see that `config.rs` also needs some refactoring and updating. That also seems like a heavy fix for this PR however.

[GitHub] [arrow-datafusion] andygrove opened a new pull request, #4975: [maint-16.x] Prep for release

2023-01-18 Thread GitBox
andygrove opened a new pull request, #4975: URL: https://github.com/apache/arrow-datafusion/pull/4975 # Which issue does this PR close? Closes #. # Rationale for this change # What changes are included in this PR? # Are these changes

[GitHub] [arrow] lidavidm merged pull request #33766: MINOR: [C++][Docs] Fix broken URL in C++ Building docs

2023-01-18 Thread GitBox
lidavidm merged PR #33766: URL: https://github.com/apache/arrow/pull/33766

[GitHub] [arrow-datafusion] ursabot commented on pull request #4701: Infer prepared statement parameter types

2023-01-18 Thread GitBox
ursabot commented on PR #4701: URL: https://github.com/apache/arrow-datafusion/pull/4701#issuecomment-1396123254 Benchmark runs are scheduled for baseline = 64fa312ecc5f32294e70fd7389e18cb41f25e732 and contender = e6a050058bd704f73b38106b7abf21dc4539eebc.

[GitHub] [arrow] github-actions[bot] commented on pull request #33770: GH-33760: [R] Push projection expressions into ScanNode

2023-01-18 Thread GitBox
github-actions[bot] commented on PR #33770: URL: https://github.com/apache/arrow/pull/33770#issuecomment-1396123137 :warning: GitHub issue #33760 **has been automatically assigned in GitHub** to PR creator.

[GitHub] [arrow] github-actions[bot] commented on pull request #33770: GH-33760: [R] Push projection expressions into ScanNode

2023-01-18 Thread GitBox
github-actions[bot] commented on PR #33770: URL: https://github.com/apache/arrow/pull/33770#issuecomment-1396123106 * Closes: #33760

[GitHub] [arrow] nealrichardson opened a new pull request, #33770: GH-33760: [R] Push projection expressions into ScanNode

2023-01-18 Thread GitBox
nealrichardson opened a new pull request, #33770: URL: https://github.com/apache/arrow/pull/33770 ### Rationale for this change Followup to https://github.com/apache/arrow/pull/19706/files#r1073391100 with the goal of deleting and simplifying some code. Unfortunately, it does

[GitHub] [arrow] lidavidm commented on issue #33767: [Go] Exported ArrowArrayStream.get_next doesn't handle uninitialized ArrowArrays well

2023-01-18 Thread GitBox
lidavidm commented on issue #33767: URL: https://github.com/apache/arrow/issues/33767#issuecomment-1396121047 @pitrou I noticed this while using PyArrow to import an ArrowArrayStream exported from Go. PyArrow apparently passes in an uninitialized ArrowArray to Go, and Go was assuming it

[GitHub] [arrow] westonpace commented on issue #33759: How to limit the memory consumption of to_batches()

2023-01-18 Thread GitBox
westonpace commented on issue #33759: URL: https://github.com/apache/arrow/issues/33759#issuecomment-1396119442 Which version of pyarrow are you using?

[GitHub] [arrow] wjones127 commented on pull request #33731: GH-15231: [C++][Benchmarking] Add new memory pool metrics and track in benchmarks

2023-01-18 Thread GitBox
wjones127 commented on PR #33731: URL: https://github.com/apache/arrow/pull/33731#issuecomment-1396118796 > Would there be need to expose [jemalloc stats](https://github.com/apache/arrow/blob/1d9366f19e4b9846b33cc0c7bd7941cb5f482d74/cpp/src/arrow/memory_pool_test.cc#L190-L199) as well?

[GitHub] [arrow-datafusion-python] jdye64 commented on pull request #115: Upgrade to DataFusion 16.0.0

2023-01-18 Thread GitBox
jdye64 commented on PR #115: URL: https://github.com/apache/arrow-datafusion-python/pull/115#issuecomment-1396118042 @andygrove given that the remaining test failures preventing this PR from getting merged are all related to `window` logic, and it seems window logic bindings in general

[GitHub] [arrow] wjones127 commented on pull request #33731: GH-15231: [C++][Benchmarking] Add new memory pool metrics and track in benchmarks

2023-01-18 Thread GitBox
wjones127 commented on PR #33731: URL: https://github.com/apache/arrow/pull/33731#issuecomment-1396117917 Local benchmarks confirm this doesn't seem to have a meaningful effect on performance: ```

[GitHub] [arrow-datafusion] avantgardnerio closed issue #4683: Support PREPARE statements without explicit parameters

2023-01-18 Thread GitBox
avantgardnerio closed issue #4683: Support PREPARE statements without explicit parameters URL: https://github.com/apache/arrow-datafusion/issues/4683

[GitHub] [arrow-datafusion] avantgardnerio merged pull request #4701: Infer prepared statement parameter types

2023-01-18 Thread GitBox
avantgardnerio merged PR #4701: URL: https://github.com/apache/arrow-datafusion/pull/4701

[GitHub] [arrow] ziggythehamster commented on issue #33733: [Packaging] Amazon Linux 2 RPMs - openssl-devel cannot coexist with openssl11-devel and breaks installing arrow-devel

2023-01-18 Thread GitBox
ziggythehamster commented on issue #33733: URL: https://github.com/apache/arrow/issues/33733#issuecomment-1396103997 > In your use case, what packages depend on `openssl11-devel`? Several internal packages require it. It would have been better for AL2 to ship the headers in an

[GitHub] [arrow] westonpace commented on pull request #33684: GH-15171: [C++] Pass std::string_view by value

2023-01-18 Thread GitBox
westonpace commented on PR #33684: URL: https://github.com/apache/arrow/pull/33684#issuecomment-1396102377 @ucasfl thanks for cleaning this up!

[GitHub] [arrow] github-actions[bot] commented on pull request #33768: GH-33767: [Go] Clear out parameter in ArrowArrayStream.get_next

2023-01-18 Thread GitBox
github-actions[bot] commented on PR #33768: URL: https://github.com/apache/arrow/pull/33768#issuecomment-1396097331 * Closes: #33767

[GitHub] [arrow] rok commented on pull request #33731: GH-15231: [C++][Benchmarking] Add new memory pool metrics and track in benchmarks

2023-01-18 Thread GitBox
rok commented on PR #33731: URL: https://github.com/apache/arrow/pull/33731#issuecomment-1396081435 Would there be need to expose [jemalloc stats](https://github.com/apache/arrow/blob/1d9366f19e4b9846b33cc0c7bd7941cb5f482d74/cpp/src/arrow/memory_pool_test.cc#L190-L199) as well?

[GitHub] [arrow-rs] ursabot commented on pull request #3557: Update pyo3 requirement from 0.17 to 0.18

2023-01-18 Thread GitBox
ursabot commented on PR #3557: URL: https://github.com/apache/arrow-rs/pull/3557#issuecomment-1396077719 Benchmark runs are scheduled for baseline = 3ae1c728b266c1ba801409eb7f4b901285783e94 and contender = de62808a9d65e052ff3e89550bf780d952c8ceae. de62808a9d65e052ff3e89550bf780d952c8ceae

[GitHub] [arrow-ballista] jdye64 commented on issue #173: Add support for Python UDFs in distributed queries

2023-01-18 Thread GitBox
jdye64 commented on issue #173: URL: https://github.com/apache/arrow-ballista/issues/173#issuecomment-1396077503 I have no hands-on experience with the HTTP UDFs model but am intrigued by the approach. Thank you for laying out your thoughts. Had some thoughts and questions.

[GitHub] [arrow] rok commented on issue #33762: [Dev] Remove Jira support from merge script

2023-01-18 Thread GitBox
rok commented on issue #33762: URL: https://github.com/apache/arrow/issues/33762#issuecomment-1396075486 I'd be happy to do the migration.

[GitHub] [arrow] wjones127 commented on pull request #33731: GH-15231: [C++][Benchmarking] Add new memory pool metrics and track in benchmarks

2023-01-18 Thread GitBox
wjones127 commented on PR #33731: URL: https://github.com/apache/arrow/pull/33731#issuecomment-1396074423 Testing locally, this doesn't seem to have a meaningful performance impact: ```

[GitHub] [arrow-rs] tustvold merged pull request #3557: Update pyo3 requirement from 0.17 to 0.18

2023-01-18 Thread GitBox
tustvold merged PR #3557: URL: https://github.com/apache/arrow-rs/pull/3557

[GitHub] [arrow-datafusion-python] andygrove closed issue #122: support creating arrow-datafusion-python conda environment

2023-01-18 Thread GitBox
andygrove closed issue #122: support creating arrow-datafusion-python conda environment URL: https://github.com/apache/arrow-datafusion-python/issues/122

[GitHub] [arrow-datafusion-python] andygrove merged pull request #124: Introduce conda directory containing datafusion-dev.yaml conda enviro…

2023-01-18 Thread GitBox
andygrove merged PR #124: URL: https://github.com/apache/arrow-datafusion-python/pull/124

[GitHub] [arrow-datafusion] saikrishna1-bidgely commented on pull request #4908: added a method to read multiple locations at the same time.

2023-01-18 Thread GitBox
saikrishna1-bidgely commented on PR #4908: URL: https://github.com/apache/arrow-datafusion/pull/4908#issuecomment-1396064487 > This looks like a nice improvement @saikrishna1-bidgely > > I think we should add a test for this new functionality so that we don't accidentally break the

[GitHub] [arrow-rs] askoa commented on a diff in pull request #3553: feat: Add `RunEndEncodedArray`

2023-01-18 Thread GitBox
askoa commented on code in PR #3553: URL: https://github.com/apache/arrow-rs/pull/3553#discussion_r1080529425 ## arrow-array/src/types.rs: ## @@ -240,6 +240,17 @@ impl ArrowDictionaryKeyType for UInt32Type {} impl ArrowDictionaryKeyType for UInt64Type {} +/// A subtype of

[GitHub] [arrow] ursabot commented on pull request #33680: GH-33679: [JS] Update dependencies

2023-01-18 Thread GitBox
ursabot commented on PR #33680: URL: https://github.com/apache/arrow/pull/33680#issuecomment-1396051505 Benchmark runs are scheduled for baseline = c525b57295e5ab9cb9e2591342d0b01a357660a3 and contender = 98da8191242f22d3de225a08c387f5b150bb7a5c. 98da8191242f22d3de225a08c387f5b150bb7a5c

[GitHub] [arrow] lidavidm commented on issue #32584: [C++][FlightRPC] Fix linking of Flight/gRPC example on MacOS

2023-01-18 Thread GitBox
lidavidm commented on issue #32584: URL: https://github.com/apache/arrow/issues/32584#issuecomment-1387744157 I'm not sure since I wasn't the original reporter. If it's fixed for you, let's close it.

[GitHub] [arrow] amoeba commented on issue #32584: [C++][FlightRPC] Fix linking of Flight/gRPC example on MacOS

2023-01-18 Thread GitBox
amoeba commented on issue #32584: URL: https://github.com/apache/arrow/issues/32584#issuecomment-1387743095 Hi @lidavidm, I think this is no longer an issue since https://github.com/apache/arrow/pull/14077 was merged. This example builds and runs successfully on my macOS laptop and the

[GitHub] [arrow] wjones127 commented on a diff in pull request #33748: GH-33746: [R] Update NEWS.md for 11.0.0

2023-01-18 Thread GitBox
wjones127 commented on code in PR #33748: URL: https://github.com/apache/arrow/pull/33748#discussion_r1074039131 ## r/NEWS.md: ## @@ -19,6 +19,77 @@ # arrow 10.0.1.9000 +## New features + +### Docs + +* A substantial reorganisation, rewrite of and addition to, many of the

[GitHub] [arrow-rs] viirya commented on pull request #3559: [python] Remove FutureWarning

2023-01-18 Thread GitBox
viirya commented on PR #3559: URL: https://github.com/apache/arrow-rs/pull/3559#issuecomment-1387729848 Thank you @changhiskhan

[GitHub] [arrow] amoeba opened a new pull request, #33766: MINOR: [C++][Docs] Fix broken URL in C++ Building docs

2023-01-18 Thread GitBox
amoeba opened a new pull request, #33766: URL: https://github.com/apache/arrow/pull/33766 ### Rationale for this change Prior to this change, a link in [Building Arrow C++](https://arrow.apache.org/docs/developers/cpp/building.html) to the CMake UNITY_BUILD docs 404s due to a typo.

[GitHub] [arrow-rs] changhiskhan closed pull request #3559: [python] Remove FutureWarning

2023-01-18 Thread GitBox
changhiskhan closed pull request #3559: [python] Remove FutureWarning URL: https://github.com/apache/arrow-rs/pull/3559

[GitHub] [arrow-rs] viirya commented on a diff in pull request #3553: feat: Add `RunEndEncodedArray`

2023-01-18 Thread GitBox
viirya commented on code in PR #3553: URL: https://github.com/apache/arrow-rs/pull/3553#discussion_r1074016831 ## arrow-array/src/types.rs: ## @@ -240,6 +240,17 @@ impl ArrowDictionaryKeyType for UInt32Type {} impl ArrowDictionaryKeyType for UInt64Type {} +/// A subtype of

[GitHub] [arrow-rs] askoa commented on a diff in pull request #3553: feat: Add `RunEndEncodedArray`

2023-01-18 Thread GitBox
askoa commented on code in PR #3553: URL: https://github.com/apache/arrow-rs/pull/3553#discussion_r1074013570 ## arrow-array/src/builder/primitive_ree_array_builder.rs: ## @@ -0,0 +1,218 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

[GitHub] [arrow-rs] changhiskhan opened a new pull request, #3559: [python] Remove FutureWarning

2023-01-18 Thread GitBox
changhiskhan opened a new pull request, #3559: URL: https://github.com/apache/arrow-rs/pull/3559 Currently RecordBatch::to_pyarrow passes the Schema in the second arg position which causes a FutureWarning. Instead we change it to use the `schema` kwarg. # Which issue does this PR

[GitHub] [arrow-rs] askoa commented on a diff in pull request #3553: feat: Add `RunEndEncodedArray`

2023-01-18 Thread GitBox
askoa commented on code in PR #3553: URL: https://github.com/apache/arrow-rs/pull/3553#discussion_r1074002982 ## arrow-array/src/array/run_end_encoded_array.rs: ## @@ -0,0 +1,518 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license

[GitHub] [arrow-rs] askoa commented on a diff in pull request #3553: feat: Add `RunEndEncodedArray`

2023-01-18 Thread GitBox
askoa commented on code in PR #3553: URL: https://github.com/apache/arrow-rs/pull/3553#discussion_r1074002717 ## arrow-array/src/types.rs: ## @@ -240,6 +240,17 @@ impl ArrowDictionaryKeyType for UInt32Type {} impl ArrowDictionaryKeyType for UInt64Type {} +/// A subtype of

[GitHub] [arrow-rs] tustvold commented on a diff in pull request #3558: Re-encode dictionaries in selection kernels

2023-01-18 Thread GitBox
tustvold commented on code in PR #3558: URL: https://github.com/apache/arrow-rs/pull/3558#discussion_r1073983917 ## arrow-select/src/dictionary.rs: ## @@ -0,0 +1,171 @@ +use crate::interleave::interleave; +use arrow_array::builder::BooleanBufferBuilder; +use

[GitHub] [arrow-rs] tustvold opened a new pull request, #3558: Re-encode dictionaries in selection kernels

2023-01-18 Thread GitBox
tustvold opened a new pull request, #3558: URL: https://github.com/apache/arrow-rs/pull/3558 _Need to benchmark and integrate into interleave kernel, but creating as draft for visibility_ # Which issue does this PR close? Closes #506 Relates to #2832 #

[GitHub] [arrow-rs] tustvold closed issue #347: Reduce memory of concat kernel

2023-01-18 Thread GitBox
tustvold closed issue #347: Reduce memory of concat kernel URL: https://github.com/apache/arrow-rs/issues/347 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe,

[GitHub] [arrow] nealrichardson commented on issue #33758: SparkR Arrow "Hello World" Error: 'write_arrow' is not an exported object from 'namespace:arrow'

2023-01-18 Thread GitBox
nealrichardson commented on issue #33758: URL: https://github.com/apache/arrow/issues/33758#issuecomment-1387643701 > Arrow 8.0.0 works. > > Are you stating the issue lies within SparkR and I should engage them? Looks like it has [already been

[GitHub] [arrow-datafusion] jdye64 commented on pull request #4892: refactor and add simple function to deserialize and serialize proto b…

2023-01-18 Thread GitBox
jdye64 commented on PR #4892: URL: https://github.com/apache/arrow-datafusion/pull/4892#issuecomment-1387641133 @andygrove does this look ok to you after the changes to `DataFusionError::Substrait`? -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [arrow-datafusion-python] jdye64 commented on issue #132: Add Python bindings for substrait module

2023-01-18 Thread GitBox
jdye64 commented on issue #132: URL: https://github.com/apache/arrow-datafusion-python/issues/132#issuecomment-1387623270 Depends on https://github.com/apache/arrow-datafusion-python/pull/115 since substrait was not added until the DataFusion 16.0.0 release. -- This is an automated

[GitHub] [arrow-datafusion-python] jdye64 commented on issue #133: Bindings for CSV/JSON compression support

2023-01-18 Thread GitBox
jdye64 commented on issue #133: URL: https://github.com/apache/arrow-datafusion-python/issues/133#issuecomment-1387622417 Depends on https://github.com/apache/arrow-datafusion-python/pull/115 since `CSVReadOptions` are not present until version 16.0.0 -- This is an automated message

[GitHub] [arrow-datafusion-python] jdye64 opened a new issue, #133: Bindings for CSV/JSON compression support

2023-01-18 Thread GitBox
jdye64 opened a new issue, #133: URL: https://github.com/apache/arrow-datafusion-python/issues/133 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Support for reading compressed CSV/JSON files was recently added to

[GitHub] [arrow] cyborne100 commented on issue #33758: SparkR Arrow "Hello World" Error: 'write_arrow' is not an exported object from 'namespace:arrow'

2023-01-18 Thread GitBox
cyborne100 commented on issue #33758: URL: https://github.com/apache/arrow/issues/33758#issuecomment-1387609699 Arrow 8.0.0 works. Are you stating the issue lies within SparkR and I should engage them? Was SparkR archived or moved to Apache Spark proper

[GitHub] [arrow] assignUser commented on issue #33762: [Dev] Remove Jira support from merge script

2023-01-18 Thread GitBox
assignUser commented on issue #33762: URL: https://github.com/apache/arrow/issues/33762#issuecomment-1387592504 Maybe @pitrou can start a lazy consensus thread on the parquet mailing list about that? -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [arrow-rs] dependabot[bot] commented on pull request #3552: Update pyo3 requirement from 0.17 to 0.18

2023-01-18 Thread GitBox
dependabot[bot] commented on PR #3552: URL: https://github.com/apache/arrow-rs/pull/3552#issuecomment-1387581564 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version,

[GitHub] [arrow-rs] viirya closed pull request #3552: Update pyo3 requirement from 0.17 to 0.18

2023-01-18 Thread GitBox
viirya closed pull request #3552: Update pyo3 requirement from 0.17 to 0.18 URL: https://github.com/apache/arrow-rs/pull/3552 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [arrow-rs] alamb commented on issue #36: Workaround for writing Cargo.lock to read-only mounted source directory in docker-compose

2023-01-18 Thread GitBox
alamb commented on issue #36: URL: https://github.com/apache/arrow-rs/issues/36#issuecomment-1387576999 I am not sure what this ticket is referring to so let's close it -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [arrow-rs] alamb closed issue #36: Workaround for writing Cargo.lock to read-only mounted source directory in docker-compose

2023-01-18 Thread GitBox
alamb closed issue #36: Workaround for writing Cargo.lock to read-only mounted source directory in docker-compose URL: https://github.com/apache/arrow-rs/issues/36 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [arrow] nealrichardson commented on issue #33762: [Dev] Remove Jira support from merge script

2023-01-18 Thread GitBox
nealrichardson commented on issue #33762: URL: https://github.com/apache/arrow/issues/33762#issuecomment-1387555152 > Would love to get rid of all of that but we still have ~5 PARQUET issues per release from JIRA. IMO we should just get rid of that too, I think most of the parquet issues

[GitHub] [arrow-rs] viirya opened a new pull request, #3557: Update pyo3 requirement from 0.17 to 0.18

2023-01-18 Thread GitBox
viirya opened a new pull request, #3557: URL: https://github.com/apache/arrow-rs/pull/3557 # Which issue does this PR close? Closes #. # Rationale for this change # What changes are included in this PR? # Are there any user-facing changes?

[GitHub] [arrow] jacobmarble commented on pull request #33716: WIP: DO NOT MERGE: Apache Arrow Flight SQL adapter for PostgreSQL plan

2023-01-18 Thread GitBox
jacobmarble commented on PR #33716: URL: https://github.com/apache/arrow/pull/33716#issuecomment-1387541762 I'm also curious about use cases. Deployed [PostgreSQL FE/BE](https://www.postgresql.org/docs/15/protocol.html) clients number in the zillions, so there would seem to be more

[GitHub] [arrow] assignUser commented on issue #33762: [Dev] Remove Jira support from merge script

2023-01-18 Thread GitBox
assignUser commented on issue #33762: URL: https://github.com/apache/arrow/issues/33762#issuecomment-1387536040 Would love to get rid of all of that but we still have ~5 PARQUET issues per release from JIRA. IMO we should just get rid of that too, I think most of the parquet issues are

[GitHub] [arrow-datafusion] ursabot commented on pull request #4945: Minor: Reduce even more redundancy creating window_agg in sort_enforcement tests

2023-01-18 Thread GitBox
ursabot commented on PR #4945: URL: https://github.com/apache/arrow-datafusion/pull/4945#issuecomment-1387534173 Benchmark runs are scheduled for baseline = 896fd3f8f0fec61a699b7f883f5508dc5658fc96 and contender = 64fa312ecc5f32294e70fd7389e18cb41f25e732.

[GitHub] [arrow-datafusion] ursabot commented on pull request #4971: Add DataFusionError::Substrait variant to DataFusionError enum

2023-01-18 Thread GitBox
ursabot commented on PR #4971: URL: https://github.com/apache/arrow-datafusion/pull/4971#issuecomment-1387534128 Benchmark runs are scheduled for baseline = 8a34fe13927b5882fe2affdefa80d332877ca97b and contender = 896fd3f8f0fec61a699b7f883f5508dc5658fc96.

[GitHub] [arrow-datafusion] ursabot commented on pull request #4942: fix: `FieldNotFound` error message without valid fields

2023-01-18 Thread GitBox
ursabot commented on PR #4942: URL: https://github.com/apache/arrow-datafusion/pull/4942#issuecomment-1387534089 Benchmark runs are scheduled for baseline = ef4ca7e8f0cda9d0c5fe73381473a2e3a989c601 and contender = 8a34fe13927b5882fe2affdefa80d332877ca97b.

[GitHub] [arrow-datafusion] ursabot commented on pull request #4916: Improve documentation for ExprVisitor, port simple uses to new walking function

2023-01-18 Thread GitBox
ursabot commented on PR #4916: URL: https://github.com/apache/arrow-datafusion/pull/4916#issuecomment-1387534044 Benchmark runs are scheduled for baseline = ba9fc129b11fe08dd2be98a4cd7915d230e29488 and contender = ef4ca7e8f0cda9d0c5fe73381473a2e3a989c601.

[GitHub] [arrow] github-actions[bot] commented on pull request #33764: GH-15109: [Python] Allow creation of non empty struct array with zero field

2023-01-18 Thread GitBox
github-actions[bot] commented on PR #33764: URL: https://github.com/apache/arrow/pull/33764#issuecomment-1387533188 :warning: GitHub issue #15109 **has been automatically assigned in GitHub** to PR creator. -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [arrow] github-actions[bot] commented on pull request #33764: GH-15109: [Python] Allow creation of non empty struct array with zero field

2023-01-18 Thread GitBox
github-actions[bot] commented on PR #33764: URL: https://github.com/apache/arrow/pull/33764#issuecomment-1387533113 * Closes: #15109 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [arrow] 0x26res opened a new pull request, #33764: GH-15109: [Python] Allow creation of non empty struct array with zero field

2023-01-18 Thread GitBox
0x26res opened a new pull request, #33764: URL: https://github.com/apache/arrow/pull/33764 @jorisvandenbossche - I made minor changes to the C++ code, I couldn't help myself - I've followed your suggestion, I've used `{}` to represent an empty struct python value, instead of `[]`.
