[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #4992: Implement `std::error::Error::source()` for `DataFusionError`, make `DataFusionError::find_root` more generic

2023-01-19 Thread GitBox
alamb commented on code in PR #4992: URL: https://github.com/apache/arrow-datafusion/pull/4992#discussion_r1081668860 ## datafusion/common/src/error.rs: ## @@ -359,84 +380,49 @@ impl DataFusionError { /// /// This may be the same as `self`. pub fn find_root(&self)

[GitHub] [arrow-datafusion] alamb opened a new pull request, #4992: Implement `std::error::Error::source()` for `DataFusionError`, make `DataFusionError::find_root` more generic

2023-01-19 Thread GitBox
alamb opened a new pull request, #4992: URL: https://github.com/apache/arrow-datafusion/pull/4992 # Which issue does this PR close? Closes https://github.com/apache/arrow-datafusion/issues/4991 # Rationale for this change In IOx (and in DataFusion) we often want to know w

[GitHub] [arrow] pitrou commented on issue #15054: [CI][Python] wheel-manylinux2014-* sometimes crashed on pytest exit

2023-01-19 Thread GitBox
pitrou commented on issue #15054: URL: https://github.com/apache/arrow/issues/15054#issuecomment-1397431670 Note original problem: "there is a global static RegionResolver that has an S3Client and that S3Client is being destroyed after S3 has already been destroyed". So the static RegionRes

[GitHub] [arrow] pitrou commented on issue #15054: [CI][Python] wheel-manylinux2014-* sometimes crashed on pytest exit

2023-01-19 Thread GitBox
pitrou commented on issue #15054: URL: https://github.com/apache/arrow/issues/15054#issuecomment-1397430555 Wouldn't the problem be basically the same? Depending on static destruction order, some structures inside AWS SDK may already have been finalized? -- This is an automated message fr

[GitHub] [arrow] westonpace commented on issue #15054: [CI][Python] wheel-manylinux2014-* sometimes crashed on pytest exit

2023-01-19 Thread GitBox
westonpace commented on issue #15054: URL: https://github.com/apache/arrow/issues/15054#issuecomment-1397429004 Perhaps we can add another static storage object that calls EnsureFinalized?

[GitHub] [arrow-adbc] zeroshade commented on a diff in pull request #355: feat(python): add Flight SQL driver using Go library

2023-01-19 Thread GitBox
zeroshade commented on code in PR #355: URL: https://github.com/apache/arrow-adbc/pull/355#discussion_r1081663877 ## go/adbc/driver/flightsql/flightsql_adbc.go: ## @@ -116,22 +122,32 @@ type database struct { func (d *database) SetOptions(cnOptions map[string]string) error {

[GitHub] [arrow] pitrou commented on issue #15054: [CI][Python] wheel-manylinux2014-* sometimes crashed on pytest exit

2023-01-19 Thread GitBox
pitrou commented on issue #15054: URL: https://github.com/apache/arrow/issues/15054#issuecomment-1397427907 Ah! No, a C `atexit` hook would probably not work. I'm talking about a [Python `atexit` hook](https://docs.python.org/3/library/atexit.html). However, there is the flipped issue that
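
The Python-level `atexit` approach mentioned in this comment can be sketched roughly as follows. This is a minimal illustration, not the fix adopted in Arrow: `FakeS3Subsystem` and its `finalize` method are hypothetical stand-ins for a native subsystem that needs explicit shutdown, and only `atexit.register` is a real standard-library call.

```python
import atexit


class FakeS3Subsystem:
    """Hypothetical stand-in for a native subsystem that must be shut down explicitly."""

    def __init__(self):
        self.finalized = False
        self.events = []

    def finalize(self):
        # Idempotent: a second call (for example, an explicit call followed by
        # the atexit hook) does nothing.
        if not self.finalized:
            self.finalized = True
            self.events.append("finalized")


subsystem = FakeS3Subsystem()

# Python runs registered handlers during normal interpreter shutdown, which
# typically happens before the C/C++ runtime tears down static objects.
atexit.register(subsystem.finalize)
```

Making the finalizer idempotent lets users call it explicitly while the hook remains a safety net; whether this ordering is sufficient in every embedding scenario is exactly what the thread is debating.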

[GitHub] [arrow-adbc] zeroshade commented on a diff in pull request #355: feat(python): add Flight SQL driver using Go library

2023-01-19 Thread GitBox
zeroshade commented on code in PR #355: URL: https://github.com/apache/arrow-adbc/pull/355#discussion_r1081661077 ## go/adbc/driver/flightsql/flightsql_adbc.go: ## @@ -103,6 +103,12 @@ func (d Driver) NewDatabase(opts map[string]string) (adbc.Database, error) {

[GitHub] [arrow-adbc] zeroshade commented on a diff in pull request #355: feat(python): add Flight SQL driver using Go library

2023-01-19 Thread GitBox
zeroshade commented on code in PR #355: URL: https://github.com/apache/arrow-adbc/pull/355#discussion_r1081658427 ## ci/scripts/go_build.sh: ## @@ -42,6 +42,13 @@ main() { make all popd + +mkdir -p "${install_dir}/lib" +if [[ $(go env GOOS) ==

[GitHub] [arrow] westonpace commented on issue #15054: [CI][Python] wheel-manylinux2014-* sometimes crashed on pytest exit

2023-01-19 Thread GitBox
westonpace commented on issue #15054: URL: https://github.com/apache/arrow/issues/15054#issuecomment-1397422501 I agree that this isn't a good fix for users. I'm not sure an `atexit` hook would work though because it [won't be called before the static state is destroyed](https://en.cpprefe

[GitHub] [arrow] assignUser commented on issue #33786: [Release][C++] Release verification tasks fail with libxsimd-dev installed on ubuntu 22.04

2023-01-19 Thread GitBox
assignUser commented on issue #33786: URL: https://github.com/apache/arrow/issues/33786#issuecomment-1397421912 Ah yes this seems to be the case as 22.04 has xsimd 7.6.0 and we require 8.1.0: `Unpacking libxsimd-dev:amd64 (7.6.0-2) `

[GitHub] [arrow] assignUser commented on issue #33786: [Release][C++] Release verification tasks fail with libxsimd-dev installed on ubuntu 22.04

2023-01-19 Thread GitBox
assignUser commented on issue #33786: URL: https://github.com/apache/arrow/issues/33786#issuecomment-1397414258 resolve_dependency clearly does not correctly detect the system install (maybe discards it due to version?) and falls back on source build. So it is possible that `Findxsimd.cmake`

[GitHub] [arrow] nealrichardson commented on a diff in pull request #33748: GH-33746: [R] Update NEWS.md for 11.0.0

2023-01-19 Thread GitBox
nealrichardson commented on code in PR #33748: URL: https://github.com/apache/arrow/pull/33748#discussion_r1081651053 ## r/NEWS.md: ## @@ -19,6 +19,94 @@ # arrow 10.0.1.9000 +## Breaking changes + +* `map_batches()` is lazy by default; it now returns a `RecordBatchReader` +

[GitHub] [arrow-rs] tustvold commented on issue #3568: Add rlike string comparison

2023-01-19 Thread GitBox
tustvold commented on issue #3568: URL: https://github.com/apache/arrow-rs/issues/3568#issuecomment-1397408351 FWIW there is regex rewrite logic already in DataFusion that we could crib from - https://github.com/apache/arrow-datafusion/pull/4646 Tbh I'm not sure if this is better hand

[GitHub] [arrow] rok commented on pull request #33776: GH-15164: [C++][Parquet] BloomFilter fixing standard broken

2023-01-19 Thread GitBox
rok commented on PR #33776: URL: https://github.com/apache/arrow/pull/33776#issuecomment-1397404547 > Well, why should I have a assignee but I cannot edit the description of patch? Should someone be assignee or assign this task to me? ╮( ̄▽ ̄"")╭ I've assigned it to you. Can you check i

[GitHub] [arrow] wjones127 commented on a diff in pull request #33660: GH-33659: [Developer Tools] Add definition of Breaking Change and Critical Fix

2023-01-19 Thread GitBox
wjones127 commented on code in PR #33660: URL: https://github.com/apache/arrow/pull/33660#discussion_r1081643332 ## docs/source/developers/reviewing.rst: ## @@ -255,3 +255,43 @@ Social aspects * Like any communication, code reviews are governed by the Apache `Code of Conduct

[GitHub] [arrow-rs] snmvaughan opened a new issue, #3568: Add rlike string comparison

2023-01-19 Thread GitBox
snmvaughan opened a new issue, #3568: URL: https://github.com/apache/arrow-rs/issues/3568 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Implement a Spark SQL `rlike`, which checks for specific fast paths in order to avoid using

[GitHub] [arrow] lidavidm merged pull request #33768: GH-33767: [Go] Clear out parameter in ArrowArrayStream.get_next

2023-01-19 Thread GitBox
lidavidm merged PR #33768: URL: https://github.com/apache/arrow/pull/33768

[GitHub] [arrow] zeroshade commented on pull request #14111: GH-32946: [Go] Implement REE Array and Compare

2023-01-19 Thread GitBox
zeroshade commented on PR #14111: URL: https://github.com/apache/arrow/pull/14111#issuecomment-1397400985 If there are no further comments before EOD today, I'll merge this.

[GitHub] [arrow] github-actions[bot] commented on pull request #33793: GH-15054: [Python] Finalize S3 on pytest exit to prevent shutdown crashes

2023-01-19 Thread GitBox
github-actions[bot] commented on PR #33793: URL: https://github.com/apache/arrow/pull/33793#issuecomment-1397400817 :warning: GitHub issue #15054 **has been automatically assigned in GitHub** to PR creator.

[GitHub] [arrow] github-actions[bot] commented on pull request #33793: GH-15054: [Python] Finalize S3 on pytest exit to prevent shutdown crashes

2023-01-19 Thread GitBox
github-actions[bot] commented on PR #33793: URL: https://github.com/apache/arrow/pull/33793#issuecomment-1397400770 * Closes: #15054

[GitHub] [arrow] westonpace opened a new pull request, #33793: GH-15054: [Python] Finalize S3 on pytest exit to prevent shutdown crashes

2023-01-19 Thread GitBox
westonpace opened a new pull request, #33793: URL: https://github.com/apache/arrow/pull/33793 For some reason destroying an S3Client at shutdown can cause a crash in our tests. It appears to be because S3's logging is already shut down. I wouldn't think this would happen because I would t

[GitHub] [arrow] pitrou commented on issue #15054: [CI][Python] wheel-manylinux2014-* sometimes crashed on pytest exit

2023-01-19 Thread GitBox
pitrou commented on issue #15054: URL: https://github.com/apache/arrow/issues/15054#issuecomment-1397398440 This may solve the problem in pytest, but will not solve it for users. We may try an `atexit` hook...

[GitHub] [arrow-datafusion] ursabot commented on pull request #4988: re-export substrait crate

2023-01-19 Thread GitBox
ursabot commented on PR #4988: URL: https://github.com/apache/arrow-datafusion/pull/4988#issuecomment-1397396643 Benchmark runs are scheduled for baseline = dde23efed94704044822bcefe49c0af7f9260088 and contender = 5025aa58f3cbb2a949de5afb7a11b5dba869e724. 5025aa58f3cbb2a949de5afb7a11b5dba

[GitHub] [arrow] github-actions[bot] commented on pull request #33792: GH-33789: [Go] Add Err() to RecordReader

2023-01-19 Thread GitBox
github-actions[bot] commented on PR #33792: URL: https://github.com/apache/arrow/pull/33792#issuecomment-1397394446 * Closes: #33789

[GitHub] [arrow] lidavidm opened a new pull request, #33792: GH-33789: [Go] Add Err() to RecordReader

2023-01-19 Thread GitBox
lidavidm opened a new pull request, #33792: URL: https://github.com/apache/arrow/pull/33792 ### Rationale for this change Add Err() to the RecordReader interface so we can report errors. ### Are these changes tested? This is tested in the C Data Interface. ### Are

[GitHub] [arrow] westonpace commented on issue #15054: [CI][Python] wheel-manylinux2014-* sometimes crashed on pytest exit

2023-01-19 Thread GitBox
westonpace commented on issue #15054: URL: https://github.com/apache/arrow/issues/15054#issuecomment-1397392367 Shall we try this: https://github.com/apache/arrow/commit/59cc8cdf1be90da30086325c41d3e7d49d1483b9 ? `finalize_s3` will reset the region resolver before shutting down s3.

[GitHub] [arrow-datafusion] Dandandan commented on pull request #4988: re-export substrait crate

2023-01-19 Thread GitBox
Dandandan commented on PR #4988: URL: https://github.com/apache/arrow-datafusion/pull/4988#issuecomment-1397387893 Thank you @jdye64

[GitHub] [arrow-datafusion] Dandandan merged pull request #4988: re-export substrait crate

2023-01-19 Thread GitBox
Dandandan merged PR #4988: URL: https://github.com/apache/arrow-datafusion/pull/4988

[GitHub] [arrow-datafusion] Dandandan closed issue #4987: Re-Export `substait` crate

2023-01-19 Thread GitBox
Dandandan closed issue #4987: Re-Export `substait` crate URL: https://github.com/apache/arrow-datafusion/issues/4987

[GitHub] [arrow] github-actions[bot] commented on pull request #33791: GH-33782: [Release] Vote email number of issues is querying JIRA and producing a wrong number

2023-01-19 Thread GitBox
github-actions[bot] commented on PR #33791: URL: https://github.com/apache/arrow/pull/33791#issuecomment-1397384518 :warning: GitHub issue #33782 **has been automatically assigned in GitHub** to PR creator.

[GitHub] [arrow] github-actions[bot] commented on pull request #33791: GH-33782: [Release] Vote email number of issues is querying JIRA and producing a wrong number

2023-01-19 Thread GitBox
github-actions[bot] commented on PR #33791: URL: https://github.com/apache/arrow/pull/33791#issuecomment-1397384453 * Closes: #33782

[GitHub] [arrow-rs] tustvold closed issue #2832: Deprecate MutableArrayData

2023-01-19 Thread GitBox
tustvold closed issue #2832: Deprecate MutableArrayData URL: https://github.com/apache/arrow-rs/issues/2832

[GitHub] [arrow-rs] tustvold commented on issue #2832: Deprecate MutableArrayData

2023-01-19 Thread GitBox
tustvold commented on issue #2832: URL: https://github.com/apache/arrow-rs/issues/2832#issuecomment-1397384403 I think I have made peace with the existence of MutableArrayData

[GitHub] [arrow] rok opened a new pull request, #33791: GH-33782: [Release] Vote email number of issues is querying JIRA and producing a wrong number

2023-01-19 Thread GitBox
rok opened a new pull request, #33791: URL: https://github.com/apache/arrow/pull/33791 ### What changes are included in this PR? Release RC vote email now gets issue number and verify release PR's url from GitHub's GraphQL API. ### Are these changes tested? Changes were

[GitHub] [arrow-rs] alamb commented on issue #3566: Implement `Error::Source` for ArrowError and FlightError

2023-01-19 Thread GitBox
alamb commented on issue #3566: URL: https://github.com/apache/arrow-rs/issues/3566#issuecomment-1397371990 Follow on ticket in datafusion: https://github.com/apache/arrow-datafusion/issues/4991

[GitHub] [arrow-rs] alamb commented on a diff in pull request #3567: Implement `std::error::Error::source` for `ArrowError` and `FlightError`

2023-01-19 Thread GitBox
alamb commented on code in PR #3567: URL: https://github.com/apache/arrow-rs/pull/3567#discussion_r1081612923 ## arrow-flight/src/error.rs: ## @@ -52,7 +54,15 @@ impl std::fmt::Display for FlightError { } } -impl std::error::Error for FlightError {} +impl Error for Fligh

[GitHub] [arrow] nealrichardson commented on a diff in pull request #33770: GH-33760: [R][C++] Handle nested field refs in scanner

2023-01-19 Thread GitBox
nealrichardson commented on code in PR #33770: URL: https://github.com/apache/arrow/pull/33770#discussion_r1081612013 ## cpp/src/arrow/dataset/scanner.cc: ## @@ -135,20 +136,19 @@ Result> GetProjectedSchemaFromExpression( const std::shared_ptr& dataset_schema) { // proc

[GitHub] [arrow-datafusion] alamb opened a new issue, #4991: Implement `std::error::Error` for DataFusionError

2023-01-19 Thread GitBox
alamb opened a new issue, #4991: URL: https://github.com/apache/arrow-datafusion/issues/4991 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** In IOx (and in DataFusion) we often want to know what the root cause of an error is (e.g

[GitHub] [arrow-rs] alamb commented on issue #3566: Implement `Error::Source` for ArrowError and FlightError

2023-01-19 Thread GitBox
alamb commented on issue #3566: URL: https://github.com/apache/arrow-rs/issues/3566#issuecomment-1397359936 > I think we should replace the find_root method in DataFusion with this mechanism as well and embrace the standard 💪 On it!

[GitHub] [arrow-rs] crepererum commented on a diff in pull request #3567: Implement `std::error::Error::source` for `ArrowError` and `FlightError`

2023-01-19 Thread GitBox
crepererum commented on code in PR #3567: URL: https://github.com/apache/arrow-rs/pull/3567#discussion_r1081604539 ## arrow-flight/src/error.rs: ## @@ -52,7 +54,15 @@ impl std::fmt::Display for FlightError { } } -impl std::error::Error for FlightError {} +impl Error for

[GitHub] [arrow-rs] crepererum commented on issue #3566: Implement `Error::Source` for ArrowError and FlightError

2023-01-19 Thread GitBox
crepererum commented on issue #3566: URL: https://github.com/apache/arrow-rs/issues/3566#issuecomment-1397355652 I think we should replace the `find_root` method in DataFusion with this mechanism as well and embrace the standard :muscle:

[GitHub] [arrow-rs] alamb commented on a diff in pull request #3567: Implement `std::error::Error::source` for `ArrowError` and `FlightError`

2023-01-19 Thread GitBox
alamb commented on code in PR #3567: URL: https://github.com/apache/arrow-rs/pull/3567#discussion_r1081602031 ## arrow-flight/src/error.rs: ## @@ -52,7 +54,15 @@ impl std::fmt::Display for FlightError { } } -impl std::error::Error for FlightError {} +impl Error for Fligh

[GitHub] [arrow-rs] alamb opened a new pull request, #3567: Implement `std::error::Error::source` for `ArrowError` and `FlightError`

2023-01-19 Thread GitBox
alamb opened a new pull request, #3567: URL: https://github.com/apache/arrow-rs/pull/3567 # Which issue does this PR close? Close https://github.com/apache/arrow-rs/issues/3566 # Rationale for this change In IOx (and in DataFusion) we often want to know what the root cau

[GitHub] [arrow-datafusion] ozankabak commented on pull request #4989: Add support for linear range calculation

2023-01-19 Thread GitBox
ozankabak commented on PR #4989: URL: https://github.com/apache/arrow-datafusion/pull/4989#issuecomment-1397344630 @alamb, I think you will like this. As I was reading the segment tree paper from #4904, one of the remarks therein that stood out to me was that in RANGE frames a simple linea

[GitHub] [arrow] pitrou commented on issue #33765: [Python] Multiple warnings and asserts triggered in debug CPython 3.11

2023-01-19 Thread GitBox
pitrou commented on issue #33765: URL: https://github.com/apache/arrow/issues/33765#issuecomment-1397333607 This is a bit weird: all these classes and methods are generated by Cython and we're not doing anything particularly advanced in that regard. Did you build PyArrow yourself?

[GitHub] [arrow] jorisvandenbossche commented on issue #33763: [Python] pa.map_() ignores child field metadata

2023-01-19 Thread GitBox
jorisvandenbossche commented on issue #33763: URL: https://github.com/apache/arrow/issues/33763#issuecomment-1397332990 Actually, this seems to be working fine for me with the latest pyarrow: ``` In [72]: map_type = pa.map_( ...: pa.field("key", pa.string(), nullable=False,

[GitHub] [arrow] jorisvandenbossche commented on issue #33763: [Python] pa.map_() ignores child field metadata

2023-01-19 Thread GitBox
jorisvandenbossche commented on issue #33763: URL: https://github.com/apache/arrow/issues/33763#issuecomment-1397332098 > I believe it's a bug in pyarrow. Specifically at this line: A new field is created and used but without the metadata of the input field. That line should only

[GitHub] [arrow-rs] alamb opened a new issue, #3566: Implement `Error::Source` for ArrowError and FlightError

2023-01-19 Thread GitBox
alamb opened a new issue, #3566: URL: https://github.com/apache/arrow-rs/issues/3566 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** In IOx (and in DataFusion) we often want to know what the root cause of an error is (e.g was it a

[GitHub] [arrow] pitrou commented on issue #15054: [CI][Python] wheel-manylinux2014-* sometimes crashed on pytest exit

2023-01-19 Thread GitBox
pitrou commented on issue #15054: URL: https://github.com/apache/arrow/issues/15054#issuecomment-1397326538 Yes, it's probably that. Though "S3 has already been destroyed" is a bit vague (is it after DLL unload?).

[GitHub] [arrow] westonpace commented on issue #15054: [CI][Python] wheel-manylinux2014-* sometimes crashed on pytest exit

2023-01-19 Thread GitBox
westonpace commented on issue #15054: URL: https://github.com/apache/arrow/issues/15054#issuecomment-1397323800 Looks like there is a global static `RegionResolver` that has an `S3Client` and that `S3Client` is being destroyed after S3 has already been destroyed?

[GitHub] [arrow] jorisvandenbossche commented on issue #33765: [Python] Multiple warnings and asserts triggered in debug CPython 3.11

2023-01-19 Thread GitBox
jorisvandenbossche commented on issue #33765: URL: https://github.com/apache/arrow/issues/33765#issuecomment-1397317927 cc @pitrou

[GitHub] [arrow-rs] tustvold commented on pull request #3563: Implement Extend for ArrayBuilder (#1841)

2023-01-19 Thread GitBox
tustvold commented on PR #3563: URL: https://github.com/apache/arrow-rs/pull/3563#issuecomment-1397315398 > What's the benefit of this over including an append_all function? https://github.com/apache/arrow-rs/pull/3563/files#diff-cb5b791e20e4536940eecb1466e034510c245d0d443fb89942b8ab9

[GitHub] [arrow-rs] askoa commented on pull request #3563: Implement Extend for ArrayBuilder (#1841)

2023-01-19 Thread GitBox
askoa commented on PR #3563: URL: https://github.com/apache/arrow-rs/pull/3563#issuecomment-1397308505 What's the benefit of this over including an `append_all` function?

[GitHub] [arrow] zeroshade commented on issue #33789: [Go] RecordReader has no way to propagate errors

2023-01-19 Thread GitBox
zeroshade commented on issue #33789: URL: https://github.com/apache/arrow/issues/33789#issuecomment-1397278591 At this point we might as well add the `Err() error` method to the `array.RecordReader` interface, this would allow us to potentially bring the `ipc.Reader` to be an `array.RecordR

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #4958: Expose `sql_to_statement` and `statement_to_plan` on `SessionState`

2023-01-19 Thread GitBox
alamb commented on code in PR #4958: URL: https://github.com/apache/arrow-datafusion/pull/4958#discussion_r1081534824 ## datafusion/core/src/execution/context.rs: ## @@ -1729,6 +1741,15 @@ impl SessionState { query.statement_to_plan(statement) } +/// Creates

[GitHub] [arrow] pitrou commented on issue #33762: [Dev] Remove Jira support from merge script

2023-01-19 Thread GitBox
pitrou commented on issue #33762: URL: https://github.com/apache/arrow/issues/33762#issuecomment-1397269438 A mere lazy consensus wouldn't be enough as it's a disruptive change. It would need an actual PMC vote. Given the lack of reaction on the [previous ML thread](https://lists.apache.org

[GitHub] [arrow-datafusion] alamb commented on issue #4990: Write architecture overview guide:

2023-01-19 Thread GitBox
alamb commented on issue #4990: URL: https://github.com/apache/arrow-datafusion/issues/4990#issuecomment-1397267641 The kind of outline I was thinking of was: `Expr` `LogicalPlan` `PhysicalPlan` `PhysicalExpr` `Execution Model` (aka document the ExecutionPlan

[GitHub] [arrow] lidavidm commented on pull request #33768: GH-33767: [Go] Clear out parameter in ArrowArrayStream.get_next

2023-01-19 Thread GitBox
lidavidm commented on PR #33768: URL: https://github.com/apache/arrow/pull/33768#issuecomment-1397265982 I noticed the unit test doesn't actually assert that err is nil (seems Go doesn't complain about unused return values?) so I'll fix that

[GitHub] [arrow] lidavidm commented on issue #33767: [Go] Exported ArrowArrayStream.get_next doesn't handle uninitialized ArrowArrays well

2023-01-19 Thread GitBox
lidavidm commented on issue #33767: URL: https://github.com/apache/arrow/issues/33767#issuecomment-1397265138 That's what I thought :) (It was expecting it to be zero-initialized, because it would _call the release callback if present_.)

[GitHub] [arrow-datafusion] alamb opened a new issue, #4990: Write architecture overview guide:

2023-01-19 Thread GitBox
alamb opened a new issue, #4990: URL: https://github.com/apache/arrow-datafusion/issues/4990 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** I would like to continue to grow / scale the community of users and contributors to D

[GitHub] [arrow] westonpace commented on issue #33783: [Release][C#] Release verification tasks fail with new version of C# 7.0.x

2023-01-19 Thread GitBox
westonpace commented on issue #33783: URL: https://github.com/apache/arrow/issues/33783#issuecomment-1397261770 > I think we can upgrade the unit test projects to compile for both .NET framework 6.0 and 7.0 but that is not normally done. I take back what I said about "not normally don

[GitHub] [arrow] westonpace commented on issue #33783: [Release][C#] Release verification tasks fail with new version of C# 7.0.x

2023-01-19 Thread GitBox
westonpace commented on issue #33783: URL: https://github.com/apache/arrow/issues/33783#issuecomment-1397257697 > I've found that if I install dotnet 7.0.102 via snap I get the above segmentation fault but if I install it via the Microsoft debian repositories: I also experienced this.

[GitHub] [arrow] pitrou commented on issue #33767: [Go] Exported ArrowArrayStream.get_next doesn't handle uninitialized ArrowArrays well

2023-01-19 Thread GitBox
pitrou commented on issue #33767: URL: https://github.com/apache/arrow/issues/33767#issuecomment-1397255508 The `out` parameter should be handled as an out-parameter, so it can't be expected to be initialized (to what?).

[GitHub] [arrow] westonpace commented on issue #33783: [Release][C#] Release verification tasks fail with new version of C# 7.0.x

2023-01-19 Thread GitBox
westonpace commented on issue #33783: URL: https://github.com/apache/arrow/issues/33783#issuecomment-1397252779 The libraries are (should be) portable. Executables (unit tests in this case) are not portable. The .NET library projects are built to target .net standard and should be c

[GitHub] [arrow] raulcd commented on issue #33783: [Release][C#] Release verification tasks fail with new version of C# 7.0.x

2023-01-19 Thread GitBox
raulcd commented on issue #33783: URL: https://github.com/apache/arrow/issues/33783#issuecomment-1397240744 I've found that if I install dotnet 7.0.102 via snap I get the above segmentation fault but if I install it via the Microsoft debian repositories: ``` $ wget https://packages.mi

[GitHub] [arrow-datafusion-python] dependabot[bot] opened a new pull request, #143: build(deps): bump uuid from 0.8.2 to 1.2.2

2023-01-19 Thread GitBox
dependabot[bot] opened a new pull request, #143: URL: https://github.com/apache/arrow-datafusion-python/pull/143 Bumps [uuid](https://github.com/uuid-rs/uuid) from 0.8.2 to 1.2.2. Release notes Sourced from [uuid's releases](https://github.com/uuid-rs/uuid/releases). 1.2.2

[GitHub] [arrow-datafusion-python] dependabot[bot] closed pull request #96: build(deps): bump uuid from 0.8.2 to 1.2.1

2023-01-19 Thread GitBox
dependabot[bot] closed pull request #96: build(deps): bump uuid from 0.8.2 to 1.2.1 URL: https://github.com/apache/arrow-datafusion-python/pull/96

[GitHub] [arrow-datafusion-python] dependabot[bot] commented on pull request #96: build(deps): bump uuid from 0.8.2 to 1.2.1

2023-01-19 Thread GitBox
dependabot[bot] commented on PR #96: URL: https://github.com/apache/arrow-datafusion-python/pull/96#issuecomment-1397240332 Superseded by #143.

[GitHub] [arrow-datafusion-python] andygrove merged pull request #141: Prepare for 0.8.0 release

2023-01-19 Thread GitBox
andygrove merged PR #141: URL: https://github.com/apache/arrow-datafusion-python/pull/141

[GitHub] [arrow] pitrou commented on issue #15054: [CI][Python] wheel-manylinux2014-* sometimes crashed on pytest exit

2023-01-19 Thread GitBox
pitrou commented on issue #15054: URL: https://github.com/apache/arrow/issues/15054#issuecomment-1397236632 Judging by the reconstructed stack trace (thanks @lidavidm !), this has nothing to do with OpenSSL but with calling a logging method: ```c++ CurlHandleContainer::~CurlHandleConta

[GitHub] [arrow] paleolimbot commented on issue #15054: [CI][Python] wheel-manylinux2014-* sometimes crashed on pytest exit

2023-01-19 Thread GitBox
paleolimbot commented on issue #15054: URL: https://github.com/apache/arrow/issues/15054#issuecomment-1397230707 FWIW this looks very similar to, and may be the same as, #15189, which the R package fixed by skipping on platforms with a very old SSL runtime (macOS 10.13 was the culprit for us).

[GitHub] [arrow] ursabot commented on pull request #33656: GH-33655: [C++][Parquet] Write parquet columns in parallel

2023-01-19 Thread GitBox
ursabot commented on PR #33656: URL: https://github.com/apache/arrow/pull/33656#issuecomment-1397215055 Benchmark runs are scheduled for baseline = 444dcb6779755fc33f3f81d647c188cf31abd23c and contender = c8d6110a26c41966e539e9fa2f5cb8c31dc2f0fe. c8d6110a26c41966e539e9fa2f5cb8c31dc2f0fe is

[GitHub] [arrow] westonpace commented on a diff in pull request #15083: GH-33566: [C++] Add support for nullary and n-ary aggregate functions

2023-01-19 Thread GitBox
westonpace commented on code in PR #15083: URL: https://github.com/apache/arrow/pull/15083#discussion_r1081477937 ## python/pyarrow/_compute.pyx: ## @@ -2202,12 +2202,18 @@ def _group_by(args, keys, aggregations): _pack_compute_args(args, &c_args) _pack_compute_args(ke

[GitHub] [arrow] pitrou merged pull request #33772: GH-15137: [C++][CI] Fix ASAN error in streaming JSON reader tests

2023-01-19 Thread GitBox
pitrou merged PR #33772: URL: https://github.com/apache/arrow/pull/33772 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache

[GitHub] [arrow-datafusion] mustafasrepo opened a new pull request, #4989: Feature/range linear

2023-01-19 Thread GitBox
mustafasrepo opened a new pull request, #4989: URL: https://github.com/apache/arrow-datafusion/pull/4989 # Which issue does this PR close? Closes #4979 # Rationale for this change During range calculation for window frames, we can use linear search instead of

[GitHub] [arrow] thisisnic merged pull request #33778: GH-33777: [R] Nightly builds failing due to dataset test not being skipped on builds without datasets module

2023-01-19 Thread GitBox
thisisnic merged PR #33778: URL: https://github.com/apache/arrow/pull/33778 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apa

[GitHub] [arrow] thisisnic commented on a diff in pull request #33748: GH-33746: [R] Update NEWS.md for 11.0.0

2023-01-19 Thread GitBox
thisisnic commented on code in PR #33748: URL: https://github.com/apache/arrow/pull/33748#discussion_r1081448080 ## r/NEWS.md: ## @@ -19,6 +19,94 @@ # arrow 10.0.1.9000 +## Breaking changes + +* `map_batches()` is lazy by default; it now returns a `RecordBatchReader` + ins

[GitHub] [arrow] thisisnic commented on a diff in pull request #33748: GH-33746: [R] Update NEWS.md for 11.0.0

2023-01-19 Thread GitBox
thisisnic commented on code in PR #33748: URL: https://github.com/apache/arrow/pull/33748#discussion_r1081446799 ## r/NEWS.md: ## @@ -19,6 +19,94 @@ # arrow 10.0.1.9000 +## Breaking changes + +* `map_batches()` is lazy by default; it now returns a `RecordBatchReader` + ins

[GitHub] [arrow-datafusion-python] andygrove commented on pull request #141: Prepare for 0.8.0 release

2023-01-19 Thread GitBox
andygrove commented on PR #141: URL: https://github.com/apache/arrow-datafusion-python/pull/141#issuecomment-1397166051 @Jimexist @francis-du @martin-g fyi -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [arrow-datafusion] jdye64 opened a new pull request, #4988: re-export substrait crate

2023-01-19 Thread GitBox
jdye64 opened a new pull request, #4988: URL: https://github.com/apache/arrow-datafusion/pull/4988 # Which issue does this PR close? Closes #4987 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [arrow] thisisnic commented on a diff in pull request #33748: GH-33746: [R] Update NEWS.md for 11.0.0

2023-01-19 Thread GitBox
thisisnic commented on code in PR #33748: URL: https://github.com/apache/arrow/pull/33748#discussion_r1081441163 ## r/NEWS.md: ## @@ -19,6 +19,94 @@ # arrow 10.0.1.9000 +## Breaking changes + +* `map_batches()` is lazy by default; it now returns a `RecordBatchReader` + ins

[GitHub] [arrow-datafusion] jdye64 opened a new issue, #4987: Re-Export `substrait` crate

2023-01-19 Thread GitBox
jdye64 opened a new issue, #4987: URL: https://github.com/apache/arrow-datafusion/issues/4987 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** We should re-export the `substrait` crate so that library consumers can ensure they are

[GitHub] [arrow] westonpace commented on a diff in pull request #14867: GH-14866: [C++] Remove internal GroupBy implementation

2023-01-19 Thread GitBox
westonpace commented on code in PR #14867: URL: https://github.com/apache/arrow/pull/14867#discussion_r1081433969 ## python/pyarrow/table.pxi: ## @@ -5383,11 +5383,9 @@ list[tuple(str, str, FunctionOptions)] for col_name, (aggr_name, _) in zip(columns, group_by_aggr

[GitHub] [arrow-datafusion] gruuya commented on a diff in pull request #4958: Expose `sql_to_statement` and `statement_to_plan` on `SessionState`

2023-01-19 Thread GitBox
gruuya commented on code in PR #4958: URL: https://github.com/apache/arrow-datafusion/pull/4958#discussion_r1081432997 ## datafusion/core/src/execution/context.rs: ## @@ -1729,6 +1741,15 @@ impl SessionState { query.statement_to_plan(statement) } +/// Creates

[GitHub] [arrow] wjones127 commented on a diff in pull request #33660: GH-33659: [Developer Tools] Add definition of Breaking Change and Critical Fix

2023-01-19 Thread GitBox
wjones127 commented on code in PR #33660: URL: https://github.com/apache/arrow/pull/33660#discussion_r1081432513 ## docs/source/developers/reviewing.rst: ## @@ -255,3 +255,43 @@ Social aspects * Like any communication, code reviews are governed by the Apache `Code of Conduct

[GitHub] [arrow-datafusion-python] andygrove opened a new issue, #142: Add documentation about releasing to conda-forge

2023-01-19 Thread GitBox
andygrove opened a new issue, #142: URL: https://github.com/apache/arrow-datafusion-python/issues/142 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Add documentation about releasing to conda-forge **Describe the solution y

[GitHub] [arrow-datafusion-python] andygrove opened a new pull request, #141: bump version

2023-01-19 Thread GitBox
andygrove opened a new pull request, #141: URL: https://github.com/apache/arrow-datafusion-python/pull/141 # Which issue does this PR close? Closes #. # Rationale for this change # What changes are included in this PR? # Are there any user-facing c

[GitHub] [arrow] nealrichardson commented on a diff in pull request #33748: GH-33746: [R] Update NEWS.md for 11.0.0

2023-01-19 Thread GitBox
nealrichardson commented on code in PR #33748: URL: https://github.com/apache/arrow/pull/33748#discussion_r1081416116 ## r/NEWS.md: ## @@ -19,6 +19,94 @@ # arrow 10.0.1.9000 +## Breaking changes + +* `map_batches()` is lazy by default; it now returns a `RecordBatchReader` +

[GitHub] [arrow] nealrichardson commented on a diff in pull request #33748: GH-33746: [R] Update NEWS.md for 11.0.0

2023-01-19 Thread GitBox
nealrichardson commented on code in PR #33748: URL: https://github.com/apache/arrow/pull/33748#discussion_r1081414171 ## r/NEWS.md: ## @@ -19,6 +19,94 @@ # arrow 10.0.1.9000 +## Breaking changes + +* `map_batches()` is lazy by default; it now returns a `RecordBatchReader` +

[GitHub] [arrow] nealrichardson commented on a diff in pull request #33748: GH-33746: [R] Update NEWS.md for 11.0.0

2023-01-19 Thread GitBox
nealrichardson commented on code in PR #33748: URL: https://github.com/apache/arrow/pull/33748#discussion_r1081413052 ## r/NEWS.md: ## @@ -19,6 +19,94 @@ # arrow 10.0.1.9000 +## Breaking changes + +* `map_batches()` is lazy by default; it now returns a `RecordBatchReader` +

[GitHub] [arrow] nealrichardson commented on a diff in pull request #33748: GH-33746: [R] Update NEWS.md for 11.0.0

2023-01-19 Thread GitBox
nealrichardson commented on code in PR #33748: URL: https://github.com/apache/arrow/pull/33748#discussion_r1081412618 ## r/NEWS.md: ## @@ -19,6 +19,94 @@ # arrow 10.0.1.9000 +## Breaking changes + +* `map_batches()` is lazy by default; it now returns a `RecordBatchReader` +

[GitHub] [arrow-datafusion] DataPsycho opened a new issue, #4986: DataFrame write api should accept Override option when the file exist

2023-01-19 Thread GitBox
DataPsycho opened a new issue, #4986: URL: https://github.com/apache/arrow-datafusion/issues/4986 Problem: While trying to write a csv (any type) file, I am getting the following exception if the file already exists: ``` Error: Execution("Could not create directory data/processed/di

[GitHub] [arrow] nealrichardson commented on a diff in pull request #33748: GH-33746: [R] Update NEWS.md for 11.0.0

2023-01-19 Thread GitBox
nealrichardson commented on code in PR #33748: URL: https://github.com/apache/arrow/pull/33748#discussion_r1081409996 ## r/NEWS.md: ## @@ -19,6 +19,94 @@ # arrow 10.0.1.9000 +## Breaking changes + +* `map_batches()` is lazy by default; it now returns a `RecordBatchReader` +

[GitHub] [arrow] jorisvandenbossche commented on a diff in pull request #15196: GH-15195: [C++][FlightRPC][Python] Add ToString/Equals for Flight types

2023-01-19 Thread GitBox
jorisvandenbossche commented on code in PR #15196: URL: https://github.com/apache/arrow/pull/15196#discussion_r1081408579 ## python/pyarrow/_flight.pyx: ## @@ -536,12 +545,7 @@ cdef class FlightDescriptor(_Weakrefable): return self.descriptor.path def __repr__(se

[GitHub] [arrow-datafusion] alamb merged pull request #4975: [maint-16.x] Prep for release

2023-01-19 Thread GitBox
alamb merged PR #4975: URL: https://github.com/apache/arrow-datafusion/pull/4975 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arro

[GitHub] [arrow] jorisvandenbossche commented on a diff in pull request #15196: GH-15195: [C++][FlightRPC][Python] Add ToString/Equals for Flight types

2023-01-19 Thread GitBox
jorisvandenbossche commented on code in PR #15196: URL: https://github.com/apache/arrow/pull/15196#discussion_r1081404786 ## python/pyarrow/_flight.pyx: ## @@ -536,12 +545,7 @@ cdef class FlightDescriptor(_Weakrefable): return self.descriptor.path def __repr__(se

[GitHub] [arrow-datafusion] alamb commented on pull request #4975: [maint-16.x] Prep for release

2023-01-19 Thread GitBox
alamb commented on PR #4975: URL: https://github.com/apache/arrow-datafusion/pull/4975#issuecomment-1397122516 > I filed https://github.com/apache/arrow-datafusion/issues/4985 for the changelog issue. I will look at this when I have more time. Maybe for this release we can make the c

[GitHub] [arrow-datafusion] andygrove commented on pull request #4975: [maint-16.x] Prep for release

2023-01-19 Thread GitBox
andygrove commented on PR #4975: URL: https://github.com/apache/arrow-datafusion/pull/4975#issuecomment-1397111903 I filed https://github.com/apache/arrow-datafusion/issues/4985 for the changelog issue. I will look at this when I have more time. -- This is an automated message from the A
