[jira] [Created] (ARROW-17264) [Go] Function group by on table
Francisco Garcia created ARROW-17264: Summary: [Go] Function group by on table Key: ARROW-17264 URL: https://issues.apache.org/jira/browse/ARROW-17264 Project: Apache Arrow Issue Type: Wish Components: Go Affects Versions: 8.0.1 Reporter: Francisco Garcia I'm trying to find a way to group data in Apache Arrow with Go, but I couldn't do it. Is there a way to do this, or is it only implemented in C++ and Python? Are there plans to implement this in future releases? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-17193) [C++] Building GCS and tests on M1 MacOS 12.05 is failing.
[ https://issues.apache.org/jira/browse/ARROW-17193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573201#comment-17573201 ] Kouhei Sutou commented on ARROW-17193: -- This is ready but I'm not sure whether we should cherry-pick this to 9.0.0 or not. (I don't oppose it.) Generally, users don't use {{ARROW_BUILD_TESTS=ON}}. I think this doesn't occur without {{ARROW_BUILD_TESTS=ON}}. > [C++] Building GCS and tests on M1 MacOS 12.05 is failing. > -- > > Key: ARROW-17193 > URL: https://issues.apache.org/jira/browse/ARROW-17193 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Affects Versions: 8.0.0 >Reporter: Rok Mihevc >Assignee: Kouhei Sutou >Priority: Major > Labels: pull-request-available > Time Spent: 2h 50m > Remaining Estimate: 0h > > Building GCS and tests on M1 MacOS 12.05 with dependencies installed with > homebrew is failing. > {code:bash} > cmake \ > -GNinja \ > -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \ > -DCMAKE_INSTALL_LIBDIR=lib \ > -DARROW_PYTHON=ON \ > -DARROW_COMPUTE=ON \ > -DARROW_FILESYSTEM=ON \ > -DARROW_CSV=ON \ > -DARROW_GCS=ON \ > -DARROW_INSTALL_NAME_RPATH=OFF \ > -DARROW_BUILD_TESTS=ON \ > -DCMAKE_CXX_STANDARD=17 \ > .. 
> {code} > Env: > {code:bash} > PYARROW_WITH_PARQUET=1 > PYARROW_WITH_DATASET=1 > PYARROW_WITH_ORC=1 > PYARROW_WITH_PARQUET_ENCRYPTION=1 > PYARROW_WITH_PLASMA=1 > PYARROW_WITH_GCS=1 > {code} > Building errors with: > {noformat} > Undefined symbols for architecture arm64: > "absl::lts_20220623::FormatTime(std::__1::basic_string_view std::__1::char_traits >, absl::lts_20220623::Time, > absl::lts_20220623::TimeZone)", referenced from: > arrow::fs::(anonymous > namespace)::GcsIntegrationTest_OpenInputStreamReadMetadata_Test::TestBody() > in gcsfs_test.cc.o > > "absl::lts_20220623::FromChrono(std::__1::chrono::time_point std::__1::chrono::duration > > > const&)", referenced from: > arrow::fs::(anonymous > namespace)::GcsIntegrationTest_OpenInputStreamReadMetadata_Test::TestBody() > in gcsfs_test.cc.o > "absl::lts_20220623::RFC3339_full", referenced from: > arrow::fs::(anonymous > namespace)::GcsFileSystem_ObjectMetadataRoundtrip_Test::TestBody() in > gcsfs_test.cc.o > arrow::fs::(anonymous > namespace)::GcsIntegrationTest_OpenInputStreamReadMetadata_Test::TestBody() > in gcsfs_test.cc.o > "absl::lts_20220623::time_internal::cctz::utc_time_zone()", referenced from: > arrow::fs::(anonymous > namespace)::GcsIntegrationTest_OpenInputStreamReadMetadata_Test::TestBody() > in gcsfs_test.cc.o > "absl::lts_20220623::ToDoubleSeconds(absl::lts_20220623::Duration)", > referenced from: > arrow::fs::(anonymous > namespace)::GcsFileSystem_ObjectMetadataRoundtrip_Test::TestBody() in > gcsfs_test.cc.o > "absl::lts_20220623::Duration::operator-=(absl::lts_20220623::Duration)", > referenced from: > arrow::fs::(anonymous > namespace)::GcsFileSystem_ObjectMetadataRoundtrip_Test::TestBody() in > gcsfs_test.cc.o > "absl::lts_20220623::ParseTime(std::__1::basic_string_view std::__1::char_traits >, std::__1::basic_string_view std::__1::char_traits >, absl::lts_20220623::Time*, > std::__1::basic_string, > std::__1::allocator >*)", referenced from: > arrow::fs::(anonymous > 
namespace)::GcsFileSystem_ObjectMetadataRoundtrip_Test::TestBody() in > gcsfs_test.cc.o > {noformat} > Dependencies installed with: > {noformat} > brew update && brew bundle --file=cpp/Brewfile > {noformat} > See https://github.com/apache/arrow/pull/13681#issuecomment-1193241547 and > https://github.com/apache/arrow/pull/13407 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ARROW-17263) [C++] Utility functions for working with RLE
Tobias Zagorni created ARROW-17263: -- Summary: [C++] Utility functions for working with RLE Key: ARROW-17263 URL: https://issues.apache.org/jira/browse/ARROW-17263 Project: Apache Arrow Issue Type: Sub-task Components: C++ Reporter: Tobias Zagorni Assignee: Tobias Zagorni -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ARROW-17262) [C++] Kernel input type matcher for RLE
Tobias Zagorni created ARROW-17262: -- Summary: [C++] Kernel input type matcher for RLE Key: ARROW-17262 URL: https://issues.apache.org/jira/browse/ARROW-17262 Project: Apache Arrow Issue Type: Sub-task Components: C++ Reporter: Tobias Zagorni Assignee: Tobias Zagorni -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (ARROW-17259) [C++] Use shared_ptr less throughout arrow/compute
[ https://issues.apache.org/jira/browse/ARROW-17259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned ARROW-17259: Assignee: Wes McKinney > [C++] Use shared_ptr less throughout arrow/compute > > > Key: ARROW-17259 > URL: https://issues.apache.org/jira/browse/ARROW-17259 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Assignee: Wes McKinney >Priority: Major > Labels: pull-request-available > Fix For: 10.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > > It turns out we generate a ton of code just copying and manipulating > {{shared_ptr}} throughout arrow/compute, and especially in the > configuration of the function/kernels registry. One function > {{RegisterScalarArithmetic}} generates around 300 KB of code, which on looking > at disassembly contains a significant amount of inlined shared_ptr template > code. I made an attempt at refactoring things to use {{const DataType*}} for > function signatures, which removes quite a bit of code bloat and puts us on a > path to using fewer shared_ptrs in general -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ARROW-17261) [C++] Add type ID, Type and Array classes for RLE
[ https://issues.apache.org/jira/browse/ARROW-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-17261: --- Labels: pull-request-available (was: ) > [C++] Add type ID, Type and Array classes for RLE > - > > Key: ARROW-17261 > URL: https://issues.apache.org/jira/browse/ARROW-17261 > Project: Apache Arrow > Issue Type: Sub-task > Components: C++ >Reporter: Tobias Zagorni >Assignee: Tobias Zagorni >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Mostly picking these parts from ARROW-16772 and ARROW-16781 to create an > easier order to merge things -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-17255) Support JSON logical type in Arrow
[ https://issues.apache.org/jira/browse/ARROW-17255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573184#comment-17573184 ] Rok Mihevc commented on ARROW-17255: This is one of the threads: https://lists.apache.org/thread/3nls3222ggnxlrp0s46rxrcmgbyhgn8t > Support JSON logical type in Arrow > -- > > Key: ARROW-17255 > URL: https://issues.apache.org/jira/browse/ARROW-17255 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, Java, Python >Reporter: Pradeep Gollakota >Priority: Major > > As a BigQuery developer, I would like the Arrow libraries to support the JSON > logical Type. This would enable us to use the JSON type in the Arrow format > of our ReadAPI. This would also enable us to use the JSON type to export data > from BigQuery to Parquet. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ARROW-17261) [C++] Add type ID, Type and Array classes for RLE
Tobias Zagorni created ARROW-17261: -- Summary: [C++] Add type ID, Type and Array classes for RLE Key: ARROW-17261 URL: https://issues.apache.org/jira/browse/ARROW-17261 Project: Apache Arrow Issue Type: Sub-task Components: C++ Reporter: Tobias Zagorni Assignee: Tobias Zagorni Mostly picking these parts from ARROW-16772 and ARROW-16781 to create an easier order to merge things -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ARROW-17259) [C++] Use shared_ptr less throughout arrow/compute
[ https://issues.apache.org/jira/browse/ARROW-17259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-17259: --- Labels: pull-request-available (was: ) > [C++] Use shared_ptr less throughout arrow/compute > > > Key: ARROW-17259 > URL: https://issues.apache.org/jira/browse/ARROW-17259 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Major > Labels: pull-request-available > Fix For: 10.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > It turns out we generate a ton of code just copying and manipulating > {{shared_ptr}} throughout arrow/compute, and especially in the > configuration of the function/kernels registry. One function > {{RegisterScalarArithmetic}} generates around 300 KB of code, which on looking > at disassembly contains a significant amount of inlined shared_ptr template > code. I made an attempt at refactoring things to use {{const DataType*}} for > function signatures, which removes quite a bit of code bloat and puts us on a > path to using fewer shared_ptrs in general -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-17260) [Release] Java jars verification pass despite that nothing has been uploaded
[ https://issues.apache.org/jira/browse/ARROW-17260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573183#comment-17573183 ] Krisztian Szucs commented on ARROW-17260: - This is the second submission after I uploaded and closed the Java release on the Apache Sonatype repo: https://github.com/apache/arrow/pull/13749#issuecomment-129881 > [Release] Java jars verification pass despite that nothing has been uploaded > > > Key: ARROW-17260 > URL: https://issues.apache.org/jira/browse/ARROW-17260 > Project: Apache Arrow > Issue Type: Bug > Components: Developer Tools >Reporter: Krisztian Szucs >Priority: Major > > Builds pass despite the fact that I forgot to upload the Java binaries: > https://github.com/ursacomputing/crossbow/runs/7587084181?check_suite_focus=true > > cc [~assignUser] [~raulcd] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-17193) [C++] Building GCS and tests on M1 MacOS 12.05 is failing.
[ https://issues.apache.org/jira/browse/ARROW-17193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573182#comment-17573182 ] Rok Mihevc commented on ARROW-17193: I think Krisz is [probably open to it|https://ursalabs.zulipchat.com/#narrow/stream/180245-dev/topic/Status.20of.20GCS.20support.3F] if we have a fix. [~kou] please let me know if I can help testing or otherwise! > [C++] Building GCS and tests on M1 MacOS 12.05 is failing. > -- > > Key: ARROW-17193 > URL: https://issues.apache.org/jira/browse/ARROW-17193 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Affects Versions: 8.0.0 >Reporter: Rok Mihevc >Assignee: Kouhei Sutou >Priority: Major > Labels: pull-request-available > Time Spent: 2h 20m > Remaining Estimate: 0h > > Building GCS and tests on M1 MacOS 12.05 with dependencies installed with > homebrew is failing. > {code:bash} > cmake \ > -GNinja \ > -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \ > -DCMAKE_INSTALL_LIBDIR=lib \ > -DARROW_PYTHON=ON \ > -DARROW_COMPUTE=ON \ > -DARROW_FILESYSTEM=ON \ > -DARROW_CSV=ON \ > -DARROW_GCS=ON \ > -DARROW_INSTALL_NAME_RPATH=OFF \ > -DARROW_BUILD_TESTS=ON \ > -DCMAKE_CXX_STANDARD=17 \ > .. 
> {code} > Env: > {code:bash} > PYARROW_WITH_PARQUET=1 > PYARROW_WITH_DATASET=1 > PYARROW_WITH_ORC=1 > PYARROW_WITH_PARQUET_ENCRYPTION=1 > PYARROW_WITH_PLASMA=1 > PYARROW_WITH_GCS=1 > {code} > Building errors with: > {noformat} > Undefined symbols for architecture arm64: > "absl::lts_20220623::FormatTime(std::__1::basic_string_view std::__1::char_traits >, absl::lts_20220623::Time, > absl::lts_20220623::TimeZone)", referenced from: > arrow::fs::(anonymous > namespace)::GcsIntegrationTest_OpenInputStreamReadMetadata_Test::TestBody() > in gcsfs_test.cc.o > > "absl::lts_20220623::FromChrono(std::__1::chrono::time_point std::__1::chrono::duration > > > const&)", referenced from: > arrow::fs::(anonymous > namespace)::GcsIntegrationTest_OpenInputStreamReadMetadata_Test::TestBody() > in gcsfs_test.cc.o > "absl::lts_20220623::RFC3339_full", referenced from: > arrow::fs::(anonymous > namespace)::GcsFileSystem_ObjectMetadataRoundtrip_Test::TestBody() in > gcsfs_test.cc.o > arrow::fs::(anonymous > namespace)::GcsIntegrationTest_OpenInputStreamReadMetadata_Test::TestBody() > in gcsfs_test.cc.o > "absl::lts_20220623::time_internal::cctz::utc_time_zone()", referenced from: > arrow::fs::(anonymous > namespace)::GcsIntegrationTest_OpenInputStreamReadMetadata_Test::TestBody() > in gcsfs_test.cc.o > "absl::lts_20220623::ToDoubleSeconds(absl::lts_20220623::Duration)", > referenced from: > arrow::fs::(anonymous > namespace)::GcsFileSystem_ObjectMetadataRoundtrip_Test::TestBody() in > gcsfs_test.cc.o > "absl::lts_20220623::Duration::operator-=(absl::lts_20220623::Duration)", > referenced from: > arrow::fs::(anonymous > namespace)::GcsFileSystem_ObjectMetadataRoundtrip_Test::TestBody() in > gcsfs_test.cc.o > "absl::lts_20220623::ParseTime(std::__1::basic_string_view std::__1::char_traits >, std::__1::basic_string_view std::__1::char_traits >, absl::lts_20220623::Time*, > std::__1::basic_string, > std::__1::allocator >*)", referenced from: > arrow::fs::(anonymous > 
namespace)::GcsFileSystem_ObjectMetadataRoundtrip_Test::TestBody() in > gcsfs_test.cc.o > {noformat} > Dependencies installed with: > {noformat} > brew update && brew bundle --file=cpp/Brewfile > {noformat} > See https://github.com/apache/arrow/pull/13681#issuecomment-1193241547 and > https://github.com/apache/arrow/pull/13407 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ARROW-17260) [Release] Java jars verification pass despite that nothing has been uploaded
Krisztian Szucs created ARROW-17260: --- Summary: [Release] Java jars verification pass despite that nothing has been uploaded Key: ARROW-17260 URL: https://issues.apache.org/jira/browse/ARROW-17260 Project: Apache Arrow Issue Type: Bug Components: Developer Tools Reporter: Krisztian Szucs Builds pass despite the fact that I forgot to upload the Java binaries: https://github.com/ursacomputing/crossbow/runs/7587084181?check_suite_focus=true cc [~assignUser] [~raulcd] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ARROW-17259) [C++] Use shared_ptr less throughout arrow/compute
Wes McKinney created ARROW-17259: Summary: [C++] Use shared_ptr less throughout arrow/compute Key: ARROW-17259 URL: https://issues.apache.org/jira/browse/ARROW-17259 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Wes McKinney Fix For: 10.0.0 It turns out we generate a ton of code just copying and manipulating {{shared_ptr}} throughout arrow/compute, and especially in the configuration of the function/kernels registry. One function {{RegisterScalarArithmetic}} generates around 300 KB of code, which on looking at disassembly contains a significant amount of inlined shared_ptr template code. I made an attempt at refactoring things to use {{const DataType*}} for function signatures, which removes quite a bit of code bloat and puts us on a path to using fewer shared_ptrs in general -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (ARROW-16929) [C++] Remove ExecBatchIterator and usages thereof
[ https://issues.apache.org/jira/browse/ARROW-16929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-16929. -- Fix Version/s: 9.0.0 Resolution: Fixed Resolved in a related PR > [C++] Remove ExecBatchIterator and usages thereof > - > > Key: ARROW-16929 > URL: https://issues.apache.org/jira/browse/ARROW-16929 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Assignee: Wes McKinney >Priority: Major > Fix For: 9.0.0 > > > The only place left using it is in GroupBy in > arrow/compute/exec/aggregate.cc. This can be refactored to use ExecSpan. > As part of this removal, we should adapt the benchmarks for ExecSpanIterator > to demonstrate the performance improvement there -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ARROW-17258) [C++] Separate VisitTypeInline for types that can exist as a Scalar
[ https://issues.apache.org/jira/browse/ARROW-17258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-17258: --- Labels: pull-request-available (was: ) > [C++] Separate VisitTypeInline for types that can exist as a Scalar > --- > > Key: ARROW-17258 > URL: https://issues.apache.org/jira/browse/ARROW-17258 > Project: Apache Arrow > Issue Type: Sub-task > Components: C++ >Reporter: Tobias Zagorni >Assignee: Tobias Zagorni >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ARROW-17258) [C++] Separate VisitTypeInline for types that can exist as a Scalar
Tobias Zagorni created ARROW-17258: -- Summary: [C++] Separate VisitTypeInline for types that can exist as a Scalar Key: ARROW-17258 URL: https://issues.apache.org/jira/browse/ARROW-17258 Project: Apache Arrow Issue Type: Sub-task Components: C++ Reporter: Tobias Zagorni Assignee: Tobias Zagorni -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (ARROW-17248) [CI][Conan] Enable Zstandard
[ https://issues.apache.org/jira/browse/ARROW-17248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kouhei Sutou resolved ARROW-17248. -- Fix Version/s: 10.0.0 Resolution: Fixed Issue resolved by pull request 13742 [https://github.com/apache/arrow/pull/13742] > [CI][Conan] Enable Zstandard > > > Key: ARROW-17248 > URL: https://issues.apache.org/jira/browse/ARROW-17248 > Project: Apache Arrow > Issue Type: Test > Components: Continuous Integration, Packaging >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Major > Labels: pull-request-available > Fix For: 10.0.0 > > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (ARROW-17249) [CI][Conan] Enable bzip2
[ https://issues.apache.org/jira/browse/ARROW-17249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kouhei Sutou resolved ARROW-17249. -- Fix Version/s: 10.0.0 Resolution: Fixed Issue resolved by pull request 13743 [https://github.com/apache/arrow/pull/13743] > [CI][Conan] Enable bzip2 > > > Key: ARROW-17249 > URL: https://issues.apache.org/jira/browse/ARROW-17249 > Project: Apache Arrow > Issue Type: Test > Components: Continuous Integration, Packaging >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Major > Labels: pull-request-available > Fix For: 10.0.0 > > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (ARROW-17224) [R][Doc] minor error in Linux installation documentation ('conda' option) for R on CRAN
[ https://issues.apache.org/jira/browse/ARROW-17224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573108#comment-17573108 ] Wayne Smith edited comment on ARROW-17224 at 7/29/22 9:18 PM: -- Jacob, I concur. And doing conda -y update conda base (or similar) beforehand (as suggested quite often on StackOverflow) doesn't help (and also takes a long time). The first suggestion in the docs for installing r-arrow on Linux, i.e. installing directly from RStudio (now Posit), is the fastest and works. I just hope the link to the binaries isn't brittle or unreliable (you might want to check that too). I've also gotten it to work with the 'nightly' version hosted on Apache. The compilation is much slower than the RStudio (again, now Posit) approach and also needs (as the docs say) the libcurl4-openssl-dev package. However, my experience is that some (non-sudo) users can't install that package on their distro. One more issue: the RStudio package pull is actually for Ubuntu 18.04, not Ubuntu 20.04 (or even 22.04). It's not clear to me whether that is a bug or a feature over the long run, and it should be documented by RStudio. Even if it is, we might consider documenting that subtle change in the Arrow/Linux/R docs too (just my $.02). Best, Wayne was (Author: JIRAUSER293451): Jacob, I concur. And doing conda -y update conda base (or similar) beforehand (as suggested quite often on StackOverflow) doesn't help (and also takes a long time). The first suggestion in the docs for installing r-arrow on Linux, i.e. installing directly from RStudio (now Posit), is the fastest and works. I just hope the link to the binaries isn't brittle or unreliable (you might want to check that too). I've also gotten it to work with the 'nightly' version hosted on Apache. The compilation is much slower than the RStudio (again, now Posit) approach and also needs (as the docs say) the libcurl4-openssl-dev package. 
However, my experience is that some (non-sudo) users can't install that package on their distro. Best, Wayne > [R][Doc] minor error in Linux installation documentation ('conda' option) for > R on CRAN > --- > > Key: ARROW-17224 > URL: https://issues.apache.org/jira/browse/ARROW-17224 > Project: Apache Arrow > Issue Type: Bug > Components: Documentation, R >Affects Versions: 8.0.1 > Environment: Ubuntu 20.04 >Reporter: Wayne Smith >Priority: Minor > Fix For: 8.0.2 > > Original Estimate: 2h > Remaining Estimate: 2h > > The documentation for the Linux installation for the r-arrow binary for R is > at: > https://cran.r-project.org/web/packages/arrow/vignettes/install.html > The documentation indicates that the 'conda' installation syntax should be: > {code:java} > conda install -c conda-forge --strict-channel-priority r-arrow{code} > I can't get that to work. What works for me is: > {code:java} > conda config --set channel_priority strict > conda install -c conda-forge r-arrow{code} > I'm wondering if the syntax presented in the documentation is either > deprecated or incorrect. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ARROW-17022) [C++] Add unit tests and documentation for swiss-join
[ https://issues.apache.org/jira/browse/ARROW-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-17022: --- Labels: pull-request-available (was: ) > [C++] Add unit tests and documentation for swiss-join > -- > > Key: ARROW-17022 > URL: https://issues.apache.org/jira/browse/ARROW-17022 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Weston Pace >Assignee: Weston Pace >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The swiss join utilities being added as part of ARROW-14182 are not > adequately unit tested at the moment. They have fairly decent coverage from > end-to-end random hash join testing. However, a set of basic unit tests will > help future maintenance by demonstrating basic usage and allowing for more > targeted fixes when a refactor breaks something. I'm doing some of this work > as I review ARROW-14182 anyways so that I can better understand it. Rather > than complicate the review I will open this as a separate follow-up PR. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-12590) [C++][R] Update copies of Homebrew files to reflect recent updates
[ https://issues.apache.org/jira/browse/ARROW-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573158#comment-17573158 ] Jonathan Keane commented on ARROW-12590: Yeah, that should work until the Homebrew maintainers decide to pull it out > [C++][R] Update copies of Homebrew files to reflect recent updates > -- > > Key: ARROW-12590 > URL: https://issues.apache.org/jira/browse/ARROW-12590 > Project: Apache Arrow > Issue Type: Task > Components: C++, R >Reporter: Ian Cook >Assignee: Jacob Wujciak-Jens >Priority: Critical > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > Our copies of the Homebrew formulae at > [https://github.com/apache/arrow/tree/master/dev/tasks/homebrew-formulae] > have drifted out of sync with what's currently in > [https://github.com/Homebrew/homebrew-core/tree/master/Formula] and > [https://github.com/autobrew/homebrew-core/blob/master/Formula|https://github.com/autobrew/homebrew-core/blob/master/Formula/]. > Get them back in sync and consider automating some method of checking that > they are in sync, e.g. by failing the {{homebrew-cpp}} and > {{homebrew-r-autobrew}} nightly tests if our copies don't match what's in > the Homebrew and autobrew repos (but only if there were changes there that > weren't made in our repo, and not the inverse). > Update the instructions at > > [https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-UpdatingHomebrewpackages] > as needed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-17193) [C++] Building GCS and tests on M1 MacOS 12.05 is failing.
[ https://issues.apache.org/jira/browse/ARROW-17193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573157#comment-17573157 ] Ian Cook commented on ARROW-17193: -- [~kou] [~rokm] do you think we could get the patch for this included in the next 9.0.0 release candidate (assuming there will be another release candidate)? > [C++] Building GCS and tests on M1 MacOS 12.05 is failing. > -- > > Key: ARROW-17193 > URL: https://issues.apache.org/jira/browse/ARROW-17193 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Affects Versions: 8.0.0 >Reporter: Rok Mihevc >Assignee: Kouhei Sutou >Priority: Major > Labels: pull-request-available > Time Spent: 2h > Remaining Estimate: 0h > > Building GCS and tests on M1 MacOS 12.05 with dependencies installed with > homebrew is failing. > {code:bash} > cmake \ > -GNinja \ > -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \ > -DCMAKE_INSTALL_LIBDIR=lib \ > -DARROW_PYTHON=ON \ > -DARROW_COMPUTE=ON \ > -DARROW_FILESYSTEM=ON \ > -DARROW_CSV=ON \ > -DARROW_GCS=ON \ > -DARROW_INSTALL_NAME_RPATH=OFF \ > -DARROW_BUILD_TESTS=ON \ > -DCMAKE_CXX_STANDARD=17 \ > .. 
> {code} > Env: > {code:bash} > PYARROW_WITH_PARQUET=1 > PYARROW_WITH_DATASET=1 > PYARROW_WITH_ORC=1 > PYARROW_WITH_PARQUET_ENCRYPTION=1 > PYARROW_WITH_PLASMA=1 > PYARROW_WITH_GCS=1 > {code} > Building errors with: > {noformat} > Undefined symbols for architecture arm64: > "absl::lts_20220623::FormatTime(std::__1::basic_string_view std::__1::char_traits >, absl::lts_20220623::Time, > absl::lts_20220623::TimeZone)", referenced from: > arrow::fs::(anonymous > namespace)::GcsIntegrationTest_OpenInputStreamReadMetadata_Test::TestBody() > in gcsfs_test.cc.o > > "absl::lts_20220623::FromChrono(std::__1::chrono::time_point std::__1::chrono::duration > > > const&)", referenced from: > arrow::fs::(anonymous > namespace)::GcsIntegrationTest_OpenInputStreamReadMetadata_Test::TestBody() > in gcsfs_test.cc.o > "absl::lts_20220623::RFC3339_full", referenced from: > arrow::fs::(anonymous > namespace)::GcsFileSystem_ObjectMetadataRoundtrip_Test::TestBody() in > gcsfs_test.cc.o > arrow::fs::(anonymous > namespace)::GcsIntegrationTest_OpenInputStreamReadMetadata_Test::TestBody() > in gcsfs_test.cc.o > "absl::lts_20220623::time_internal::cctz::utc_time_zone()", referenced from: > arrow::fs::(anonymous > namespace)::GcsIntegrationTest_OpenInputStreamReadMetadata_Test::TestBody() > in gcsfs_test.cc.o > "absl::lts_20220623::ToDoubleSeconds(absl::lts_20220623::Duration)", > referenced from: > arrow::fs::(anonymous > namespace)::GcsFileSystem_ObjectMetadataRoundtrip_Test::TestBody() in > gcsfs_test.cc.o > "absl::lts_20220623::Duration::operator-=(absl::lts_20220623::Duration)", > referenced from: > arrow::fs::(anonymous > namespace)::GcsFileSystem_ObjectMetadataRoundtrip_Test::TestBody() in > gcsfs_test.cc.o > "absl::lts_20220623::ParseTime(std::__1::basic_string_view std::__1::char_traits >, std::__1::basic_string_view std::__1::char_traits >, absl::lts_20220623::Time*, > std::__1::basic_string, > std::__1::allocator >*)", referenced from: > arrow::fs::(anonymous > 
namespace)::GcsFileSystem_ObjectMetadataRoundtrip_Test::TestBody() in > gcsfs_test.cc.o > {noformat} > Dependencies installed with: > {noformat} > brew update && brew bundle --file=cpp/Brewfile > {noformat} > See https://github.com/apache/arrow/pull/13681#issuecomment-1193241547 and > https://github.com/apache/arrow/pull/13407 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ARROW-17256) [Python] Can't call combine_chunks on empty ChunkedArray
[ https://issues.apache.org/jira/browse/ARROW-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicola Crane updated ARROW-17256: - Summary: [Python] Can't call combine_chunks on empty ChunkedArray (was: Can't call combine_chunks on empty ChunkedArray) > [Python] Can't call combine_chunks on empty ChunkedArray > > > Key: ARROW-17256 > URL: https://issues.apache.org/jira/browse/ARROW-17256 > Project: Apache Arrow > Issue Type: Bug > Components: Python > Environment: pyarrow 8.0.0 > python 3.9 >Reporter: &res >Priority: Minor > > When calling: > {code:java} > pa.chunked_array([], type=pa.bool_()).combine_chunks(){code} > I get this error: > {code:java} > pyarrow/table.pxi:700: in pyarrow.lib.ChunkedArray.combine_chunks > ??? > pyarrow/array.pxi:2868: in pyarrow.lib.concat_arrays > ??? > pyarrow/error.pxi:144: in pyarrow.lib.pyarrow_internal_check_status > ??? > pyarrow/error.pxi:100: in pyarrow.lib.check_status > ??? > E pyarrow.lib.ArrowInvalid: Must pass at least one array{code} > While this works: > {code:java} > pa.chunked_array([pa.array([], pa.bool_())], type=pa.bool_()) {code} > In the first case, it should return an empty BoolArray as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-17216) [C++] Support joining tables with non-key fields as list
[ https://issues.apache.org/jira/browse/ARROW-17216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573143#comment-17573143 ] Weston Pace commented on ARROW-17216: - That'd be great. The starting point would be `src/arrow/compute/exec/hash_join_node.cc`. This is where you'll find the check itself that is currently failing, but this is not where most of the join logic lives. Fair warning: the hash-join node has been a bit of a staging ground for performance-critical arrow compute and so it relies on a number of utilities not used elsewhere. As such, this node has a pretty high learning curve at the moment (though my hope is that it will be more diffusely spread throughout the engine in the future). As of the 9.0.0 release (still pending) there are two implementations of hash-join. The basic implementation (HashJoinImpl) is backed by std::unordered_map and can be found in src/arrow/compute/exec/hash_join.h. A newer version (SwissJoin) extends HashJoinImpl and is backed by a custom hash map and is found in src/arrow/compute/exec/swiss_join.h. I'd recommend testing and adding support to the newer version as the work required is going to be similar between the two. Note that the basic version supports dictionary types but the newer one does not (and we just fall back to the basic version if needed), so that is an option if the newer version proves to be trouble. Support for types here is mostly gated by support for some of the alternate views/encodings used by the hash join. One of these is a non-owning ArrayData view called KeyColumnArray which is in src/arrow/compute/light_array.h. This view does not currently support nested data. Note that ArraySpan is pretty similar (see ARROW-17257) and does support nested types (I think) so maybe it makes sense to tackle ARROW-17257 as part of this. The second significant thing is RowTableImpl in src/arrow/compute/row/row_internal.h. This implements a row-major encoding for Arrow data. 
During the hash-join operation, the build data is placed into a table in this row-major form. Then, during materialization, it is converted back to a column-major form. On top of those two key elements there are a number of other utilities like ExecBatchBuilder, RowArray (which should maybe be renamed to RowTable), RowArrayAccessor, RowArrayMerge, the hashing utilities themselves (there are two versions of this too; I'm pretty sure the older implementation uses arrow/util/hashing.h and I know the newer version uses arrow/compute/exec/key_hash.h), etc. So I would probably start by looking at the unit tests that exist for those utilities/encodings (this reminded me that I had some unit tests I had forgotten to push for ARROW-17022, so I will try and get those up today) and try to get the utilities working with nested types. Some of these utilities could probably also use some more unit tests. Once the utilities are working with nested types, you can enable them for the join itself and see what breaks. CC [~michalno] and [~sakras] as they are more knowledgeable in this area and might have some additional input / advice. > [C++] Support joining tables with non-key fields as list > > > Key: ARROW-17216 > URL: https://issues.apache.org/jira/browse/ARROW-17216 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Jayjeet Chakraborty >Priority: Major > Labels: query-engine > > I am trying to join 2 Arrow tables where some columns are of {{list}} > data type. Note that my join columns/keys are primitive data types and some > of my non-join columns/keys are of {{{}list{}}}. But, PyArrow {{join()}} > cannot join such a table, although pandas can. 
It says > {{ArrowInvalid: Data type list is not supported in join non-key > field}} > when I execute this piece of code > {{joined_table = table_1.join(table_2, ['k1', 'k2', 'k3'])}} > A > [stackoverflow|https://stackoverflow.com/questions/73071105/listitem-float-not-supported-in-join-non-key-field] > response pointed out that Arrow currently cannot handle non-fixed types for > joins. Can this be fixed? Or is this intentional? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ARROW-17214) [C++] Implement Scalar CastTo from list types to String
[ https://issues.apache.org/jira/browse/ARROW-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Li updated ARROW-17214: - Summary: [C++] Implement Scalar CastTo from list types to String (was: [C++] Implement Scalar CastTo from all types to String) > [C++] Implement Scalar CastTo from list types to String > --- > > Key: ARROW-17214 > URL: https://issues.apache.org/jira/browse/ARROW-17214 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: David Li >Priority: Major > Labels: good-second-issue, pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > As reported on the mailing list: > https://lists.apache.org/thread/rp7vpjtt4lgtjxj35oyjyqh9b6on94jf > Some types, including LIST, LARGE_LIST, and MAP do not implement casts. > Ideally we'd implement these (implement all to-string casts?) by leveraging > the existing cast for any formattable type. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ARROW-17214) [C++] Implement Scalar CastTo from all types to String
[ https://issues.apache.org/jira/browse/ARROW-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-17214: --- Labels: good-second-issue pull-request-available (was: good-second-issue) > [C++] Implement Scalar CastTo from all types to String > -- > > Key: ARROW-17214 > URL: https://issues.apache.org/jira/browse/ARROW-17214 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: David Li >Priority: Major > Labels: good-second-issue, pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > As reported on the mailing list: > https://lists.apache.org/thread/rp7vpjtt4lgtjxj35oyjyqh9b6on94jf > Some types, including LIST, LARGE_LIST, and MAP do not implement casts. > Ideally we'd implement these (implement all to-string casts?) by leveraging > the existing cast for any formattable type. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-17224) [R][Doc] minor error in Linux installation documentation ('conda' option) for R on CRAN
[ https://issues.apache.org/jira/browse/ARROW-17224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573131#comment-17573131 ] Jacob Wujciak-Jens commented on ARROW-17224: {quote} I just don't hope the link to the binaries is brittle or unreliable (you might want to check that too){quote} Which link do you mean, the RSPM link ("https://packagemanager.rstudio.com/all/__linux__/focal/latest")? This will always give you the newest version. If you want to pin a certain version, you can check the RSPM docs on how to create a time-stamped link. (But I would probably rather use [renv|https://rstudio.github.io/renv/index.html] or [conda-lock|https://anaconda.org/conda-forge/conda-lock] if you require a reproducible environment.) {quote} I've also gotten it to work with the 'nightly' version hosted on Apache. The compilation is much slower than the RStudio instructions{quote} Yes, while we have pre-compiled libarrow binaries and a script that detects which one matches your distro best, we still need to compile the actual R package, which takes ~5 minutes, whereas RSPM (PPM soon :D) supplies package binaries that don't require any compilation. An important note in regards to the nightlies: these are 100% brittle, as we only ever keep 14 versions/days around and delete everything else. So if you require reproducibility I would advise against using them. I have talked to some conda-forge users and they all recommend using [mamba|https://github.com/mamba-org/mamba] when using conda-forge packages, as it has a much faster solver. Something that might need to be added to the docs. 
> [R][Doc] minor error in Linux installation documentation ('conda' option) for > R on CRAN > --- > > Key: ARROW-17224 > URL: https://issues.apache.org/jira/browse/ARROW-17224 > Project: Apache Arrow > Issue Type: Bug > Components: Documentation, R >Affects Versions: 8.0.1 > Environment: Ubuntu 20.04 >Reporter: Wayne Smith >Priority: Minor > Fix For: 8.0.2 > > Original Estimate: 2h > Remaining Estimate: 2h > > The documentation for the Linux installation for the r-arrow binary for R is > at: > https://cran.r-project.org/web/packages/arrow/vignettes/install.html > The documentation indicates that the 'conda' installation syntax should be: > {{}} > {code:java} > conda install -c conda-forge --strict-channel-priority r-arrow{code} > {{}} > I can't get that to work. What works for me is: > {code:java} > conda config --set channel_priority strict > conda install -c conda-forge r-arrow{code} > I'm wondering if the syntax presented in the documentation is either > deprecated or incorrect. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ARROW-17257) [C++] Unify KeyColumnArray and ArraySpan
Weston Pace created ARROW-17257: --- Summary: [C++] Unify KeyColumnArray and ArraySpan Key: ARROW-17257 URL: https://issues.apache.org/jira/browse/ARROW-17257 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Weston Pace Both of these are essentially non-owning views into ArrayData. They were developed somewhat independently but share a pretty similar structure. I don't think we need both and we should unify on a common type for simplicity provided we can show no real performance difference. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-17255) Support JSON logical type in Arrow
[ https://issues.apache.org/jira/browse/ARROW-17255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573115#comment-17573115 ] David Li commented on ARROW-17255: -- Hey - I made a guess at the components, but you may want to follow up on the mailing list (d...@arrow.apache.org) with some more details (e.g. what languages you want to support, at least initially, and any capabilities such an extension type would have, beyond just wrapping a string). There have been other such discussions on 'common' extension types like UUIDs. > Support JSON logical type in Arrow > -- > > Key: ARROW-17255 > URL: https://issues.apache.org/jira/browse/ARROW-17255 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, Java, Python >Reporter: Pradeep Gollakota >Priority: Major > > As a BigQuery developer, I would like the Arrow libraries to support the JSON > logical Type. This would enable us to use the JSON type in the Arrow format > of our ReadAPI. This would also enable us to use the JSON type to export data > from BigQuery to Parquet. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (ARROW-17166) [R] [CI] force_tests() cannot return TRUE
[ https://issues.apache.org/jira/browse/ARROW-17166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Keane resolved ARROW-17166. Resolution: Fixed Issue resolved by pull request 13680 [https://github.com/apache/arrow/pull/13680] > [R] [CI] force_tests() cannot return TRUE > - > > Key: ARROW-17166 > URL: https://issues.apache.org/jira/browse/ARROW-17166 > Project: Apache Arrow > Issue Type: Bug > Components: Continuous Integration, R >Reporter: Rok Mihevc >Assignee: Dragoș Moldovan-Grünfeld >Priority: Major > Labels: CI, pull-request-available > Fix For: 10.0.0 > > Time Spent: 6.5h > Remaining Estimate: 0h > > Update: the OOM has cleared up so the scope of this PR changed. > Old title: [R] [CI] Exclude large memory tests from the force-tests job on CI > = > We have noticed R CI job (AMD64 Ubuntu 20.04 R 4.2 Force-Tests true) failing > on master: > [1|https://github.com/apache/arrow/runs/7424773120?check_suite_focus=true#step:7:5547], > > [2|https://github.com/apache/arrow/runs/7431821192?check_suite_focus=true#step:7:5804], > > [3|https://github.com/apache/arrow/runs/7445803518?check_suite_focus=true#step:7:16305] > with: > {code:java} > Start test: array uses local timezone for POSIXct without timezone > test-Array.R:269:3 [success] > System has not been booted with systemd as init system (PID 1). Can't operate. > Failed to create bus connection: Host is down > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ARROW-15693) [Dev] Update crossbow templates to use master or main
[ https://issues.apache.org/jira/browse/ARROW-15693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-15693: --- Labels: pull-request-available (was: ) > [Dev] Update crossbow templates to use master or main > - > > Key: ARROW-15693 > URL: https://issues.apache.org/jira/browse/ARROW-15693 > Project: Apache Arrow > Issue Type: Sub-task > Components: Developer Tools >Reporter: Neal Richardson >Assignee: Kevin Gurney >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ARROW-17255) Support JSON logical type in Arrow
[ https://issues.apache.org/jira/browse/ARROW-17255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Li updated ARROW-17255: - Component/s: C++ Java Python (was: Archery) > Support JSON logical type in Arrow > -- > > Key: ARROW-17255 > URL: https://issues.apache.org/jira/browse/ARROW-17255 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, Java, Python >Reporter: Pradeep Gollakota >Priority: Major > > As a BigQuery developer, I would like the Arrow libraries to support the JSON > logical Type. This would enable us to use the JSON type in the Arrow format > of our ReadAPI. This would also enable us to use the JSON type to export data > from BigQuery to Parquet. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (ARROW-12590) [C++][R] Update copies of Homebrew files to reflect recent updates
[ https://issues.apache.org/jira/browse/ARROW-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacob Wujciak-Jens reassigned ARROW-12590: -- Assignee: Jacob Wujciak-Jens > [C++][R] Update copies of Homebrew files to reflect recent updates > -- > > Key: ARROW-12590 > URL: https://issues.apache.org/jira/browse/ARROW-12590 > Project: Apache Arrow > Issue Type: Task > Components: C++, R >Reporter: Ian Cook >Assignee: Jacob Wujciak-Jens >Priority: Critical > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > Our copies of the Homebrew formulae at > [https://github.com/apache/arrow/tree/master/dev/tasks/homebrew-formulae] > have drifted out of sync with what's currently in > [https://github.com/Homebrew/homebrew-core/tree/master/Formula] and > [https://github.com/autobrew/homebrew-core/blob/master/Formula|https://github.com/autobrew/homebrew-core/blob/master/Formula/]. > Get them back in sync and consider automating some method of checking that > they are in sync, e.g. by failing the {{homebrew-cpp}} and > {{homebrew-r-autobrew}} nightly tests if our copies don't match what's in > the Homebrew and autobrew repos (but only if there were changes there that > weren't made in our repo, and not the inverse). > Update the instructions at > > [https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-UpdatingHomebrewpackages] > as needed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (ARROW-17224) [R][Doc] minor error in Linux installation documentation ('conda' option) for R on CRAN
[ https://issues.apache.org/jira/browse/ARROW-17224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573108#comment-17573108 ] Wayne Smith edited comment on ARROW-17224 at 7/29/22 6:38 PM: -- Jacob, I concur. And doing conda -y update conda base (or similar) beforehand (as suggested quite often on StackOverflow) doesn't help (and also takes a long time). The first suggestion for installing r-arrow on Linux from the docs–i.e., upgrading directly from Rstudio (now Posit) is the fastest and works. I just don't hope the link to the binaries is brittle or unreliable (you might want to check that too). I've also gotten it to work with the 'nightly' version hosted on Apache. The compilation is much slower than the RStudio instructions (again, now Posit) approach and also needs (as the doc's say) the libcurl4-openssl-dev package. However, my experience is that some (non-sudo) users can't install that package on their distro. Best, Wayne was (Author: JIRAUSER293451): Jacob, I concur. And doing conda -y update conda base (or similar) beforehand (as suggested quite often on StackOverflow) doesn't help (and also takes a long time). The first suggestion for installing r-arrow on Linux from the docs–i.e., upgrading directly from Rstudio (now Posit) is the fastest and works. I just don't hope the link is brittle or unreliable. I've also gotten it to work with the 'nightly' version hosted on Apache. The compilation is much slower than the RStudio instructions (now Posit) approach and also needs (as the doc's say) the libcurl-openssl-dev package. 
However, my experience is that some (non-sudo) users can't install that Wayne > [R][Doc] minor error in Linux installation documentation ('conda' option) for > R on CRAN > --- > > Key: ARROW-17224 > URL: https://issues.apache.org/jira/browse/ARROW-17224 > Project: Apache Arrow > Issue Type: Bug > Components: Documentation, R >Affects Versions: 8.0.1 > Environment: Ubuntu 20.04 >Reporter: Wayne Smith >Priority: Minor > Fix For: 8.0.2 > > Original Estimate: 2h > Remaining Estimate: 2h > > The documentation for the Linux installation for the r-arrow binary for R is > at: > https://cran.r-project.org/web/packages/arrow/vignettes/install.html > The documentation indicates that the 'conda' installation syntax should be: > {{}} > {code:java} > conda install -c conda-forge --strict-channel-priority r-arrow{code} > {{}} > I can't get that to work. What works for me is: > {code:java} > conda config --set channel_priority strict > conda install -c conda-forge r-arrow{code} > I'm wondering if the syntax presented in the documentation is either > deprecated or incorrect. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (ARROW-14802) [R] [CI] Illegal opcode when installing via autobrew
[ https://issues.apache.org/jira/browse/ARROW-14802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacob Wujciak-Jens closed ARROW-14802. -- Resolution: Duplicate > [R] [CI] Illegal opcode when installing via autobrew > > > Key: ARROW-14802 > URL: https://issues.apache.org/jira/browse/ARROW-14802 > Project: Apache Arrow > Issue Type: Bug > Components: Continuous Integration, R >Reporter: Jonathan Keane >Priority: Major > > https://github.com/ursacomputing/crossbow/runs/4295761494?check_suite_focus=true#step:7:664 > {code} > > if (identical(tolower(Sys.getenv("ARROW_R_DEV", "false")), "true")) { > + arrow_reporter <- MultiReporter$new(list(CheckReporter$new(), > LocationReporter$new())) > + } else { > + arrow_reporter <- check_reporter() > + } > > test_check("arrow", reporter = arrow_reporter) > *** caught illegal operation *** > address 0x106462630, cause 'illegal opcode' > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-17224) [R][Doc] minor error in Linux installation documentation ('conda' option) for R on CRAN
[ https://issues.apache.org/jira/browse/ARROW-17224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573108#comment-17573108 ] Wayne Smith commented on ARROW-17224: - Jacob, I concur. And doing conda -y update conda base (or similar) beforehand (as suggested quite often on StackOverflow) doesn't help (and also takes a long time). The first suggestion for installing r-arrow on Linux from the docs–i.e., upgrading directly from Rstudio (now Posit) is the fastest and works. I just don't hope the link is brittle or unreliable. I've also gotten it to work with the 'nightly' version hosted on Apache. The compilation is much slower than the RStudio instructions (now Posit) approach and also needs (as the doc's say) the libcurl-openssl-dev package. However, my experience is that some (non-sudo) users can't install that Wayne > [R][Doc] minor error in Linux installation documentation ('conda' option) for > R on CRAN > --- > > Key: ARROW-17224 > URL: https://issues.apache.org/jira/browse/ARROW-17224 > Project: Apache Arrow > Issue Type: Bug > Components: Documentation, R >Affects Versions: 8.0.1 > Environment: Ubuntu 20.04 >Reporter: Wayne Smith >Priority: Minor > Fix For: 8.0.2 > > Original Estimate: 2h > Remaining Estimate: 2h > > The documentation for the Linux installation for the r-arrow binary for R is > at: > https://cran.r-project.org/web/packages/arrow/vignettes/install.html > The documentation indicates that the 'conda' installation syntax should be: > {{}} > {code:java} > conda install -c conda-forge --strict-channel-priority r-arrow{code} > {{}} > I can't get that to work. What works for me is: > {code:java} > conda config --set channel_priority strict > conda install -c conda-forge r-arrow{code} > I'm wondering if the syntax presented in the documentation is either > deprecated or incorrect. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-5890) [C++][Python] Support ExtensionType arrays in more kernels
[ https://issues.apache.org/jira/browse/ARROW-5890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573106#comment-17573106 ] Clark Zinzow commented on ARROW-5890: - Does allowing extension type implementers to register a cast function sound reasonable? I might be able to take a stab at this (just casting) in the coming months. > [C++][Python] Support ExtensionType arrays in more kernels > -- > > Key: ARROW-5890 > URL: https://issues.apache.org/jira/browse/ARROW-5890 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Joris Van den Bossche >Priority: Major > > From a quick test (through Python), it seems that {{slice}} and {{take}} > work, but the following not: > - {{cast}}: it could rely on the casting rules for the storage type. Or do we > want that you explicitly have to take the storage array before casting? > - {{dictionary_encode}} / {{unique}} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-15481) [R] [CI] Add a crossbow job that mimics CRAN's old macOS
[ https://issues.apache.org/jira/browse/ARROW-15481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573107#comment-17573107 ] Jacob Wujciak-Jens commented on ARROW-15481: Working on getting self-hosted 10.13 runners matching CRAN with r-release and r-oldrel for nightlies and other jobs (as-cran?) open for suggestions. > [R] [CI] Add a crossbow job that mimics CRAN's old macOS > > > Key: ARROW-15481 > URL: https://issues.apache.org/jira/browse/ARROW-15481 > Project: Apache Arrow > Issue Type: Improvement > Components: Continuous Integration, R >Reporter: Jonathan Keane >Assignee: Jacob Wujciak-Jens >Priority: Critical > > Jeroen's autobrew does this using travis: > https://github.com/autobrew/homebrew-core/blob/high-sierra/.travis.yml > It would be good to test this on our own before the release process -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (ARROW-15481) [R] [CI] Add a crossbow job that mimics CRAN's old macOS
[ https://issues.apache.org/jira/browse/ARROW-15481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacob Wujciak-Jens reassigned ARROW-15481: -- Assignee: Jacob Wujciak-Jens > [R] [CI] Add a crossbow job that mimics CRAN's old macOS > > > Key: ARROW-15481 > URL: https://issues.apache.org/jira/browse/ARROW-15481 > Project: Apache Arrow > Issue Type: Improvement > Components: Continuous Integration, R >Reporter: Jonathan Keane >Assignee: Jacob Wujciak-Jens >Priority: Critical > > Jeroen's autobrew does this using travis: > https://github.com/autobrew/homebrew-core/blob/high-sierra/.travis.yml > It would be good to test this on our own before the release process -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ARROW-15481) [R] [CI] Add a crossbow job that mimics CRAN's old macOS
[ https://issues.apache.org/jira/browse/ARROW-15481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacob Wujciak-Jens updated ARROW-15481: --- Priority: Critical (was: Major) > [R] [CI] Add a crossbow job that mimics CRAN's old macOS > > > Key: ARROW-15481 > URL: https://issues.apache.org/jira/browse/ARROW-15481 > Project: Apache Arrow > Issue Type: Improvement > Components: Continuous Integration, R >Reporter: Jonathan Keane >Priority: Critical > > Jeroen's autobrew does this using travis: > https://github.com/autobrew/homebrew-core/blob/high-sierra/.travis.yml > It would be good to test this on our own before the release process -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-17224) [R][Doc] minor error in Linux installation documentation ('conda' option) for R on CRAN
[ https://issues.apache.org/jira/browse/ARROW-17224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573105#comment-17573105 ] Jacob Wujciak-Jens commented on ARROW-17224: Hello, thanks for the ticket. I have replicated this on Ubuntu 20.04, and while it does solve the environment at some point, it takes a very long time (>1h), which is of course not acceptable. I don't know why this happens but will look into it; even if there is a fix, we probably want to update the docs... > [R][Doc] minor error in Linux installation documentation ('conda' option) for > R on CRAN > --- > > Key: ARROW-17224 > URL: https://issues.apache.org/jira/browse/ARROW-17224 > Project: Apache Arrow > Issue Type: Bug > Components: Documentation, R >Affects Versions: 8.0.1 > Environment: Ubuntu 20.04 >Reporter: Wayne Smith >Priority: Minor > Fix For: 8.0.2 > > Original Estimate: 2h > Remaining Estimate: 2h > > The documentation for the Linux installation for the r-arrow binary for R is > at: > https://cran.r-project.org/web/packages/arrow/vignettes/install.html > The documentation indicates that the 'conda' installation syntax should be: > {{}} > {code:java} > conda install -c conda-forge --strict-channel-priority r-arrow{code} > {{}} > I can't get that to work. What works for me is: > {code:java} > conda config --set channel_priority strict > conda install -c conda-forge r-arrow{code} > I'm wondering if the syntax presented in the documentation is either > deprecated or incorrect. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ARROW-17256) Can't call combine_chunks on empty ChunkedArray
&res created ARROW-17256: Summary: Can't call combine_chunks on empty ChunkedArray Key: ARROW-17256 URL: https://issues.apache.org/jira/browse/ARROW-17256 Project: Apache Arrow Issue Type: Bug Components: Python Environment: pyarrow 8.0.0 python 3.9 Reporter: &res When calling: {code:java} pa.chunked_array([], type=pa.bool_()).combine_chunks(){code} I get this error: {code:java} pyarrow/table.pxi:700: in pyarrow.lib.ChunkedArray.combine_chunks ??? pyarrow/array.pxi:2868: in pyarrow.lib.concat_arrays ??? pyarrow/error.pxi:144: in pyarrow.lib.pyarrow_internal_check_status ??? pyarrow/error.pxi:100: in pyarrow.lib.check_status ??? E pyarrow.lib.ArrowInvalid: Must pass at least one array{code} While this works: {code:java} pa.chunked_array([pa.array([], pa.bool_())], type=pa.bool_()) {code} In the first case, it should return an empty BoolArray as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ARROW-17255) Support JSON logical type in Arrow
Pradeep Gollakota created ARROW-17255: - Summary: Support JSON logical type in Arrow Key: ARROW-17255 URL: https://issues.apache.org/jira/browse/ARROW-17255 Project: Apache Arrow Issue Type: Improvement Components: Archery Reporter: Pradeep Gollakota As a BigQuery developer, I would like the Arrow libraries to support the JSON logical Type. This would enable us to use the JSON type in the Arrow format of our ReadAPI. This would also enable us to use the JSON type to export data from BigQuery to Parquet. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (ARROW-17252) [R] Intermittent valgrind failure
[ https://issues.apache.org/jira/browse/ARROW-17252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573068#comment-17573068 ] Dewey Dunnington edited comment on ARROW-17252 at 7/29/22 5:17 PM: --- I can get a similar leak locally, too using a dockerfile: {noformat} FROM ubuntu:20.04 ARG DEBIAN_FRONTEND=noninteractive ENV TZ=America/Halifax RUN apt-get update && apt-get install -y valgrind r-base cmake git libxml2-dev libcurl4-openssl-dev libssl-dev libgit2-dev libfontconfig1-dev libfreetype6-dev libharfbuzz-dev libfribidi-dev libpng-dev libtiff5-dev libjpeg-dev RUN git clone https://github.com/apache/arrow.git /arrow && mkdir /arrow-build && cd /arrow-build && cmake /arrow/cpp -DARROW_CSV=ON -DARROW_FILESYSTEM=ON -DARROW_COMPUTE=ON -DBoost_SOURCE=BUNDLED && cmake --build . && cmake --install . --prefix /arrow-dist RUN R -e 'install.packages(c("devtools", "cpp11", "R6", "assertthat", "bit64", "bit", "cli", "ellipsis", "glue", "magrittr", "purrr", "rlang", "tidyselect", "vctrs", "lubridate", "dplyr", "hms"), repos = "https://cloud.r-project.org";)' ENV ARROW_HOME /arrow-dist ENV LD_LIBRARY_PATH /arrow-dist/lib RUN cd /arrow/r && R CMD INSTALL . 
{noformat} Launching R with valgrind: {noformat} R -d "valgrind --tool=memcheck --leak-check=full" {noformat} ...and I get this leak: {noformat} ==387== 2,608 (72 direct, 2,536 indirect) bytes in 1 blocks are definitely lost in loss record 625 of 4,108 ==387==at 0x484A3C4: operator new(unsigned long) (in /usr/lib/aarch64-linux-gnu/valgrind/vgpreload_memcheck-arm64-linux.so) ==387==by 0x1566648F: arrow::Table::FromRecordBatches(std::shared_ptr, std::vector, std::allocator > > const&) (in /arrow-dist/lib/libarrow.so.900.0.0) ==387==by 0x15629FB7: arrow::RecordBatchReader::ToTable() (in /arrow-dist/lib/libarrow.so.900.0.0) ==387==by 0x1501C503: operator() (compute-exec.cpp:147) ==387==by 0x1501C503: std::_Function_handler > (), ExecPlan_read_table(std::shared_ptr const&, std::shared_ptr const&, cpp11::r_vector, cpp11::r_vector, long)::{lambda()#1}>::_M_invoke(std::_Any_data const&) (std_function.h:286) ==387==by 0x15023427: std::function > ()>::operator()() const (std_function.h:688) ==387==by 0x1502352F: operator() >()>&> (future.h:150) ==387==by 0x1502352F: __invoke_impl >&, std::function >()>&> (invoke.h:60) ==387==by 0x1502352F: __invoke >&, std::function >()>&> (invoke.h:95) ==387==by 0x1502352F: __call (functional:400) ==387==by 0x1502352F: operator()<> (functional:484) ==387==by 0x1502352F: arrow::internal::FnOnce::FnImpl >, std::function > ()>)> >::invoke() (functional.h:152) ==387==by 0x1579636B: std::thread::_State_impl > >::_M_run() (in /arrow-dist/lib/libarrow.so.900.0.0) ==387==by 0x71F4FAB: ??? (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.28) ==387==by 0x55F1623: start_thread (pthread_create.c:477) ==387==by 0x4DA949B: thread_start (clone.S:78) {noformat} (Although this dockerfile doesn't use r-devel...it uses R 3.6 which is a bit old). 
was (Author: paleolimbot): I can get a similar leak locally, too using a dockerfile: {noformat} FROM ubuntu:20.04 ARG DEBIAN_FRONTEND=noninteractive ENV TZ=America/Halifax RUN apt-get update && apt-get install -y valgrind r-base cmake git libxml2-dev libcurl4-openssl-dev libssl-dev libgit2-dev libfontconfig1-dev libfreetype6-dev libharfbuzz-dev libfribidi-dev libpng-dev libtiff5-dev libjpeg-dev RUN git clone https://github.com/apache/arrow.git /arrow && mkdir /arrow-build && cd /arrow-build && cmake /arrow/cpp -DARROW_CSV=ON -DARROW_DATASET=ON -DARROW_FILESYSTEM=ON -DARROW_COMPUTE=ON -DBoost_SOURCE=BUNDLED && cmake --build . && cmake --install . --prefix /arrow-dist RUN R -e 'install.packages(c("devtools", "cpp11", "R6", "assertthat", "bit64", "bit", "cli", "ellipsis", "glue", "magrittr", "purrr", "rlang", "tidyselect", "vctrs", "lubridate", "dplyr", "hms"), repos = "https://cloud.r-project.org";)' ENV ARROW_HOME /arrow-dist ENV LD_LIBRARY_PATH /arrow-dist/lib RUN cd /arrow/r && R CMD INSTALL . {noformat} Launching R with valgrind: {noformat} R -d "valgrind --tool=memcheck --leak-check=full" {noformat} ...and I get this leak: {noformat} ==387== 2,608 (72 direct, 2,536 indirect) bytes in 1 blocks are definitely lost in loss record 625 of 4,108 ==387==at 0x484A3C4: operator new(unsigned long) (in /usr/lib/aarch64-linux-gnu/valgrind/vgpreload_memcheck-arm64-linux.so) ==387==by 0x1566648F: arrow::Table::FromRecordBatches(std::shared_ptr, std::vector, std::allocator > > const&) (in /arrow-dist/lib/libarrow.so.900.0.0) ==387==by 0x15629FB7: arrow::RecordBatchReader::ToTable() (in /arrow-dist/lib/libarrow.so.900.0.0) ==387==by 0x1501C503: operator() (compute-exec.cpp:147) ==387==by 0x1501C503: std::_Function_handler > (), ExecPlan_read_table(std::shared_ptr const&, std::s
[jira] [Commented] (ARROW-17252) [R] Intermittent valgrind failure
[ https://issues.apache.org/jira/browse/ARROW-17252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573068#comment-17573068 ] Dewey Dunnington commented on ARROW-17252: -- I can get a similar leak locally, too using a dockerfile: {noformat} FROM ubuntu:20.04 ARG DEBIAN_FRONTEND=noninteractive ENV TZ=America/Halifax RUN apt-get update && apt-get install -y valgrind r-base cmake git libxml2-dev libcurl4-openssl-dev libssl-dev libgit2-dev libfontconfig1-dev libfreetype6-dev libharfbuzz-dev libfribidi-dev libpng-dev libtiff5-dev libjpeg-dev RUN git clone https://github.com/apache/arrow.git /arrow && mkdir /arrow-build && cd /arrow-build && cmake /arrow/cpp -DARROW_CSV=ON -DARROW_DATASET=ON -DARROW_FILESYSTEM=ON -DARROW_COMPUTE=ON -DBoost_SOURCE=BUNDLED && cmake --build . && cmake --install . --prefix /arrow-dist RUN R -e 'install.packages(c("devtools", "cpp11", "R6", "assertthat", "bit64", "bit", "cli", "ellipsis", "glue", "magrittr", "purrr", "rlang", "tidyselect", "vctrs", "lubridate", "dplyr", "hms"), repos = "https://cloud.r-project.org";)' ENV ARROW_HOME /arrow-dist ENV LD_LIBRARY_PATH /arrow-dist/lib RUN cd /arrow/r && R CMD INSTALL . 
{noformat} Launching R with valgrind: {noformat} R -d "valgrind --tool=memcheck --leak-check=full" {noformat} ...and I get this leak: {noformat} ==387== 2,608 (72 direct, 2,536 indirect) bytes in 1 blocks are definitely lost in loss record 625 of 4,108 ==387==at 0x484A3C4: operator new(unsigned long) (in /usr/lib/aarch64-linux-gnu/valgrind/vgpreload_memcheck-arm64-linux.so) ==387==by 0x1566648F: arrow::Table::FromRecordBatches(std::shared_ptr, std::vector, std::allocator > > const&) (in /arrow-dist/lib/libarrow.so.900.0.0) ==387==by 0x15629FB7: arrow::RecordBatchReader::ToTable() (in /arrow-dist/lib/libarrow.so.900.0.0) ==387==by 0x1501C503: operator() (compute-exec.cpp:147) ==387==by 0x1501C503: std::_Function_handler > (), ExecPlan_read_table(std::shared_ptr const&, std::shared_ptr const&, cpp11::r_vector, cpp11::r_vector, long)::{lambda()#1}>::_M_invoke(std::_Any_data const&) (std_function.h:286) ==387==by 0x15023427: std::function > ()>::operator()() const (std_function.h:688) ==387==by 0x1502352F: operator() >()>&> (future.h:150) ==387==by 0x1502352F: __invoke_impl >&, std::function >()>&> (invoke.h:60) ==387==by 0x1502352F: __invoke >&, std::function >()>&> (invoke.h:95) ==387==by 0x1502352F: __call (functional:400) ==387==by 0x1502352F: operator()<> (functional:484) ==387==by 0x1502352F: arrow::internal::FnOnce::FnImpl >, std::function > ()>)> >::invoke() (functional.h:152) ==387==by 0x1579636B: std::thread::_State_impl > >::_M_run() (in /arrow-dist/lib/libarrow.so.900.0.0) ==387==by 0x71F4FAB: ??? (in /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.28) ==387==by 0x55F1623: start_thread (pthread_create.c:477) ==387==by 0x4DA949B: thread_start (clone.S:78) {noformat} (Although this dockerfile doesn't use r-devel...it uses R 3.6 which is a bit old). 
> [R] Intermittent valgrind failure > - > > Key: ARROW-17252 > URL: https://issues.apache.org/jira/browse/ARROW-17252 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Dewey Dunnington >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > A number of recent nightly builds have intermittent failures with valgrind, > which fails because of possibly leaked memory around an exec plan. This seems > related to a change in XXX that separated {{ExecPlan_prepare()}} from > {{ExecPlan_run()}} and added a {{ExecPlan_read_table()}} that uses > {{RunWithCapturedR()}}. The reported leaks vary but include ExecPlans and > ExecNodes and fields of those objects. > A failed run: > https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=30310&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181&l=24980 > Some example output: > {noformat} > ==5249== 14,112 (384 direct, 13,728 indirect) bytes in 1 blocks are > definitely lost in loss record 1,988 of 3,883 > ==5249==at 0x4849013: operator new(unsigned long) (in > /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) > ==5249==by 0x10B2902B: > std::_Function_handler > (arrow::compute::ExecPlan*, std::vector std::allocator >, arrow::compute::ExecNodeOptions > const&), > arrow::compute::internal::RegisterAggregateNode(arrow::compute::ExecFactoryRegistry*)::{lambda(arrow::compute::ExecPlan*, > std::vector std::allocator >, arrow::compute::ExecNodeOptions > const&)#1}>::_M_invoke(std::_Any_data const&, arrow::compute::ExecPlan*&&, > std::vector std::allocator >&&, > arrow::compute::ExecNodeOptions const&) (exec_plan.h:60) > ==5249==
[jira] [Updated] (ARROW-17067) Implement Substring_Index
[ https://issues.apache.org/jira/browse/ARROW-17067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-17067: Fix Version/s: (was: 9.0.0) > Implement Substring_Index > - > > Key: ARROW-17067 > URL: https://issues.apache.org/jira/browse/ARROW-17067 > Project: Apache Arrow > Issue Type: New Feature >Reporter: Sahaj Gupta >Assignee: Sahaj Gupta >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > Adding Substring_index Function. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-17216) [C++] Support joining tables with non-key fields as list
[ https://issues.apache.org/jira/browse/ARROW-17216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573057#comment-17573057 ] Carlos Maltzahn commented on ARROW-17216: - [~heyjc] and I are willing to help implement support for joining tables with lists in non-key values. But we might need some help on where to start. > [C++] Support joining tables with non-key fields as list > > > Key: ARROW-17216 > URL: https://issues.apache.org/jira/browse/ARROW-17216 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Jayjeet Chakraborty >Priority: Major > Labels: query-engine > > I am trying to join 2 Arrow tables where some columns are of {{list}} > data type. Note that my join columns/keys are primitive data types and some of > my non-join columns/keys are of {{{}list{}}}. But, PyArrow {{join()}} > cannot join such a table, although pandas can. It says > {{ArrowInvalid: Data type list is not supported in join non-key > field}} > when I execute this piece of code > {{joined_table = table_1.join(table_2, ['k1', 'k2', 'k3'])}} > A > [stackoverflow|https://stackoverflow.com/questions/73071105/listitem-float-not-supported-in-join-non-key-field] > response pointed out that Arrow currently cannot handle non-fixed types for > joins. Can this be fixed? Or is this intentional? -- This message was sent by Atlassian Jira (v8.20.10#820010)
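Conceptually, what the reporter asks for is an ordinary hash join in which variable-length (list) payload columns are simply carried through unchanged, since list values never participate in key comparison. A pure-Python sketch of that behavior (this is not PyArrow's implementation; the row layout and helper name are made up for illustration):

```python
def hash_join(left, right, key):
    """Inner-join two lists of dict-rows on `key`.

    Non-key fields, including list-valued ones, pass through untouched,
    because only the key column is ever hashed or compared.
    """
    index = {}
    for row in right:                      # build a hash index on the right side
        index.setdefault(row[key], []).append(row)
    out = []
    for row in left:                       # probe with each left-side row
        for match in index.get(row[key], []):
            merged = dict(row)
            merged.update({k: v for k, v in match.items() if k != key})
            out.append(merged)
    return out

rows = hash_join(
    [{"k1": 1, "vals": [0.1, 0.2]}, {"k1": 2, "vals": [0.3]}],
    [{"k1": 1, "name": "a"}, {"k1": 2, "name": "b"}],
    "k1",
)
```

The gap is in Arrow's columnar join implementation, which, per the StackOverflow answer cited above, cannot yet handle non-fixed-width types in non-key fields.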
[jira] [Updated] (ARROW-17067) Implement Substring_Index
[ https://issues.apache.org/jira/browse/ARROW-17067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-17067: Fix Version/s: 9.0.0 > Implement Substring_Index > - > > Key: ARROW-17067 > URL: https://issues.apache.org/jira/browse/ARROW-17067 > Project: Apache Arrow > Issue Type: New Feature >Reporter: Sahaj Gupta >Assignee: Sahaj Gupta >Priority: Minor > Labels: pull-request-available > Fix For: 9.0.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > Adding Substring_index Function. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (ARROW-17246) [Packaging][deb][RPM] Don't use system jemalloc
[ https://issues.apache.org/jira/browse/ARROW-17246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs resolved ARROW-17246. - Resolution: Fixed Issue resolved by pull request 13739 [https://github.com/apache/arrow/pull/13739] > [Packaging][deb][RPM] Don't use system jemalloc > --- > > Key: ARROW-17246 > URL: https://issues.apache.org/jira/browse/ARROW-17246 > Project: Apache Arrow > Issue Type: Bug > Components: Packaging >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Major > Labels: pull-request-available > Fix For: 9.0.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Because system jemalloc can't be used with {{dlopen()}}. If system jemalloc > can't be used with {{dlopen()}}, our shared libraries can't be loaded as > bindings of script languages such as Ruby: > {noformat} > + ruby -r gi -e 'p GI.load('\''Arrow'\'')' > (null)-WARNING **: Failed to load shared library 'libarrow-glib.so.900' > referenced by the typelib: /lib64/libjemalloc.so.2: cannot allocate memory in > static TLS block > {noformat} > This happens because system jemalloc isn't built with > {{--disable-initial-exec-tls}}. See also: > * https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=951704 > * https://github.com/jemalloc/jemalloc/issues/1237 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ARROW-17254) [C++][FlightRPC] Flight SQL server does not implement GetSchema
[ https://issues.apache.org/jira/browse/ARROW-17254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Li updated ARROW-17254: - Issue Type: Bug (was: Improvement) > [C++][FlightRPC] Flight SQL server does not implement GetSchema > --- > > Key: ARROW-17254 > URL: https://issues.apache.org/jira/browse/ARROW-17254 > Project: Apache Arrow > Issue Type: Bug > Components: C++, FlightRPC >Reporter: David Li >Priority: Major > > This is specified, but not actually implemented! > It needs to be covered in integration tests, too. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ARROW-17254) [C++][FlightRPC] Flight SQL server does not implement GetSchema
David Li created ARROW-17254: Summary: [C++][FlightRPC] Flight SQL server does not implement GetSchema Key: ARROW-17254 URL: https://issues.apache.org/jira/browse/ARROW-17254 Project: Apache Arrow Issue Type: Improvement Components: C++, FlightRPC Reporter: David Li This is specified, but not actually implemented! It needs to be covered in integration tests, too. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (ARROW-17219) [Go] [IPC] Endianness Conversion for Non-native endianness
[ https://issues.apache.org/jira/browse/ARROW-17219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Topol resolved ARROW-17219. --- Resolution: Fixed Issue resolved by pull request 13716 [https://github.com/apache/arrow/pull/13716] > [Go] [IPC] Endianness Conversion for Non-native endianness > -- > > Key: ARROW-17219 > URL: https://issues.apache.org/jira/browse/ARROW-17219 > Project: Apache Arrow > Issue Type: New Feature > Components: Go, Integration >Reporter: Matthew Topol >Assignee: Matthew Topol >Priority: Major > Labels: pull-request-available > Fix For: 10.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (ARROW-15733) array.String offsets int32 overflow
[ https://issues.apache.org/jira/browse/ARROW-15733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Topol resolved ARROW-15733. --- Fix Version/s: 10.0.0 Assignee: Matthew Topol Resolution: Resolved The LargeBinary implementation also added LargeString, allowing int64 offsets for String arrays > array.String offsets int32 overflow > --- > > Key: ARROW-15733 > URL: https://issues.apache.org/jira/browse/ARROW-15733 > Project: Apache Arrow > Issue Type: Bug > Components: Go >Affects Versions: 7.0.0 >Reporter: Andrew Strelsky >Assignee: Matthew Topol >Priority: Minor > Fix For: 10.0.0 > > > {panel} > panic: runtime error: slice bounds out of range [:-1352393031] > goroutine 1 [running]: > github.com/apache/arrow/go/v7/arrow/array.(*String).ValueBytes(...) > > C:/Users/astre/Documents/go/pkg/mod/github.com/apache/arrow/go/v7@v7.0.0/arrow/array/string.go:74 > github.com/apache/arrow/go/v7/arrow/ipc.(*recordEncoder).visit(0xc193b85c80, > 0xc193b9e060, \{0x10b5490, 0xc50820}) > > C:/Users/astre/Documents/go/pkg/mod/github.com/apache/arrow/go/v7@v7.0.0/arrow/ipc/writer.go:435 > +0x2194 > github.com/apache/arrow/go/v7/arrow/ipc.(*recordEncoder).visit(0xc193b85c80, > 0xc193b9e060, \{0x10b5288, 0xc50730}) > > C:/Users/astre/Documents/go/pkg/mod/github.com/apache/arrow/go/v7@v7.0.0/arrow/ipc/writer.go:533 > +0x1431 > github.com/apache/arrow/go/v7/arrow/ipc.(*recordEncoder).Encode(0xc193b85c80, > 0xc193b9e060, \{0x10b5838, 0xc193b8bc80}) > > C:/Users/astre/Documents/go/pkg/mod/github.com/apache/arrow/go/v7@v7.0.0/arrow/ipc/writer.go:267 > +0x98 > github.com/apache/arrow/go/v7/arrow/ipc.(*FileWriter).Write(0xc4e480, > \{0x10b5838, 0xc193b8bc80}) > > C:/Users/astre/Documents/go/pkg/mod/github.com/apache/arrow/go/v7@v7.0.0/arrow/ipc/file_writer.go:342 > +0x20d > main.main() > {panel} > I have *a lot* of strings. The offsets should not only be unsigned but should > also be larger than 4 bytes. 
Changing the offsets to a slice of uint32 was > sufficient in my case but may not be for others. -- This message was sent by Atlassian Jira (v8.20.10#820010)
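The negative slice bound in the panic above ([:-1352393031]) is the signature of 32-bit offset overflow: offsets in a string array are a running sum of value lengths, and once the accumulated payload exceeds 2^31 - 1 bytes the stored int32 wraps negative. A small Python sketch of the arithmetic (the masking helper is illustrative, not Arrow code):

```python
INT32_MAX = 2**31 - 1

def as_int32(n):
    """Simulate storing n in a signed 32-bit integer (two's-complement wrap)."""
    n &= 0xFFFFFFFF
    return n - 2**32 if n >= 2**31 else n

# A running offset just past the int32 limit wraps to a large negative
# value, which then surfaces as an impossible slice bound in the panic.
total_bytes = INT32_MAX + 100          # hypothetical accumulated string bytes
wrapped = as_int32(total_bytes)
```

Moving to int64 offsets (the LargeString/LargeBinary layout mentioned in the resolution) pushes the limit to 2^63 - 1 bytes.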
[jira] [Comment Edited] (ARROW-17253) Pyarrow array crashes the interpreter when encounter 0 division error
[ https://issues.apache.org/jira/browse/ARROW-17253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573021#comment-17573021 ] Li Jin edited comment on ARROW-17253 at 7/29/22 3:02 PM: - I think in general, any exception raised by the generator would crash the python interpreter when passed to pa.array was (Author: icexelloss): I think in general, any exception raised by the generator would crash the python interpreter when passing to pa.array > Pyarrow array crashes the interpreter when encounter 0 division error > --- > > Key: ARROW-17253 > URL: https://issues.apache.org/jira/browse/ARROW-17253 > Project: Apache Arrow > Issue Type: Bug >Reporter: Li Jin >Priority: Major > > {code:java} > pa.array((1 // 0 for x in range(10)), size=10){code} > This would crash the python interpreter -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ARROW-17253) Pyarrow array crashes the interpreter when encounter 0 division error
[ https://issues.apache.org/jira/browse/ARROW-17253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Jin updated ARROW-17253: --- Description: {code:java} pa.array((1 // 0 for x in range(10)), size=10){code} This would crash the python interpreter was: {code:java} pa.array(1 // 0 for x in range(10), size=10){code} This would crash the python interpreter > Pyarrow array crashes the interpreter when encounter 0 division error > --- > > Key: ARROW-17253 > URL: https://issues.apache.org/jira/browse/ARROW-17253 > Project: Apache Arrow > Issue Type: Bug >Reporter: Li Jin >Priority: Major > > {code:java} > pa.array((1 // 0 for x in range(10)), size=10){code} > This would crash the python interpreter -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-17253) Pyarrow array crashes the interpreter when encounter 0 division error
[ https://issues.apache.org/jira/browse/ARROW-17253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573021#comment-17573021 ] Li Jin commented on ARROW-17253: I think in general, any exception raised by the generator would crash the python interpreter when passing to pa.array > Pyarrow array crashes the interpreter when encounter 0 division error > --- > > Key: ARROW-17253 > URL: https://issues.apache.org/jira/browse/ARROW-17253 > Project: Apache Arrow > Issue Type: Bug >Reporter: Li Jin >Priority: Major > > {code:java} > pa.array(1 // 0 for x in range(10), size=10){code} > This would crash the python interpreter -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ARROW-17253) Pyarrow array crashes the interpreter when encounter 0 division error
Li Jin created ARROW-17253: -- Summary: Pyarrow array crashes the interpreter when encounter 0 division error Key: ARROW-17253 URL: https://issues.apache.org/jira/browse/ARROW-17253 Project: Apache Arrow Issue Type: Bug Reporter: Li Jin {code:java} pa.array(1 // 0 for x in range(10), size=10){code} This would crash the python interpreter -- This message was sent by Atlassian Jira (v8.20.10#820010)
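One detail worth noting about the reproducer: a generator expression evaluates nothing until it is advanced, so the ZeroDivisionError fires only during iteration, which in the crashing case happens inside pyarrow's conversion loop. In pure Python the same lazily raised exception propagates normally (stdlib-only sketch; no pyarrow involved):

```python
def values():
    # Nothing in the body runs until the generator is advanced; the
    # division error is raised lazily, one step into iteration.
    for x in range(10):
        yield 1 // 0

caught = False
try:
    list(values())        # advancing the generator triggers the error
except ZeroDivisionError:
    caught = True
```

Materializing the generator with list() before calling pa.array is a possible workaround until the interpreter crash is fixed, assuming the data fits in memory.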
[jira] [Commented] (ARROW-12590) [C++][R] Update copies of Homebrew files to reflect recent updates
[ https://issues.apache.org/jira/browse/ARROW-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573006#comment-17573006 ] Jacob Wujciak-Jens commented on ARROW-12590: Ok. Though I thought that for [ARROW-15678] we had a workaround (setting -O2) in place that should prevent the segfault? > [C++][R] Update copies of Homebrew files to reflect recent updates > -- > > Key: ARROW-12590 > URL: https://issues.apache.org/jira/browse/ARROW-12590 > Project: Apache Arrow > Issue Type: Task > Components: C++, R >Reporter: Ian Cook >Priority: Critical > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > Our copies of the Homebrew formulae at > [https://github.com/apache/arrow/tree/master/dev/tasks/homebrew-formulae] > have drifted out of sync with what's currently in > [https://github.com/Homebrew/homebrew-core/tree/master/Formula] and > [https://github.com/autobrew/homebrew-core/blob/master/Formula|https://github.com/autobrew/homebrew-core/blob/master/Formula/]. > Get them back in sync and consider automating some method of checking that > they are in sync, e.g. by failing the {{homebrew-cpp}} and > {{homebrew-r-autobrew}} nightly tests if our copies don't match what's in > the Homebrew and autobrew repos (but only if there were changes there that > weren't made in our repo, and not the inverse). > Update the instructions at > > [https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-UpdatingHomebrewpackages] > as needed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-12590) [C++][R] Update copies of Homebrew files to reflect recent updates
[ https://issues.apache.org/jira/browse/ARROW-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573003#comment-17573003 ] Jonathan Keane commented on ARROW-12590: Agreed with syncing (and the original intent of this ticket was basically to find a way to detect if and when this happens in order to alert us about it). It is ok that the autobrew and the homebrew formulae are different (since in the newest versions of the autobrew setup, if we are on a modern enough system we _just use brew_). If I'm remembering correctly, https://github.com/apache/arrow/pull/12157/files#diff-4b112dbca2ece7c78e15eb8aff3218e21dd6f4b1fab7cfc9182830488f68ca58R22-R30 was basically the operative code that fixes this. If I were you, I would take the commits on my branch there and create a new branch and push forward with that since it will let you run it in CI. Though the R tests will probably segfault with the simd issue in ARROW-15678. Maybe that's fine (since it's "only" a limited number of computers that this happens on — just so happens the GH runners are one of those, apparently) or maybe we'll need to actually resolve ARROW-15678? > [C++][R] Update copies of Homebrew files to reflect recent updates > -- > > Key: ARROW-12590 > URL: https://issues.apache.org/jira/browse/ARROW-12590 > Project: Apache Arrow > Issue Type: Task > Components: C++, R >Reporter: Ian Cook >Priority: Critical > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > Our copies of the Homebrew formulae at > [https://github.com/apache/arrow/tree/master/dev/tasks/homebrew-formulae] > have drifted out of sync with what's currently in > [https://github.com/Homebrew/homebrew-core/tree/master/Formula] and > [https://github.com/autobrew/homebrew-core/blob/master/Formula|https://github.com/autobrew/homebrew-core/blob/master/Formula/]. > Get them back in sync and consider automating some method of checking that > they are in sync, e.g. 
by failing the {{homebrew-cpp}} and > {{homebrew-r-autobrew}} nightly tests if our copies don't match what's in > the Homebrew and autobrew repos (but only if there were changes there that > weren't made in our repo, and not the inverse). > Update the instructions at > > [https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-UpdatingHomebrewpackages] > as needed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (ARROW-8226) [Go] Add binary builder that uses 64 bit offsets and make binary builders resettable
[ https://issues.apache.org/jira/browse/ARROW-8226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Topol resolved ARROW-8226. -- Fix Version/s: 10.0.0 Resolution: Fixed Issue resolved by pull request 13719 [https://github.com/apache/arrow/pull/13719] > [Go] Add binary builder that uses 64 bit offsets and make binary builders > resettable > > > Key: ARROW-8226 > URL: https://issues.apache.org/jira/browse/ARROW-8226 > Project: Apache Arrow > Issue Type: New Feature > Components: Go >Reporter: Richard >Priority: Minor > Labels: pull-request-available > Fix For: 10.0.0 > > Time Spent: 2.5h > Remaining Estimate: 0h > > I ran into some overflow issues with the existing 32 bit binary builder. My > changes add a new binary builder that uses 64-bit offsets + tests. > I also added a panic for when the 32-bit offset binary builder overflows. > Finally I made both binary builders resettable. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ARROW-17224) [R][Doc] minor error in Linux installation documentation ('conda' option) for R on CRAN
[ https://issues.apache.org/jira/browse/ARROW-17224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-17224: Component/s: R > [R][Doc] minor error in Linux installation documentation ('conda' option) for > R on CRAN > --- > > Key: ARROW-17224 > URL: https://issues.apache.org/jira/browse/ARROW-17224 > Project: Apache Arrow > Issue Type: Bug > Components: Documentation, R >Affects Versions: 8.0.1 > Environment: Ubuntu 20.04 >Reporter: Wayne Smith >Priority: Minor > Fix For: 8.0.2 > > Original Estimate: 2h > Remaining Estimate: 2h > > The documentation for the Linux installation for the r-arrow binary for R is > at: > https://cran.r-project.org/web/packages/arrow/vignettes/install.html > The documentation indicates that the 'conda' installation syntax should be: > {{}} > {code:java} > conda install -c conda-forge --strict-channel-priority r-arrow{code} > {{}} > I can't get that to work. What works for me is: > {code:java} > conda config --set channel_priority strict > conda install -c conda-forge r-arrow{code} > I'm wondering if the syntax presented in the documentation is either > deprecated or incorrect. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ARROW-17224) [R][Doc] minor error in Linux installation documentation ('conda' option) for R on CRAN
[ https://issues.apache.org/jira/browse/ARROW-17224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-17224: Summary: [R][Doc] minor error in Linux installation documentation ('conda' option) for R on CRAN (was: minor error in Linux installation documentation ('conda' option) for R on CRAN) > [R][Doc] minor error in Linux installation documentation ('conda' option) for > R on CRAN > --- > > Key: ARROW-17224 > URL: https://issues.apache.org/jira/browse/ARROW-17224 > Project: Apache Arrow > Issue Type: Bug > Components: Documentation >Affects Versions: 8.0.1 > Environment: Ubuntu 20.04 >Reporter: Wayne Smith >Priority: Minor > Fix For: 8.0.2 > > Original Estimate: 2h > Remaining Estimate: 2h > > The documentation for the Linux installation for the r-arrow binary for R is > at: > https://cran.r-project.org/web/packages/arrow/vignettes/install.html > The documentation indicates that the 'conda' installation syntax should be: > {{}} > {code:java} > conda install -c conda-forge --strict-channel-priority r-arrow{code} > {{}} > I can't get that to work. What works for me is: > {code:java} > conda config --set channel_priority strict > conda install -c conda-forge r-arrow{code} > I'm wondering if the syntax presented in the documentation is either > deprecated or incorrect. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ARROW-17252) [R] Intermittent valgrind failure
[ https://issues.apache.org/jira/browse/ARROW-17252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-17252: --- Labels: pull-request-available (was: ) > [R] Intermittent valgrind failure > - > > Key: ARROW-17252 > URL: https://issues.apache.org/jira/browse/ARROW-17252 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Dewey Dunnington >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > A number of recent nightly builds have intermittent failures with valgrind, > which fails because of possibly leaked memory around an exec plan. This seems > related to a change in XXX that separated {{ExecPlan_prepare()}} from > {{ExecPlan_run()}} and added a {{ExecPlan_read_table()}} that uses > {{RunWithCapturedR()}}. The reported leaks vary but include ExecPlans and > ExecNodes and fields of those objects. > A failed run: > https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=30310&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181&l=24980 > Some example output: > {noformat} > ==5249== 14,112 (384 direct, 13,728 indirect) bytes in 1 blocks are > definitely lost in loss record 1,988 of 3,883 > ==5249==at 0x4849013: operator new(unsigned long) (in > /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) > ==5249==by 0x10B2902B: > std::_Function_handler > (arrow::compute::ExecPlan*, std::vector std::allocator >, arrow::compute::ExecNodeOptions > const&), > arrow::compute::internal::RegisterAggregateNode(arrow::compute::ExecFactoryRegistry*)::{lambda(arrow::compute::ExecPlan*, > std::vector std::allocator >, arrow::compute::ExecNodeOptions > const&)#1}>::_M_invoke(std::_Any_data const&, arrow::compute::ExecPlan*&&, > std::vector std::allocator >&&, > arrow::compute::ExecNodeOptions const&) (exec_plan.h:60) > ==5249==by 0xFA83A0C: > std::function > (arrow::compute::ExecPlan*, std::vector std::allocator >, 
arrow::compute::ExecNodeOptions > const&)>::operator()(arrow::compute::ExecPlan*, > std::vector std::allocator >, arrow::compute::ExecNodeOptions > const&) const (std_function.h:622) > ==5249== 14,528 (160 direct, 14,368 indirect) bytes in 1 blocks are > definitely lost in loss record 1,989 of 3,883 > ==5249==at 0x4849013: operator new(unsigned long) (in > /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) > ==5249==by 0x10096CB7: arrow::FutureImpl::Make() (future.cc:187) > ==5249==by 0xFCB6F9A: arrow::Future::Make() > (future.h:420) > ==5249==by 0x101AE927: ExecPlanImpl (exec_plan.cc:50) > ==5249==by 0x101AE927: > arrow::compute::ExecPlan::Make(arrow::compute::ExecContext*, > std::shared_ptr) (exec_plan.cc:355) > ==5249==by 0xFA77BA2: ExecPlan_create(bool) (compute-exec.cpp:45) > ==5249==by 0xF9FAE9F: _arrow_ExecPlan_create (arrowExports.cpp:868) > ==5249==by 0x4953B60: R_doDotCall (dotcode.c:601) > ==5249==by 0x49C2C16: bcEval (eval.c:7682) > ==5249==by 0x499DB95: Rf_eval (eval.c:748) > ==5249==by 0x49A0904: R_execClosure (eval.c:1918) > ==5249==by 0x49A05B7: Rf_applyClosure (eval.c:1844) > ==5249==by 0x49B2122: bcEval (eval.c:7094) > ==5249== > ==5249== 36,322 (416 direct, 35,906 indirect) bytes in 1 blocks are > definitely lost in loss record 2,929 of 3,883 > ==5249==at 0x4849013: operator new(unsigned long) (in > /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) > ==5249==by 0x10214F92: arrow::compute::TaskScheduler::Make() > (task_util.cc:421) > ==5249==by 0x101AEA6C: ExecPlanImpl (exec_plan.cc:50) > ==5249==by 0x101AEA6C: > arrow::compute::ExecPlan::Make(arrow::compute::ExecContext*, > std::shared_ptr) (exec_plan.cc:355) > ==5249==by 0xFA77BA2: ExecPlan_create(bool) (compute-exec.cpp:45) > ==5249==by 0xF9FAE9F: _arrow_ExecPlan_create (arrowExports.cpp:868) > ==5249==by 0x4953B60: R_doDotCall (dotcode.c:601) > ==5249==by 0x49C2C16: bcEval (eval.c:7682) > ==5249==by 0x499DB95: Rf_eval (eval.c:748) > ==5249==by 0x49A0904: R_execClosure 
(eval.c:1918) > ==5249==by 0x49A05B7: Rf_applyClosure (eval.c:1844) > ==5249==by 0x49B2122: bcEval (eval.c:7094) > ==5249==by 0x499DB95: Rf_eval (eval.c:748) > {noformat} > We also occasionally get leaked Schemas, and in one case a leaked InputType > that seemed completely unrelated to the other leaks (ARROW-17225). > I'm wondering if these have to do with references in lambdas that get passed > by reference? Or perhaps a cache issue? There were some instances in previous > leaks where the backtrace to the {{new}} allocator was different between > reported leaks. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-17252) [R] Intermittent valgrind failure
[ https://issues.apache.org/jira/browse/ARROW-17252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17572957#comment-17572957 ] Dewey Dunnington commented on ARROW-17252: -- Another run that had some other failures, including the {{InputType}} one: https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=30290&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181&l=25107 {noformat} ==5248== 56 bytes in 1 blocks are possibly lost in loss record 171 of 3,993 ==5248==at 0x4849013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==5248==by 0x10547EE7: allocate (new_allocator.h:121) ==5248==by 0x10547EE7: allocate (alloc_traits.h:460) ==5248==at 0x4849013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==5248==by 0x101AFFBA: allocate (new_allocator.h:121) ==5248==by 0x101AFFBA: allocate (alloc_traits.h:460) ==5248==by 0x101AFFBA: _M_allocate (stl_vector.h:346) ==5248==by 0x101AFFBA: void std::vector >::_M_realloc_insert(__gnu_cxx::__normal_iterator > >, arrow::compute::ExecNode*&&) (vector.tcc:440) ==5248==by 0x101AABBA: emplace_back (vector.tcc:121) ==5248==by 0x101AABBA: push_back (stl_vector.h:1204) ==5248==by 0x101AABBA: arrow::compute::ExecNode::ExecNode(arrow::compute::ExecPlan*, std::vector >, std::vector, std::allocator >, std::allocator, std::allocator > > >, std::shared_ptr, int) (exec_plan.cc:414) ==5248==by 0x101AAD22: arrow::compute::MapNode::MapNode(arrow::compute::ExecPlan*, std::vector >, std::shared_ptr, bool) (exec_plan.cc:476) ==5248==by 0x101EC290: ProjectNode (project_node.cc:46) ==5248==by 0x101EC290: EmplaceNode >, std::shared_ptr, std::vector >, bool const&> (exec_plan.h:60) ==5248==by 0x101EC290: arrow::compute::(anonymous namespace)::ProjectNode::Make(arrow::compute::ExecPlan*, std::vector >, arrow::compute::ExecNodeOptions const&) (project_node.cc:73) ==5248==by 0xFC20D83: 
std::_Function_handler (arrow::compute::ExecPlan*, std::vector >, arrow::compute::ExecNodeOptions const&), arrow::Result (*)(arrow::compute::ExecPlan*, std::vector >, arrow::compute::ExecNodeOptions const&)>::_M_invoke(std::_Any_data const&, arrow::compute::ExecPlan*&&, std::vector >&&, arrow::compute::ExecNodeOptions const&) (invoke.h:60) ==5248==by 0xFA838DC: std::function (arrow::compute::ExecPlan*, std::vector >, arrow::compute::ExecNodeOptions const&)>::operator()(arrow::compute::ExecPlan*, std::vector >, arrow::compute::ExecNodeOptions const&) const (std_function.h:622) ==5248==by 0xFA81047: arrow::compute::MakeExecNode(std::__cxx11::basic_string, std::allocator > const&, arrow::compute::ExecPlan*, std::vector >, arrow::compute::ExecNodeOptions const&, arrow::compute::ExecFactoryRegistry*) (exec_plan.h:438) ==5248==by 0xFA77BE8: MakeExecNodeOrStop(std::__cxx11::basic_string, std::allocator > const&, arrow::compute::ExecPlan*, std::vector >, arrow::compute::ExecNodeOptions const&) (compute-exec.cpp:53) ==5248==by 0xFA7ADF2: ExecNode_Project(std::shared_ptr const&, std::vector, std::allocator > > const&, std::vector, std::allocator >, std::allocator, std::allocator > > >) (compute-exec.cpp:307) ==5248==by 0xF9FC997: _arrow_ExecNode_Project (arrowExports.cpp:986) ==5248==by 0x4953BC4: R_doDotCall (dotcode.c:607) {noformat} > [R] Intermittent valgrind failure > - > > Key: ARROW-17252 > URL: https://issues.apache.org/jira/browse/ARROW-17252 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Dewey Dunnington >Priority: Major > > A number of recent nightly builds have intermittent failures with valgrind, > which fails because of possibly leaked memory around an exec plan. This seems > related to a change in XXX that separated {{ExecPlan_prepare()}} from > {{ExecPlan_run()}} and added a {{ExecPlan_read_table()}} that uses > {{RunWithCapturedR()}}. The reported leaks vary but include ExecPlans and > ExecNodes and fields of those objects. 
> A failed run: > https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=30310&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181&l=24980 > Some example output: > {noformat} > ==5249== 14,112 (384 direct, 13,728 indirect) bytes in 1 blocks are > definitely lost in loss record 1,988 of 3,883 > ==5249==at 0x4849013: operator new(unsigned long) (in > /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) > ==5249==by 0x10B2902B: > std::_Function_handler > (arrow::compute::ExecPlan*, std::vector std::allocator >, arrow::compute::ExecNodeOptions > const&), > arrow::compute::internal::RegisterAggregateNode(arrow::compute::ExecFactoryReg
[jira] [Created] (ARROW-17252) [R] Intermittent valgrind failure
Dewey Dunnington created ARROW-17252: Summary: [R] Intermittent valgrind failure Key: ARROW-17252 URL: https://issues.apache.org/jira/browse/ARROW-17252 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Dewey Dunnington A number of recent nightly builds have intermittent failures with valgrind, which fails because of possibly leaked memory around an exec plan. This seems related to a change in XXX that separated {{ExecPlan_prepare()}} from {{ExecPlan_run()}} and added a {{ExecPlan_read_table()}} that uses {{RunWithCapturedR()}}. The reported leaks vary but include ExecPlans and ExecNodes and fields of those objects. A failed run: https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=30310&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181&l=24980 Some example output: {noformat} ==5249== 14,112 (384 direct, 13,728 indirect) bytes in 1 blocks are definitely lost in loss record 1,988 of 3,883 ==5249==at 0x4849013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==5249==by 0x10B2902B: std::_Function_handler (arrow::compute::ExecPlan*, std::vector >, arrow::compute::ExecNodeOptions const&), arrow::compute::internal::RegisterAggregateNode(arrow::compute::ExecFactoryRegistry*)::{lambda(arrow::compute::ExecPlan*, std::vector >, arrow::compute::ExecNodeOptions const&)#1}>::_M_invoke(std::_Any_data const&, arrow::compute::ExecPlan*&&, std::vector >&&, arrow::compute::ExecNodeOptions const&) (exec_plan.h:60) ==5249==by 0xFA83A0C: std::function (arrow::compute::ExecPlan*, std::vector >, arrow::compute::ExecNodeOptions const&)>::operator()(arrow::compute::ExecPlan*, std::vector >, arrow::compute::ExecNodeOptions const&) const (std_function.h:622) ==5249== 14,528 (160 direct, 14,368 indirect) bytes in 1 blocks are definitely lost in loss record 1,989 of 3,883 ==5249==at 0x4849013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) 
==5249==by 0x10096CB7: arrow::FutureImpl::Make() (future.cc:187) ==5249==by 0xFCB6F9A: arrow::Future::Make() (future.h:420) ==5249==by 0x101AE927: ExecPlanImpl (exec_plan.cc:50) ==5249==by 0x101AE927: arrow::compute::ExecPlan::Make(arrow::compute::ExecContext*, std::shared_ptr) (exec_plan.cc:355) ==5249==by 0xFA77BA2: ExecPlan_create(bool) (compute-exec.cpp:45) ==5249==by 0xF9FAE9F: _arrow_ExecPlan_create (arrowExports.cpp:868) ==5249==by 0x4953B60: R_doDotCall (dotcode.c:601) ==5249==by 0x49C2C16: bcEval (eval.c:7682) ==5249==by 0x499DB95: Rf_eval (eval.c:748) ==5249==by 0x49A0904: R_execClosure (eval.c:1918) ==5249==by 0x49A05B7: Rf_applyClosure (eval.c:1844) ==5249==by 0x49B2122: bcEval (eval.c:7094) ==5249== ==5249== 36,322 (416 direct, 35,906 indirect) bytes in 1 blocks are definitely lost in loss record 2,929 of 3,883 ==5249==at 0x4849013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==5249==by 0x10214F92: arrow::compute::TaskScheduler::Make() (task_util.cc:421) ==5249==by 0x101AEA6C: ExecPlanImpl (exec_plan.cc:50) ==5249==by 0x101AEA6C: arrow::compute::ExecPlan::Make(arrow::compute::ExecContext*, std::shared_ptr) (exec_plan.cc:355) ==5249==by 0xFA77BA2: ExecPlan_create(bool) (compute-exec.cpp:45) ==5249==by 0xF9FAE9F: _arrow_ExecPlan_create (arrowExports.cpp:868) ==5249==by 0x4953B60: R_doDotCall (dotcode.c:601) ==5249==by 0x49C2C16: bcEval (eval.c:7682) ==5249==by 0x499DB95: Rf_eval (eval.c:748) ==5249==by 0x49A0904: R_execClosure (eval.c:1918) ==5249==by 0x49A05B7: Rf_applyClosure (eval.c:1844) ==5249==by 0x49B2122: bcEval (eval.c:7094) ==5249==by 0x499DB95: Rf_eval (eval.c:748) {noformat} We also occasionally get leaked Schemas, and in one case a leaked InputType that seemed completely unrelated to the other leaks (ARROW-17225). I'm wondering if these have to do with references in lambdas that get passed by reference? Or perhaps a cache issue? 
There were some instances in previous leaks where the backtrace to the {{new}} allocator was different between reported leaks. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (ARROW-12590) [C++][R] Update copies of Homebrew files to reflect recent updates
[ https://issues.apache.org/jira/browse/ARROW-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17572937#comment-17572937 ] Jacob Wujciak-Jens edited comment on ARROW-12590 at 7/29/22 11:56 AM: -- [~jonkeane] Just so I understand correctly: As far as I can see, we have added dependencies to our version of the formula that are missing from the upstream version; on the other hand, the upstream version has updated the bottle tag/sha, which is likely why we are having issues with that now. So these changes should clearly be synced in both directions. There are some other changes that should obviously be excluded (download url/version + sha) but also some where I am unsure whether we should also sync them down: - a patch step to change the mimalloc version in versions.txt - an addition to the test step (running the cpp tests?) was (Author: JIRAUSER287549): [~jonkeane] Just so I understand correctly: As far as I see we have added dependencies to our version of the formula that are missing from the upstream version on the other hand the upstream version has update the bottle tag/sha which is likely we are having issues with that now. So these changes should clearly be synced in both directions. There are some other changes that should obviously be excluded (download url/version + sha) but also some where I am unsure if we should also sync them down: - a patch step to change the mimalloc version in versions.txt - an addition to the test step (running the cpp tests?) 
> [C++][R] Update copies of Homebrew files to reflect recent updates > -- > > Key: ARROW-12590 > URL: https://issues.apache.org/jira/browse/ARROW-12590 > Project: Apache Arrow > Issue Type: Task > Components: C++, R >Reporter: Ian Cook >Priority: Critical > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > Our copies of the Homebrew formulae at > [https://github.com/apache/arrow/tree/master/dev/tasks/homebrew-formulae] > have drifted out of sync with what's currently in > [https://github.com/Homebrew/homebrew-core/tree/master/Formula] and > [https://github.com/autobrew/homebrew-core/blob/master/Formula|https://github.com/autobrew/homebrew-core/blob/master/Formula/]. > Get them back in sync and consider automating some method of checking that > they are in sync, e.g. by failing the {{homebrew-cpp}} and > {{homebrew-r-autobrew}} nightly tests if our copies don't match what's in > the Homebrew and autobrew repos (but only if there were changes there that > weren't made in our repo, and not the inverse). > Update the instructions at > > [https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-UpdatingHomebrewpackages] > as needed.
[jira] [Commented] (ARROW-12590) [C++][R] Update copies of Homebrew files to reflect recent updates
[ https://issues.apache.org/jira/browse/ARROW-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17572937#comment-17572937 ] Jacob Wujciak-Jens commented on ARROW-12590: [~jonkeane] Just so I understand correctly: As far as I can see, we have added dependencies to our version of the formula that are missing from the upstream version; on the other hand, the upstream version has updated the bottle tag/sha, which is likely why we are having issues with that now. So these changes should clearly be synced in both directions. There are some other changes that should obviously be excluded (download url/version + sha) but also some where I am unsure whether we should also sync them down: - a patch step to change the mimalloc version in versions.txt - an addition to the test step (running the cpp tests?) > [C++][R] Update copies of Homebrew files to reflect recent updates > -- > > Key: ARROW-12590 > URL: https://issues.apache.org/jira/browse/ARROW-12590 > Project: Apache Arrow > Issue Type: Task > Components: C++, R >Reporter: Ian Cook >Priority: Critical > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > Our copies of the Homebrew formulae at > [https://github.com/apache/arrow/tree/master/dev/tasks/homebrew-formulae] > have drifted out of sync with what's currently in > [https://github.com/Homebrew/homebrew-core/tree/master/Formula] and > [https://github.com/autobrew/homebrew-core/blob/master/Formula|https://github.com/autobrew/homebrew-core/blob/master/Formula/]. > Get them back in sync and consider automating some method of checking that > they are in sync, e.g. by failing the {{homebrew-cpp}} and > {{homebrew-r-autobrew}} nightly tests if our copies don't match what's in > the Homebrew and autobrew repos (but only if there were changes there that > weren't made in our repo, and not the inverse). 
> Update the instructions at > > [https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-UpdatingHomebrewpackages] > as needed.
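[Editor's note] The automated sync check the issue asks for could be sketched as a small script. The paths and the list of filtered fields below are assumptions, not the project's actual tooling (the real formulae live under dev/tasks/homebrew-formulae): fields that legitimately differ between our copy and upstream (download url/version, sha, bottle) are filtered out before diffing, so only substantive drift such as added dependencies trips the check.

```shell
#!/bin/sh
# Sketch: compare our vendored formula against the upstream one, ignoring
# lines that are expected to differ (download url/version + sha, bottle).
strip_expected_diffs() {
  grep -v -E '^[[:space:]]*(url|sha256|version|revision|root_url|bottle)' "$1"
}

# Exits 0 when the formulae match after filtering, non-zero on drift.
formula_in_sync() {
  a=$(mktemp); b=$(mktemp)
  strip_expected_diffs "$1" > "$a"
  strip_expected_diffs "$2" > "$b"
  diff "$a" "$b" > /dev/null
  status=$?
  rm -f "$a" "$b"
  return $status
}
```

A nightly job could call formula_in_sync for each formula pair and fail the {{homebrew-cpp}} / {{homebrew-r-autobrew}} builds on a non-zero exit, which matches the "only if there were changes there that weren't made in our repo" behavior the description asks for.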
[jira] [Resolved] (ARROW-14847) [R] Implement bindings for lubridate date/time parsing functions
[ https://issues.apache.org/jira/browse/ARROW-14847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rok Mihevc resolved ARROW-14847. Resolution: Resolved > [R] Implement bindings for lubridate date/time parsing functions > > > Key: ARROW-14847 > URL: https://issues.apache.org/jira/browse/ARROW-14847 > Project: Apache Arrow > Issue Type: New Feature > Components: R >Reporter: Nicola Crane >Priority: Major >
[jira] [Created] (ARROW-17251) [CI][Conan] Enable Flight
Kouhei Sutou created ARROW-17251: Summary: [CI][Conan] Enable Flight Key: ARROW-17251 URL: https://issues.apache.org/jira/browse/ARROW-17251 Project: Apache Arrow Issue Type: Test Components: Continuous Integration, Packaging Reporter: Kouhei Sutou Assignee: Kouhei Sutou
[jira] [Updated] (ARROW-17250) [CI][Conan] Enable utf8proc automatically
[ https://issues.apache.org/jira/browse/ARROW-17250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-17250: --- Labels: pull-request-available (was: ) > [CI][Conan] Enable utf8proc automatically > - > > Key: ARROW-17250 > URL: https://issues.apache.org/jira/browse/ARROW-17250 > Project: Apache Arrow > Issue Type: Test > Components: Continuous Integration, Packaging >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h >
[jira] [Created] (ARROW-17250) [CI][Conan] Enable utf8proc automatically
Kouhei Sutou created ARROW-17250: Summary: [CI][Conan] Enable utf8proc automatically Key: ARROW-17250 URL: https://issues.apache.org/jira/browse/ARROW-17250 Project: Apache Arrow Issue Type: Test Components: Continuous Integration, Packaging Reporter: Kouhei Sutou Assignee: Kouhei Sutou
[jira] [Updated] (ARROW-17249) [CI][Conan] Enable bzip2
[ https://issues.apache.org/jira/browse/ARROW-17249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-17249: --- Labels: pull-request-available (was: ) > [CI][Conan] Enable bzip2 > > > Key: ARROW-17249 > URL: https://issues.apache.org/jira/browse/ARROW-17249 > Project: Apache Arrow > Issue Type: Test > Components: Continuous Integration, Packaging >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h >
[jira] [Created] (ARROW-17249) [CI][Conan] Enable bzip2
Kouhei Sutou created ARROW-17249: Summary: [CI][Conan] Enable bzip2 Key: ARROW-17249 URL: https://issues.apache.org/jira/browse/ARROW-17249 Project: Apache Arrow Issue Type: Test Components: Continuous Integration, Packaging Reporter: Kouhei Sutou Assignee: Kouhei Sutou
[jira] [Updated] (ARROW-17248) [CI][Conan] Enable Zstandard
[ https://issues.apache.org/jira/browse/ARROW-17248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-17248: --- Labels: pull-request-available (was: ) > [CI][Conan] Enable Zstandard > > > Key: ARROW-17248 > URL: https://issues.apache.org/jira/browse/ARROW-17248 > Project: Apache Arrow > Issue Type: Test > Components: Continuous Integration, Packaging >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h >
[jira] [Created] (ARROW-17248) [CI][Conan] Enable Zstandard
Kouhei Sutou created ARROW-17248: Summary: [CI][Conan] Enable Zstandard Key: ARROW-17248 URL: https://issues.apache.org/jira/browse/ARROW-17248 Project: Apache Arrow Issue Type: Test Reporter: Kouhei Sutou Assignee: Kouhei Sutou
[jira] [Updated] (ARROW-17248) [CI][Conan] Enable Zstandard
[ https://issues.apache.org/jira/browse/ARROW-17248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kouhei Sutou updated ARROW-17248: - Component/s: Continuous Integration Packaging > [CI][Conan] Enable Zstandard > > > Key: ARROW-17248 > URL: https://issues.apache.org/jira/browse/ARROW-17248 > Project: Apache Arrow > Issue Type: Test > Components: Continuous Integration, Packaging >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Major >
[jira] [Closed] (ARROW-16027) [C++][CI] The job labeled "AMD64 MacOS 10.15 C++" runs on MacOS 11.6.5
[ https://issues.apache.org/jira/browse/ARROW-16027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kouhei Sutou closed ARROW-16027. Resolution: Duplicate > [C++][CI] The job labeled "AMD64 MacOS 10.15 C++" runs on MacOS 11.6.5 > -- > > Key: ARROW-16027 > URL: https://issues.apache.org/jira/browse/ARROW-16027 > Project: Apache Arrow > Issue Type: Improvement > Components: Continuous Integration >Reporter: Weston Pace >Priority: Major > > The workflow is configured with {{runs-on: macos-latest}}, which is no longer 10.15.
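[Editor's note] A minimal sketch of the likely fix for the label mismatch, in GitHub Actions workflow syntax (the job name is hypothetical): pinning the runner image instead of floating on {{macos-latest}}.

```yaml
jobs:
  amd64-macos-cpp:          # hypothetical job name
    # `macos-latest` floats forward as GitHub retires old images, so a job
    # labeled "MacOS 10.15" can silently start running on macOS 11.
    # Pinning the image keeps the label honest (until that image is retired).
    runs-on: macos-10.15    # was: runs-on: macos-latest
```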