[jira] [Created] (ARROW-16328) [Java] POC Arrow Modular (format module for example)
David Dali Susanibar Arce created ARROW-16328:

Summary: [Java] POC Arrow Modular (format module for example)
Key: ARROW-16328
URL: https://issues.apache.org/jira/browse/ARROW-16328
Project: Apache Arrow
Issue Type: Sub-task
Reporter: David Dali Susanibar Arce

POC to move the Arrow Java modules to single-module or multi-module mode.

Currently Arrow Java is built for JSE 1.8 so that it can be consumed by JSE 11, 17 and 18 in *legacy mode*, which is enabled when the compilation environment is defined by {{--source}} and {{--target}}.

This POC is to validate the changes needed in case Arrow Java decides to implement "*Single module mode*" or "*Multi-module mode*".

--
This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (ARROW-16327) [Java][CI]: Add support for Java 17 CI process
David Dali Susanibar Arce created ARROW-16327:

Summary: [Java][CI]: Add support for Java 17 CI process
Key: ARROW-16327
URL: https://issues.apache.org/jira/browse/ARROW-16327
Project: Apache Arrow
Issue Type: Sub-task
Components: Java
Affects Versions: 9.0.0
Reporter: David Dali Susanibar Arce

Currently the Arrow Java code is tested with JSE 11. This ticket is for planning and mapping the activities involved in also offering support for JSE 17.
[jira] [Created] (ARROW-16326) [C++][Python] Add GCS Timeout parameter for GCS FileSystem.
Micah Kornfield created ARROW-16326:

Summary: [C++][Python] Add GCS Timeout parameter for GCS FileSystem.
Key: ARROW-16326
URL: https://issues.apache.org/jira/browse/ARROW-16326
Project: Apache Arrow
Issue Type: Improvement
Components: C++, Python
Reporter: Micah Kornfield
Assignee: Micah Kornfield

Follow-up from [https://github.com/apache/arrow/pull/12763]: if the GCS testbench isn't installed properly, the failure mode is test timeouts because the connection hangs. We should add a timeout parameter to prevent this.
[jira] [Created] (ARROW-16325) [R] Add task for R package with gcc12
Dewey Dunnington created ARROW-16325: Summary: [R] Add task for R package with gcc12 Key: ARROW-16325 URL: https://issues.apache.org/jira/browse/ARROW-16325 Project: Apache Arrow Issue Type: Improvement Reporter: Dewey Dunnington We now have a check for gcc11; however, gcc11 has been the default on debian/testing for some time. The CRAN debian image now uses gcc12, so we should update the gcc11 task to use gcc12 here: https://github.com/apache/arrow/blob/0e03af446c328d0ef963510c3292cb14e092b917/dev/tasks/tasks.yml#L1319-L1328 -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (ARROW-16323) [Go] Implement Dictionary Scalars
Matthew Topol created ARROW-16323: - Summary: [Go] Implement Dictionary Scalars Key: ARROW-16323 URL: https://issues.apache.org/jira/browse/ARROW-16323 Project: Apache Arrow Issue Type: New Feature Components: Go Reporter: Matthew Topol -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (ARROW-16324) [Go] Implement Dictionary Unification
Matthew Topol created ARROW-16324: - Summary: [Go] Implement Dictionary Unification Key: ARROW-16324 URL: https://issues.apache.org/jira/browse/ARROW-16324 Project: Apache Arrow Issue Type: New Feature Components: Go Reporter: Matthew Topol -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (ARROW-16322) [Release][C++] Windows source verification should not mutate source tree
Antoine Pitrou created ARROW-16322: -- Summary: [Release][C++] Windows source verification should not mutate source tree Key: ARROW-16322 URL: https://issues.apache.org/jira/browse/ARROW-16322 Project: Apache Arrow Issue Type: Bug Components: C++, Continuous Integration, Developer Tools Reporter: Antoine Pitrou {{verify-release-candidate.bat}} creates a temporary dir for verification, but for some reason it still creates its CMake build dir inside the Arrow source tree. It should instead create the CMake build dir inside the temporary verification dir. -- This message was sent by Atlassian Jira (v8.20.7#820007)
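The requested layout (CMake build dir inside the temporary verification dir rather than inside the source tree) can be illustrated with a rough Python sketch. This is not the contents of {{verify-release-candidate.bat}}; the helper name {{plan_out_of_tree_build}}, the {{cpp-build}} directory name and the {{arrow-verify-}} prefix are all hypothetical. It only relies on CMake's {{-S}}/{{-B}} options for selecting the source and build directories.

```python
import tempfile
from pathlib import Path


def plan_out_of_tree_build(source_dir: Path, verify_dir: Path) -> list:
    """Create the build dir under verify_dir and return a CMake configure command.

    Hypothetical helper: the directory name "cpp-build" is an assumption.
    """
    build_dir = verify_dir / "cpp-build"
    build_dir.mkdir(parents=True, exist_ok=True)
    # -S selects the source tree, -B the (separate) build tree.
    return ["cmake", "-S", str(source_dir), "-B", str(build_dir)]


with tempfile.TemporaryDirectory(prefix="arrow-verify-") as tmp:
    verify_dir = Path(tmp)
    source_dir = verify_dir / "arrow" / "cpp"  # stand-in for the unpacked source
    source_dir.mkdir(parents=True)
    command = plan_out_of_tree_build(source_dir, verify_dir)
    build_dir = Path(command[-1])
    # The source tree stays untouched: the build dir is not inside it.
    print(source_dir in build_dir.parents)
```

Keeping the build tree as a sibling of the source tree means that removing the temporary verification dir cleans up everything, and the source tree can be checked for modifications afterwards.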
[jira] [Created] (ARROW-16321) [Release][C++] Windows source verification hardcodes Visual Studio version
Antoine Pitrou created ARROW-16321: -- Summary: [Release][C++] Windows source verification hardcodes Visual Studio version Key: ARROW-16321 URL: https://issues.apache.org/jira/browse/ARROW-16321 Project: Apache Arrow Issue Type: Improvement Components: C++, Continuous Integration, Developer Tools Reporter: Antoine Pitrou {{verify-release-candidate.bat}} currently hardcodes the CMake generator to "Visual Studio 16 2019". Ideally it should be able to use any Visual Studio version present on the system. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (ARROW-16320) Dataset re-partitioning consumes considerable amount of memory
Zsolt Kegyes-Brassai created ARROW-16320:

Summary: Dataset re-partitioning consumes considerable amount of memory
Key: ARROW-16320
URL: https://issues.apache.org/jira/browse/ARROW-16320
Project: Apache Arrow
Issue Type: Improvement
Affects Versions: 7.0.0
Reporter: Zsolt Kegyes-Brassai

A short background: I was trying to create a dataset from a big pile of csv files (a couple of hundred). In the first step the csv files were parsed and saved to parquet files, because there were many inconsistencies between the csv files. In a subsequent step the dataset was re-partitioned using one column (code_key).

{code:java}
new_dataset <- open_dataset(
  temp_parquet_folder,
  format = "parquet",
  unify_schemas = TRUE
)

new_dataset |>
  group_by(code_key) |>
  write_dataset(
    folder_repartitioned_dataset,
    format = "parquet"
  )
{code}

This re-partitioning consumed a considerable amount of memory (5 GB).
 * Is this normal behavior, or a bug?
 * Is there any rule of thumb to estimate the memory requirement for a dataset re-partitioning? (It's important when scaling up this approach.)

The drawback is that this memory is not freed after the re-partitioning (I am using RStudio). {{gc()}} is useless in this situation, and there is no object associated with the re-partitioning in the {{R}} environment that could be removed from memory (using the {{rm()}} function).
 * How can one regain the memory used by the re-partitioning?

The rationale behind choosing dataset re-partitioning: if my understanding is correct, in the current arrow version appending does not work when writing parquet files/datasets. (The original csv files were partly partitioned according to a different variable.)

Can you recommend any better approach?
[jira] [Created] (ARROW-16319) [R] [Docs] Add a list of the lubridate functions we support in {arrow}
Dragoș Moldovan-Grünfeld created ARROW-16319: Summary: [R] [Docs] Add a list of the lubridate functions we support in {arrow} Key: ARROW-16319 URL: https://issues.apache.org/jira/browse/ARROW-16319 Project: Apache Arrow Issue Type: Improvement Components: Documentation, R Affects Versions: 8.0.0 Reporter: Dragoș Moldovan-Grünfeld Fix For: 9.0.0 Add documentation around the {{lubridate}} functionality supported in {{arrow}}. Could be made up of: * a blogpost * a more in-depth piece of documentation -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (ARROW-16318) Timezone is not supported by to_duckdb()
Zsolt Kegyes-Brassai created ARROW-16318:

Summary: Timezone is not supported by to_duckdb()
Key: ARROW-16318
URL: https://issues.apache.org/jira/browse/ARROW-16318
Project: Apache Arrow
Issue Type: Bug
Affects Versions: 7.0.0
Reporter: Zsolt Kegyes-Brassai

Here is a reproducible example:

{code:java}
library(tidyverse)
library(arrow)

df1 <- tibble(time = lubridate::now(tzone = "UTC"))
str(df1)
#> tibble [1 x 1] (S3: tbl_df/tbl/data.frame)
#>  $ time: POSIXct[1:1], format: "2022-04-25 12:50:10"
write_dataset(df1, here::here("temp/df1"), format = "parquet")
open_dataset(here::here("temp/df1")) |>
  to_duckdb()
#> Error: duckdb_prepare_R: Failed to prepare query SELECT *
#> FROM "arrow_001" AS "q01"
#> WHERE (0 = 1)
#> Error: Not implemented Error: Unsupported Internal Arrow Type tsu:UTC

df2 <- tibble(time = lubridate::now())
str(df2)
#> tibble [1 x 1] (S3: tbl_df/tbl/data.frame)
#>  $ time: POSIXct[1:1], format: "2022-04-25 14:50:11"
write_dataset(df2, here::here("temp/df2"), format = "parquet")
open_dataset(here::here("temp/df2")) |>
  to_duckdb()
#> # Source:   table [?? x 1]
#> # Database: duckdb_connection
#>   time
#>
#> 1 2022-04-25 12:50:11
{code}

Timestamps without timezone information work fine.

How can one easily remove the timezone information from a {{timestamp}} type column in a parquet dataset?
[jira] [Created] (ARROW-16317) [Archery][CI] Fix possible race condition when submitting crossbow builds
Raúl Cumplido created ARROW-16317:

Summary: [Archery][CI] Fix possible race condition when submitting crossbow builds
Key: ARROW-16317
URL: https://issues.apache.org/jira/browse/ARROW-16317
Project: Apache Arrow
Issue Type: Bug
Components: Archery, Continuous Integration
Reporter: Raúl Cumplido
Fix For: 9.0.0

Sometimes, when using github-actions to submit crossbow jobs, an error like the following is raised:

{code:java}
Failed to push updated references, potentially because of credential issues:
['refs/heads/actions-1883-github-wheel-windows-cp310-amd64',
 'refs/tags/actions-1883-github-wheel-windows-cp310-amd64',
 'refs/heads/actions-1883-github-wheel-windows-cp39-amd64',
 'refs/tags/actions-1883-github-wheel-windows-cp39-amd64',
 'refs/heads/actions-1883-github-wheel-windows-cp37-amd64',
 'refs/tags/actions-1883-github-wheel-windows-cp37-amd64',
 'refs/heads/actions-1883-github-wheel-windows-cp38-amd64',
 'refs/tags/actions-1883-github-wheel-windows-cp38-amd64',
 'refs/heads/actions-1883']
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/2195038965
{code}

As discussed in this github comment ([https://github.com/apache/arrow/pull/12930#issuecomment-1103772507]):

We should remove the auto-incremented IDs entirely and use unique hashes instead, e.g. actions--github-wheel-windows-cp310-amd64 instead of actions-1883-github-wheel-windows-cp310-amd64. Then we wouldn't need to fetch the new references either, making remote crossbow builds and local submission much quicker.

The error can also be seen here: https://github.com/apache/arrow/pull/12987#issuecomment-1108516668
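The idea of replacing the auto-incremented ID with a unique identifier can be sketched roughly as below. This is only an illustration and not Archery or crossbow code: {{make_job_prefix}} and {{make_branch_name}} are hypothetical helpers, and using a truncated {{uuid4}} as the "unique hash" is an assumption.

```python
import uuid


def make_job_prefix(queue_prefix: str = "actions") -> str:
    """Build a job prefix without looking at existing remote references.

    Hypothetical helper: a random identifier replaces the auto-incremented
    number, so two concurrent submissions do not compete for the same ID.
    """
    return f"{queue_prefix}-{uuid.uuid4().hex[:10]}"


def make_branch_name(job_prefix: str, task_name: str) -> str:
    """Combine the job prefix and the task name into a branch/tag name."""
    return f"{job_prefix}-{task_name}"


job = make_job_prefix()
print(make_branch_name(job, "github-wheel-windows-cp310-amd64"))
```

Because the name no longer depends on the latest ID present on the remote, a submitter would not have to fetch references first, which is the speed-up mentioned in the ticket. Collisions are still possible in principle with a truncated random value, only very unlikely.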
[jira] [Created] (ARROW-16316) How to round the timestamps in a mutate statement?
Zsolt Kegyes-Brassai created ARROW-16316:

Summary: How to round the timestamps in a mutate statement?
Key: ARROW-16316
URL: https://issues.apache.org/jira/browse/ARROW-16316
Project: Apache Arrow
Issue Type: Wish
Affects Versions: 7.0.0
Reporter: Zsolt Kegyes-Brassai

I was trying to aggregate over time using different granularities. Usually I would use {{lubridate::floor_date()}}, which is currently not supported for parquet datasets.

Is there a comprehensive list of the currently supported {{lubridate}} (or {{dplyr}}) verbs? Maybe it's my own fault, but apart from the changelog I haven't found any relevant information.

Later I found that the {{round_temporal()}} function is exposed to {{R}}. But I am struggling to find the right syntax inside a mutate statement to apply it to a {{timestamp[us, tz=UTC]}} type column.

{code:java}
new_dataset |>
  mutate(time = arrow_round_temporal(time))
#> Error: Invalid: Attempted to initialize KernelState from null FunctionOptions
{code}

Here are some other attempts:

{code:java}
library(arrow)

arrow_now <- Scalar$create(lubridate::now())
(arrow_now)
#> Scalar
#> 2022-04-25 11:44:33.805609

call_function("round_temporal", arrow_now)
#> Scalar
#> 2022-04-25 00:00:00.00

call_function("round_temporal", arrow_now, unit = "day")
#> Error: Argument 2 is of class character but it must be one of "Array", "ChunkedArray", "RecordBatch", "Table", or "Scalar"

arrow_unit <- Scalar$create("day")
(arrow_unit)
#> Scalar
#> day

call_function("round_temporal", arrow_now, unit = arrow_unit)
#> Error: Invalid: Function 'round_temporal' accepts 1 arguments but attempted to look up kernel(s) with 2
{code}
[jira] [Created] (ARROW-16315) [Python] Cython api test fails with allocation error on windows
Krisztian Szucs created ARROW-16315: --- Summary: [Python] Cython api test fails with allocation error on windows Key: ARROW-16315 URL: https://issues.apache.org/jira/browse/ARROW-16315 Project: Apache Arrow Issue Type: Bug Components: Python Reporter: Krisztian Szucs Fix For: 9.0.0 Getting memory pool deallocation errors https://github.com/ursacomputing/crossbow/runs/6154173225?check_suite_focus=true#step:6:33401 -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (ARROW-16314) [Python][CI] Skip running cython tests in windows verification builds
Krisztian Szucs created ARROW-16314: --- Summary: [Python][CI] Skip running cython tests in windows verification builds Key: ARROW-16314 URL: https://issues.apache.org/jira/browse/ARROW-16314 Project: Apache Arrow Issue Type: Bug Components: Continuous Integration, Python Reporter: Krisztian Szucs Getting memory pool errors https://github.com/ursacomputing/crossbow/runs/6154173225?check_suite_focus=true#step:6:33401 -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (ARROW-16313) [R] Uninitialized options for assume_timezone kernel
Antoine Pitrou created ARROW-16313: -- Summary: [R] Uninitialized options for assume_timezone kernel Key: ARROW-16313 URL: https://issues.apache.org/jira/browse/ARROW-16313 Project: Apache Arrow Issue Type: Bug Components: R Affects Versions: 7.0.0 Reporter: Antoine Pitrou Assignee: Antoine Pitrou Fix For: 8.0.0 Found by R sanitizer builds: {code} 2022-04-24T21:14:42.6518411Z Start test: `as_datetime()` 2022-04-24T21:14:42.6519247Z /usr/bin/../include/c++/v1/memory:2113:18: runtime error: load of value 16, which is not a valid value for type 'arrow::compute::AssumeTimezoneOptions::Ambiguous' 2022-04-24T21:14:42.6521313Z #0 0x7fc8ecdafbb1 in std::__1::__compressed_pair_elem::__compressed_pair_elem, std::__1::allocator >&&, arrow::compute::AssumeTimezoneOptions::Ambiguous&, arrow::compute::AssumeTimezoneOptions::Nonexistent&, 0ul, 1ul, 2ul>(std::__1::piecewise_construct_t, std::__1::tuple, std::__1::allocator >&&, arrow::compute::AssumeTimezoneOptions::Ambiguous&, arrow::compute::AssumeTimezoneOptions::Nonexistent&>, std::__1::__tuple_indices<0ul, 1ul, 2ul>) /usr/bin/../include/c++/v1/memory:2113:18 2022-04-24T21:14:42.6526869Z #1 0x7fc8ecdaf027 in std::__1::__compressed_pair, arrow::compute::AssumeTimezoneOptions>::__compressed_pair&, std::__1::basic_string, std::__1::allocator >&&, arrow::compute::AssumeTimezoneOptions::Ambiguous&, arrow::compute::AssumeTimezoneOptions::Nonexistent&>(std::__1::piecewise_construct_t, std::__1::tuple&>, std::__1::tuple, std::__1::allocator >&&, arrow::compute::AssumeTimezoneOptions::Ambiguous&, arrow::compute::AssumeTimezoneOptions::Nonexistent&>) /usr/bin/../include/c++/v1/memory:2197:9 2022-04-24T21:14:42.6530083Z #2 0x7fc8ecdae891 in std::__1::__shared_ptr_emplace >::__shared_ptr_emplace, std::__1::allocator >, arrow::compute::AssumeTimezoneOptions::Ambiguous&, arrow::compute::AssumeTimezoneOptions::Nonexistent&>(std::__1::allocator, std::__1::basic_string, std::__1::allocator >&&, 
arrow::compute::AssumeTimezoneOptions::Ambiguous&, arrow::compute::AssumeTimezoneOptions::Nonexistent&) /usr/bin/../include/c++/v1/memory:3470:16 2022-04-24T21:14:42.6533732Z #3 0x7fc8ecd81777 in std::__1::enable_if::value), std::__1::shared_ptr >::type std::__1::make_shared, std::__1::allocator >, arrow::compute::AssumeTimezoneOptions::Ambiguous&, arrow::compute::AssumeTimezoneOptions::Nonexistent&>(std::__1::basic_string, std::__1::allocator >&&, arrow::compute::AssumeTimezoneOptions::Ambiguous&, arrow::compute::AssumeTimezoneOptions::Nonexistent&) /usr/bin/../include/c++/v1/memory:4291:26 2022-04-24T21:14:42.6535804Z #4 0x7fc8ecd6f6de in make_compute_options(std::__1::basic_string, std::__1::allocator >, cpp11::r_vector) /tmp/RtmpHQX0ba/R.INSTALLb28193b79cc/arrow/src/compute.cpp:406:12 {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (ARROW-16312) [C++][CI] Install tzdata in the windows verification builds
Krisztian Szucs created ARROW-16312: --- Summary: [C++][CI] Install tzdata in the windows verification builds Key: ARROW-16312 URL: https://issues.apache.org/jira/browse/ARROW-16312 Project: Apache Arrow Issue Type: Improvement Components: C++, Continuous Integration Reporter: Krisztian Szucs Fix For: 8.0.0 See build log https://github.com/ursacomputing/crossbow/runs/614860?check_suite_focus=true -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (ARROW-16311) [JAVA] FlightSqlExample does not always return correct schema for CommandGetTables
Tim Van Wassenhove created ARROW-16311:

Summary: [JAVA] FlightSqlExample does not always return correct schema for CommandGetTables
Key: ARROW-16311
URL: https://issues.apache.org/jira/browse/ARROW-16311
Project: Apache Arrow
Issue Type: Bug
Components: Java
Reporter: Tim Van Wassenhove

Currently, getFlightInfoTables does not consider the "include_schema" value in CommandGetTables. This means that, when include_schema is set to false, the schema reported in the FlightInfo still contains a column that is not actually returned (the table_schema column).
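The expected behaviour can be illustrated with a small language-neutral sketch, written here in Python rather than Java. It is not the FlightSqlExample implementation: {{get_tables_columns}} is a hypothetical function, and the column names are taken from the Flight SQL definition of CommandGetTables as best understood, so treat them as an assumption.

```python
# Columns always present in a CommandGetTables result (assumed from the
# Flight SQL definition of CommandGetTables).
BASE_COLUMNS = ("catalog_name", "db_schema_name", "table_name", "table_type")


def get_tables_columns(include_schema: bool) -> tuple:
    """Return the column names the reported schema should contain.

    Hypothetical helper: table_schema is only advertised when the client
    asked for it, so the reported schema matches the data actually sent.
    """
    if include_schema:
        return BASE_COLUMNS + ("table_schema",)
    return BASE_COLUMNS


print(get_tables_columns(include_schema=False))
print(get_tables_columns(include_schema=True))
```

The point is simply that the schema returned by getFlightInfoTables and the schema of the streamed data should be derived from the same include_schema flag.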
[jira] [Created] (ARROW-16310) [R] test-fedora-r-clang-sanitizer job fails - possible tzdb installation issue
Nicola Crane created ARROW-16310: Summary: [R] test-fedora-r-clang-sanitizer job fails - possible tzdb installation issue Key: ARROW-16310 URL: https://issues.apache.org/jira/browse/ARROW-16310 Project: Apache Arrow Issue Type: Bug Reporter: Nicola Crane We're seeing an error on a sanitizer build for https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=23988=logs=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb=d9b15392-e4ce-5e4c-0c8c-b69645229181=3034 I think it's something to do with tzdb installation: {code:java} make: Target 'all' not remade because of errors. * installing *source* package ‘tzdb’ ... ** package ‘tzdb’ successfully unpacked and MD5 sums checked ** using staged installation make[1]: *** [/opt/R-devel/lib64/R/etc/Makeconf:178: api.o] Error 1 make[1]: Leaving directory '/tmp/Rtmp0aqclz/R.INSTALL51cc14b8c441/tzdb/src' ERROR: compilation failed for package ‘tzdb’ * removing ‘/opt/R-devel/lib64/R/library/tzdb’ The downloaded source packages are in ‘/tmp/Rtmpg6gyGy/downloaded_packages’ Updating HTML index of packages in '.Library' Making 'packages.html' ... done Warning messages: 1: package ‘’ is not available for this version of R A version of this package for your version of R might be available elsewhere, see the ideas at https://cran.r-project.org/doc/manuals/r-devel/R-admin.html#Installing-packages 2: In i.p(...) : installation of one or more packages failed, probably ‘tzdb’ > > / + popd {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (ARROW-16309) [CI] [Go] [Flight] Verify release jobs are failing due to: panic: rpc error: code = NotFound desc = Unknown descriptor
Raúl Cumplido created ARROW-16309: - Summary: [CI] [Go] [Flight] Verify release jobs are failing due to: panic: rpc error: code = NotFound desc = Unknown descriptor Key: ARROW-16309 URL: https://issues.apache.org/jira/browse/ARROW-16309 Project: Apache Arrow Issue Type: Bug Components: Continuous Integration, FlightRPC, Go Reporter: Raúl Cumplido Fix For: 8.0.0 There are two verify release jobs (verify-rc-source-integration-linux-almalinux-8-amd64 and verify-rc-source-integration-linux-ubuntu-22.04-amd64) that are failing with the following error: Testing file extension == Traceback (most recent call last): # FAILURES # File "/arrow/dev/archery/archery/integration/util.py", line 139, in run_cmd output = subprocess.check_output(cmd, stderr=subprocess.STDOUT) File "/usr/lib/python3.10/subprocess.py", line 420, in check_output return run(*popenargs, stdout=PIPE, timeout=timeout, check=True, File "/usr/lib/python3.10/subprocess.py", line 524, in run raise CalledProcessError(retcode, process.args, subprocess.CalledProcessError: Command '['/tmp/arrow-HEAD.oA0f2/go/gopath/bin/arrow-flight-integration-client', '-host', 'localhost', '-port=42937', '-path', '/tmp/arrow-integration-n3a4l1n_/generated_primitive_no_batches.json']' returned non-zero exit status 2. 
During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/arrow/dev/archery/archery/integration/runner.py", line 379, in _run_flight_test_case consumer.flight_request(port, **client_args) File "/arrow/dev/archery/archery/integration/tester_go.py", line 121, in flight_request run_cmd(cmd) File "/arrow/dev/archery/archery/integration/util.py", line 148, in run_cmd raise RuntimeError(sio.getvalue()) RuntimeError: Command failed: /tmp/arrow-HEAD.oA0f2/go/gopath/bin/arrow-flight-integration-client -host localhost -port=42937 -path /tmp/arrow-integration-n3a4l1n_/generated_primitive_no_batches.json With output: -- Opening JSON file ' /tmp/arrow-integration-n3a4l1n_/generated_primitive_no_batches.json ' Opening JSON file ' /tmp/arrow-integration-n3a4l1n_/generated_primitive_no_batches.json ' Opening JSON file ' /tmp/arrow-integration-n3a4l1n_/generated_primitive_no_batches.json ' panic: rpc error: code = NotFound desc = Unknown descriptor. goroutine 1 [running]: main.main() /arrow/go/arrow/internal/flight_integration/cmd/arrow-flight-integration-client/main.go:52 +0x31a See the job failures: [https://github.com/ursacomputing/crossbow/runs/6147131844?check_suite_focus=true] [https://github.com/ursacomputing/crossbow/runs/6147124267?check_suite_focus=true] -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (ARROW-16308) [CI] Upgrade windows runner version as windows-2016 is deprecated.
Jacob Wujciak-Jens created ARROW-16308: -- Summary: [CI] Upgrade windows runner version as windows-2016 is deprecated. Key: ARROW-16308 URL: https://issues.apache.org/jira/browse/ARROW-16308 Project: Apache Arrow Issue Type: Task Reporter: Jacob Wujciak-Jens Assignee: Jacob Wujciak-Jens Fix For: 8.0.0 "The windows-2016 environment is deprecated and will be removed on April 1st, 2022" So we need to upgrade all runners to at least windows-2019 (or 2022/latest ?) -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (ARROW-16307) [CI][Java][Flight] Verify release candidate fails on org.apache.arrow.flight.TestFlightService
Raúl Cumplido created ARROW-16307: - Summary: [CI][Java][Flight] Verify release candidate fails on org.apache.arrow.flight.TestFlightService Key: ARROW-16307 URL: https://issues.apache.org/jira/browse/ARROW-16307 Project: Apache Arrow Issue Type: Bug Components: Continuous Integration, FlightRPC, Java Reporter: Raúl Cumplido Fix For: 8.0.0 Currently our nightly verify release is failing on verify-rc-source-java-macos-amd64 with the following error (https://github.com/ursacomputing/crossbow/runs/6147103836?check_suite_focus=true): {code:java} Warning: Tests run: 6, Failures: 0, Errors: 0, Skipped: 4, Time elapsed: 0.021 s - in org.apache.arrow.flight.TestApplicationMetadata [INFO] Running org.apache.arrow.flight.TestFlightService [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0 s - in org.apache.arrow.flight.TestFlightService [INFO] [INFO] Results: [INFO] Error: Errors: Error: TestDoExchange.tearDown:74 » IllegalState Memory was leaked by query. Memory l... [INFO] Error: Tests run: 108, Failures: 0, Errors: 1, Skipped: 19 [INFO] [INFO] [INFO] Reactor Summary for Apache Arrow Java Root POM 8.0.0-SNAPSHOT: [INFO] [INFO] Apache Arrow Java Root POM . SUCCESS [ 6.956 s] [INFO] Arrow Format ... SUCCESS [ 1.402 s] [INFO] Arrow Memory ... SUCCESS [ 1.021 s] [INFO] Arrow Memory - Core SUCCESS [ 5.303 s] [INFO] Arrow Memory - Unsafe .. SUCCESS [ 3.614 s] [INFO] Arrow Memory - Netty ... SUCCESS [ 3.989 s] [INFO] Arrow Vectors .. SUCCESS [ 29.436 s] [INFO] Arrow Compression .. SUCCESS [ 4.271 s] [INFO] Arrow Tools SUCCESS [ 22.850 s] [INFO] Arrow JDBC Adapter . SUCCESS [ 9.101 s] [INFO] Arrow Plasma Client SUCCESS [ 1.068 s] [INFO] Arrow Flight ... SUCCESS [ 1.072 s] [INFO] Arrow Flight Core .. FAILURE [ 37.941 s] [INFO] Arrow Flight GRPC .. SKIPPED [INFO] Arrow Flight SQL ... SKIPPED [INFO] Arrow Flight Integration Tests . SKIPPED [INFO] Arrow AVRO Adapter . SKIPPED [INFO] Arrow Algorithms ... SKIPPED [INFO] Arrow Performance Benchmarks ... 
SKIPPED [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 02:08 min [INFO] Finished at: 2022-04-24T14:17:58Z [INFO] Error: Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M3:test (default-test) on project flight-core: There are test failures. Error: Error: Please refer to /Users/runner/work/crossbow/crossbow/arrow/java/flight/flight-core/target/surefire-reports for the individual test results. Error: Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream. Error: -> [Help 1] Error: Error: To see the full stack trace of the errors, re-run Maven with the -e switch. Error: Re-run Maven using the -X switch to enable full debug logging. Error: Error: For more information about the errors and possible solutions, please read the following articles: Error: [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException Error: Error: After correcting the problems, you can resume the build with the command Error: mvn -rf :flight-core Failed to verify release candidate. See /var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/arrow-HEAD.X.q7EbeuIy for details. {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (ARROW-16306) [CI] Nightly verify rc on ubuntu is failing due to setuptools scm unable to find version
Raúl Cumplido created ARROW-16306: - Summary: [CI] Nightly verify rc on ubuntu is failing due to setuptools scm unable to find version Key: ARROW-16306 URL: https://issues.apache.org/jira/browse/ARROW-16306 Project: Apache Arrow Issue Type: Bug Components: Continuous Integration Reporter: Raúl Cumplido Assignee: Raúl Cumplido Fix For: 8.0.0 These current jobs: - verify-rc-source-python-linux-ubuntu-18.04-amd64: - verify-rc-source-python-linux-ubuntu-20.04-amd64: Are failing due to: {code:java} Traceback (most recent call last): File "setup.py", line 607, in setup( File "/tmp/arrow-HEAD.7Wo1N/venv-source/lib/python3.8/site-packages/setuptools/__init__.py", line 129, in setup return distutils.core.setup(**attrs) File "/usr/lib/python3.8/distutils/core.py", line 108, in setup _setup_distribution = dist = klass(attrs) File "/tmp/arrow-HEAD.7Wo1N/venv-source/lib/python3.8/site-packages/setuptools/dist.py", line 372, in __init__ _Distribution.__init__(self, attrs) File "/usr/lib/python3.8/distutils/dist.py", line 292, in __init__ self.finalize_options() File "/tmp/arrow-HEAD.7Wo1N/venv-source/lib/python3.8/site-packages/setuptools/dist.py", line 528, in finalize_options ep.load()(self, ep.name, value) File "/tmp/arrow-HEAD.7Wo1N/venv-source/lib/python3.8/site-packages/setuptools_scm/integration.py", line 75, in version_keyword _assign_version(dist, config) File "/tmp/arrow-HEAD.7Wo1N/venv-source/lib/python3.8/site-packages/setuptools_scm/integration.py", line 51, in _assign_version _version_missing(config) File "/tmp/arrow-HEAD.7Wo1N/venv-source/lib/python3.8/site-packages/setuptools_scm/__init__.py", line 106, in _version_missing raise LookupError( LookupError: setuptools-scm was unable to detect version for /arrow.Make sure you're either building from a fully intact git repository or PyPI tarballs. 
Most other sources (such as GitHub's tarballs, a git checkout without the .git folder) don't contain the necessary metadata and will not work.For example, if you're using pip, instead of https://github.com/user/proj/archive/master.zip use git+https://github.com/user/proj.git#egg=proj Failed to verify release candidate. See /tmp/arrow-HEAD.7Wo1N for details. 1 Error: `docker-compose --file /home/runner/work/crossbow/crossbow/arrow/docker-compose.yml run --rm -e VERIFY_VERSION= -e VERIFY_RC= -e TEST_DEFAULT=0 -e TEST_PYTHON=1 ubuntu-verify-rc` exited with a non-zero exit code 1, see the process log above.The docker-compose command was invoked with the following parameters:{code} This was fixed for the verify-conda-rc but the ubuntu ones have started failing. -- This message was sent by Atlassian Jira (v8.20.7#820007)