[jira] [Created] (ARROW-16328) [Java] POC Arrow Modular (format module for example)

2022-04-25 Thread David Dali Susanibar Arce (Jira)
David Dali Susanibar Arce created ARROW-16328:
-

 Summary: [Java] POC Arrow Modular (format module for example)
 Key: ARROW-16328
 URL: https://issues.apache.org/jira/browse/ARROW-16328
 Project: Apache Arrow
  Issue Type: Sub-task
Reporter: David Dali Susanibar Arce


POC to move the Arrow Java module to Single/Multi-module mode.

Currently, Arrow Java targets JSE 1.8 and can be consumed by JSE 11, 17, and 18 
in *legacy mode*, which is enabled when the compilation environment 
is defined by the {{--source}} and {{--target}} options.

This POC is to validate the changes needed in case Arrow Java decides to implement 
"*Single module mode*" or "*Multi-module mode*".



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (ARROW-16327) [Java][CI]: Add support for Java 17 CI process

2022-04-25 Thread David Dali Susanibar Arce (Jira)
David Dali Susanibar Arce created ARROW-16327:
-

 Summary: [Java][CI]: Add support for Java 17 CI process
 Key: ARROW-16327
 URL: https://issues.apache.org/jira/browse/ARROW-16327
 Project: Apache Arrow
  Issue Type: Sub-task
  Components: Java
Affects Versions: 9.0.0
Reporter: David Dali Susanibar Arce


Currently the Arrow Java code is tested with JSE11.

This ticket is for planning/mapping the activities involved in also offering 
support for JSE17.





[jira] [Created] (ARROW-16326) [C++][Python] Add GCS Timeout parameter for GCS FileSystem.

2022-04-25 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-16326:
---

 Summary: [C++][Python] Add GCS Timeout parameter for GCS 
FileSystem.
 Key: ARROW-16326
 URL: https://issues.apache.org/jira/browse/ARROW-16326
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, Python
Reporter: Micah Kornfield
Assignee: Micah Kornfield


Follow-up from [https://github.com/apache/arrow/pull/12763]: if the GCS testbench 
isn't installed properly, the failure mode is test timeouts because the 
connection hangs. We should add a timeout parameter to prevent this.
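
The failure mode described above (a connection that is accepted but never answered) can be illustrated with a minimal Python sketch that uses only the standard library. This is not the GCS filesystem API: `read_with_timeout` and the silent local server are hypothetical stand-ins, meant only to show how a timeout turns a hang into a fast, explicit error.

```python
import socket


def read_with_timeout(host, port, timeout_seconds):
    """Connect and read one byte, giving up after timeout_seconds.

    Hypothetical helper for illustration; not part of Arrow or GCS.
    """
    with socket.create_connection((host, port), timeout=timeout_seconds) as conn:
        # The timeout passed to create_connection also applies to recv().
        return conn.recv(1)


def demo():
    # A local server that completes the TCP handshake (via the listen
    # backlog) but never sends anything, mimicking a broken testbench.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]
    try:
        read_with_timeout("127.0.0.1", port, timeout_seconds=0.2)
        return "no timeout"
    except socket.timeout:
        return "timed out"
    finally:
        server.close()
```

Without the timeout argument, the `recv()` call in this sketch would block indefinitely, which is the behavior the tests currently run into.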





[jira] [Created] (ARROW-16325) [R] Add task for R package with gcc12

2022-04-25 Thread Dewey Dunnington (Jira)
Dewey Dunnington created ARROW-16325:


 Summary: [R] Add task for R package with gcc12
 Key: ARROW-16325
 URL: https://issues.apache.org/jira/browse/ARROW-16325
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: Dewey Dunnington


We now have a check for gcc11; however, gcc11 has been the default on 
debian/testing for some time. The CRAN debian image now uses gcc12, so we 
should update the gcc11 task to use gcc12 here: 
https://github.com/apache/arrow/blob/0e03af446c328d0ef963510c3292cb14e092b917/dev/tasks/tasks.yml#L1319-L1328





[jira] [Created] (ARROW-16323) [Go] Implement Dictionary Scalars

2022-04-25 Thread Matthew Topol (Jira)
Matthew Topol created ARROW-16323:
-

 Summary: [Go] Implement Dictionary Scalars
 Key: ARROW-16323
 URL: https://issues.apache.org/jira/browse/ARROW-16323
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Go
Reporter: Matthew Topol








[jira] [Created] (ARROW-16324) [Go] Implement Dictionary Unification

2022-04-25 Thread Matthew Topol (Jira)
Matthew Topol created ARROW-16324:
-

 Summary: [Go] Implement Dictionary Unification
 Key: ARROW-16324
 URL: https://issues.apache.org/jira/browse/ARROW-16324
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Go
Reporter: Matthew Topol








[jira] [Created] (ARROW-16322) [Release][C++] Windows source verification should not mutate source tree

2022-04-25 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-16322:
--

 Summary: [Release][C++] Windows source verification should not 
mutate source tree
 Key: ARROW-16322
 URL: https://issues.apache.org/jira/browse/ARROW-16322
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++, Continuous Integration, Developer Tools
Reporter: Antoine Pitrou


{{verify-release-candidate.bat}} creates a temporary dir for verification, but 
for some reason it still creates its CMake build dir inside the Arrow source 
tree. It should instead create the CMake build dir inside the temporary 
verification dir.
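
The intended layout can be sketched in Python as follows. This is only an illustration of the idea, not the batch script itself; the directory name `cpp-build` and both helper functions are hypothetical.

```python
import os
import tempfile


def make_build_dir(verification_root):
    """Create and return a build dir under the temporary verification dir.

    The name "cpp-build" is a hypothetical choice for this sketch.
    """
    build_dir = os.path.join(verification_root, "cpp-build")
    os.makedirs(build_dir, exist_ok=True)
    return build_dir


def is_inside(path, parent):
    """Return True if path is located under parent."""
    path = os.path.realpath(path)
    parent = os.path.realpath(parent)
    return os.path.commonpath([path, parent]) == parent


def demo():
    # Two temporary directories stand in for the source tree and the
    # temporary verification dir.
    with tempfile.TemporaryDirectory() as source_tree, \
            tempfile.TemporaryDirectory() as verification_root:
        build_dir = make_build_dir(verification_root)
        return (is_inside(build_dir, verification_root),
                is_inside(build_dir, source_tree))
```

With this layout the build dir lives under the verification dir and the source tree is left untouched.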





[jira] [Created] (ARROW-16321) [Release][C++] Windows source verification hardcodes Visual Studio version

2022-04-25 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-16321:
--

 Summary: [Release][C++] Windows source verification hardcodes 
Visual Studio version
 Key: ARROW-16321
 URL: https://issues.apache.org/jira/browse/ARROW-16321
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, Continuous Integration, Developer Tools
Reporter: Antoine Pitrou


{{verify-release-candidate.bat}} currently hardcodes the CMake generator to 
"Visual Studio 16 2019". Ideally it should be able to use any Visual Studio 
version present on the system.
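
One possible selection strategy, sketched in Python rather than in the batch script: map the installed Visual Studio major versions to the corresponding CMake generator names and pick the newest one, while still allowing an explicit override. How the installed versions are detected is outside the scope of this sketch; `pick_generator` and its arguments are hypothetical.

```python
# CMake generator names for recent Visual Studio releases.
GENERATORS = {
    15: "Visual Studio 15 2017",
    16: "Visual Studio 16 2019",
    17: "Visual Studio 17 2022",
}


def pick_generator(installed_major_versions, override=None):
    """Return a CMake generator name for the newest known VS version.

    installed_major_versions is assumed to be supplied by the caller
    (hypothetical input); override lets the user force a generator.
    """
    if override:
        return override
    known = [v for v in installed_major_versions if v in GENERATORS]
    if not known:
        raise RuntimeError("no supported Visual Studio version found")
    return GENERATORS[max(known)]
```

For example, `pick_generator([15, 17])` selects the 2022 generator, and passing `override` keeps the current hardcoded behavior available.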






[jira] [Created] (ARROW-16320) Dataset re-partitioning consumes considerable amount of memory

2022-04-25 Thread Zsolt Kegyes-Brassai (Jira)
Zsolt Kegyes-Brassai created ARROW-16320:


 Summary: Dataset re-partitioning consumes considerable amount of 
memory
 Key: ARROW-16320
 URL: https://issues.apache.org/jira/browse/ARROW-16320
 Project: Apache Arrow
  Issue Type: Improvement
Affects Versions: 7.0.0
Reporter: Zsolt Kegyes-Brassai


A short background: I was trying to create a dataset from a big pile of csv 
files (a couple of hundred). In the first step the csv files were parsed and saved as 
parquet files, because there were many inconsistencies between the csv files. In a 
subsequent step the dataset was re-partitioned using one column (code_key).

 
{code:java}
new_dataset <- open_dataset(
  temp_parquet_folder, 
  format = "parquet",
  unify_schemas = TRUE
  )
new_dataset |> 
  group_by(code_key) |> 
  write_dataset(
    folder_repartitioned_dataset, 
    format = "parquet"
  )
{code}
 

This re-partitioning consumed a considerable amount of memory (5 GB). 
 * Is this normal behavior, or a bug?
 * Is there any rule of thumb for estimating the memory requirement of a dataset 
re-partitioning? (This is important when scaling up this approach.)

The drawback is that this memory is not freed after the 
re-partitioning (I am using RStudio). 
Calling {{gc()}} does not help in this situation, and there is no object associated 
with the re-partitioning in the {{R}} environment that could be removed from 
memory (using the {{rm()}} function).
 * How can one regain the memory used by the re-partitioning?

The rationale behind choosing dataset re-partitioning: if my understanding 
is correct, appending is not supported in the current arrow version when 
writing parquet files/datasets. (The original csv files were partly partitioned 
according to a different variable.)

Can you recommend a better approach?





[jira] [Created] (ARROW-16319) [R] [Docs] Add a list of the lubridate functions we support in {arrow}

2022-04-25 Thread Jira
Dragoș Moldovan-Grünfeld created ARROW-16319:


 Summary: [R] [Docs] Add a list of the lubridate functions we 
support in {arrow}
 Key: ARROW-16319
 URL: https://issues.apache.org/jira/browse/ARROW-16319
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Documentation, R
Affects Versions: 8.0.0
Reporter: Dragoș Moldovan-Grünfeld
 Fix For: 9.0.0


Add documentation around the {{lubridate}} functionality supported in 
{{arrow}}. Could be made up of:
* a blogpost 
* a more in-depth piece of documentation





[jira] [Created] (ARROW-16318) Timezone is not supported by to_duckdb()

2022-04-25 Thread Zsolt Kegyes-Brassai (Jira)
Zsolt Kegyes-Brassai created ARROW-16318:


 Summary: Timezone is not supported by to_duckdb()
 Key: ARROW-16318
 URL: https://issues.apache.org/jira/browse/ARROW-16318
 Project: Apache Arrow
  Issue Type: Bug
Affects Versions: 7.0.0
Reporter: Zsolt Kegyes-Brassai


Here is a reproducible example:

 
{code:java}
library(tidyverse)
library(arrow)

df1 <- tibble(time = lubridate::now(tzone = "UTC"))
str(df1)
#> tibble [1 x 1] (S3: tbl_df/tbl/data.frame)
#>  $ time: POSIXct[1:1], format: "2022-04-25 12:50:10"
write_dataset(df1, here::here("temp/df1"), format = "parquet")
open_dataset(here::here("temp/df1")) |> 
  to_duckdb()
#> Error: duckdb_prepare_R: Failed to prepare query SELECT *
#> FROM "arrow_001" AS "q01"
#> WHERE (0 = 1)
#> Error: Not implemented Error: Unsupported Internal Arrow Type tsu:UTC

df2 <- tibble(time = lubridate::now())
str(df2)
#> tibble [1 x 1] (S3: tbl_df/tbl/data.frame)
#>  $ time: POSIXct[1:1], format: "2022-04-25 14:50:11"
write_dataset(df2, here::here("temp/df2"), format = "parquet")
open_dataset(here::here("temp/df2")) |> 
  to_duckdb()
#> # Source:   table [?? x 1]
#> # Database: duckdb_connection
#>   time               
#>   <dttm>             
#> 1 2022-04-25 12:50:11
{code}
 

Timestamps without timezone information work fine.

How can one easily remove the timezone information from a {{timestamp}}-type 
column in a parquet dataset?





[jira] [Created] (ARROW-16317) [Archery][CI] Fix possible race condition when submitting crossbow builds

2022-04-25 Thread Jira
Raúl Cumplido created ARROW-16317:
-

 Summary: [Archery][CI] Fix possible race condition when submitting 
crossbow builds
 Key: ARROW-16317
 URL: https://issues.apache.org/jira/browse/ARROW-16317
 Project: Apache Arrow
  Issue Type: Bug
  Components: Archery, Continuous Integration
Reporter: Raúl Cumplido
 Fix For: 9.0.0


Sometimes when trying to use github-actions to submit crossbow jobs an error is 
raised like:
{code:java}
Failed to push updated references, potentially because of credential issues: 
['refs/heads/actions-1883-github-wheel-windows-cp310-amd64', 
'refs/tags/actions-1883-github-wheel-windows-cp310-amd64', 
'refs/heads/actions-1883-github-wheel-windows-cp39-amd64', 
'refs/tags/actions-1883-github-wheel-windows-cp39-amd64', 
'refs/heads/actions-1883-github-wheel-windows-cp37-amd64', 
'refs/tags/actions-1883-github-wheel-windows-cp37-amd64', 
'refs/heads/actions-1883-github-wheel-windows-cp38-amd64', 
'refs/tags/actions-1883-github-wheel-windows-cp38-amd64', 
'refs/heads/actions-1883']
The Archery job run can be found at: 
https://github.com/apache/arrow/actions/runs/2195038965
{code}
As discussed in this GitHub comment 
([https://github.com/apache/arrow/pull/12930#issuecomment-1103772507]):

We should remove the auto incremented IDs entirely and use unique hashes 
instead, e.g.: actions--github-wheel-windows-cp310-amd64 instead of 
actions-1883-github-wheel-windows-cp310-amd64. Then we wouldn't need to fetch 
the new references either, making remote crossbow builds and local submission 
much quicker.
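
A minimal Python sketch of the proposed naming scheme, assuming a random UUID fragment is an acceptable "unique hash". The helper names are hypothetical and this is not how crossbow currently builds its job names.

```python
import uuid


def make_job_prefix(prefix="actions"):
    """Build a job prefix from a random identifier, not a counter.

    No remote state is needed to compute it, so two concurrent
    submissions do not compete for the same "next" ID.
    """
    return f"{prefix}-{uuid.uuid4().hex[:10]}"


def make_branch_name(job_prefix, task_name):
    """Combine the job prefix with a task name into a ref name."""
    return f"{job_prefix}-{task_name}"
```

Because the identifier is generated locally, the submitter does not have to fetch existing references first to find the next free number.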

The error can also be seen here: 
https://github.com/apache/arrow/pull/12987#issuecomment-1108516668





[jira] [Created] (ARROW-16316) How to round the timestamps in a mutate statement?

2022-04-25 Thread Zsolt Kegyes-Brassai (Jira)
Zsolt Kegyes-Brassai created ARROW-16316:


 Summary: How to round the timestamps in a mutate statement?
 Key: ARROW-16316
 URL: https://issues.apache.org/jira/browse/ARROW-16316
 Project: Apache Arrow
  Issue Type: Wish
Affects Versions: 7.0.0
Reporter: Zsolt Kegyes-Brassai


I was trying to aggregate over time using different granularities. Usually I 
would use {{lubridate::floor_date()}}, which is currently not supported 
for parquet datasets.


Is there a comprehensive list of the currently supported 
{{lubridate}} (or {{dplyr}}) verbs? Maybe it's my own fault, but apart from 
the changelog I haven't found any relevant information.

 

Later I found that the {{round_temporal()}} function is exposed to {{R}}. 
But I am struggling to find the right syntax inside a mutate statement to apply 
it to a {{timestamp[us, tz=UTC]}} type column.
{code:java}
new_dataset |>
  mutate(time = arrow_round_temporal(time))
#>  Error: Invalid: Attempted to initialize KernelState from null 
FunctionOptions
{code}
 

 

Here are some other attempts:
{code:java}
library(arrow)

arrow_now <- Scalar$create(lubridate::now())
(arrow_now)
#> Scalar
#> 2022-04-25 11:44:33.805609
call_function("round_temporal", arrow_now)
#> Scalar
#> 2022-04-25 00:00:00.00
call_function("round_temporal", arrow_now, unit = "day")
#> Error: Argument 2 is of class character but it must be one of "Array", 
"ChunkedArray", "RecordBatch", "Table", or "Scalar"
arrow_unit <- Scalar$create("day")
(arrow_unit)
#> Scalar
#> day
call_function("round_temporal", arrow_now, unit = arrow_unit)
#> Error: Invalid: Function 'round_temporal' accepts 1 arguments but attempted 
to look up kernel(s) with 2
{code}
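
For reference, the flooring semantics being asked for (what {{lubridate::floor_date()}} provides) can be sketched with the Python standard library. This only illustrates the expected result for a few units; it is not the Arrow kernel and does not answer the question about the R syntax. The function name and the supported unit strings are hypothetical.

```python
from datetime import datetime


def floor_timestamp(ts, unit):
    """Truncate a datetime down to the start of the given unit.

    Only "day", "hour" and "minute" are handled in this sketch.
    """
    if unit == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    if unit == "minute":
        return ts.replace(second=0, microsecond=0)
    raise ValueError(f"unsupported unit: {unit}")
```

Applied to the timestamp from the example above, `floor_timestamp(datetime(2022, 4, 25, 11, 44, 33, 805609), "day")` gives midnight of 2022-04-25.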
 





[jira] [Created] (ARROW-16315) [Python] Cython api test fails with allocation error on windows

2022-04-25 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-16315:
---

 Summary: [Python] Cython api test fails with allocation error on 
windows
 Key: ARROW-16315
 URL: https://issues.apache.org/jira/browse/ARROW-16315
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Reporter: Krisztian Szucs
 Fix For: 9.0.0


Getting memory pool deallocation errors 
https://github.com/ursacomputing/crossbow/runs/6154173225?check_suite_focus=true#step:6:33401





[jira] [Created] (ARROW-16314) [Python][CI] Skip running cython tests in windows verification builds

2022-04-25 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-16314:
---

 Summary: [Python][CI] Skip running cython tests in windows 
verification builds
 Key: ARROW-16314
 URL: https://issues.apache.org/jira/browse/ARROW-16314
 Project: Apache Arrow
  Issue Type: Bug
  Components: Continuous Integration, Python
Reporter: Krisztian Szucs


Getting memory pool errors 
https://github.com/ursacomputing/crossbow/runs/6154173225?check_suite_focus=true#step:6:33401






[jira] [Created] (ARROW-16313) [R] Uninitialized options for assume_timezone kernel

2022-04-25 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-16313:
--

 Summary: [R] Uninitialized options for assume_timezone kernel
 Key: ARROW-16313
 URL: https://issues.apache.org/jira/browse/ARROW-16313
 Project: Apache Arrow
  Issue Type: Bug
  Components: R
Affects Versions: 7.0.0
Reporter: Antoine Pitrou
Assignee: Antoine Pitrou
 Fix For: 8.0.0


Found by R sanitizer builds:
{code}

2022-04-24T21:14:42.6518411Z Start test: `as_datetime()`
2022-04-24T21:14:42.6519247Z /usr/bin/../include/c++/v1/memory:2113:18: runtime 
error: load of value 16, which is not a valid value for type 
'arrow::compute::AssumeTimezoneOptions::Ambiguous'
2022-04-24T21:14:42.6521313Z #0 0x7fc8ecdafbb1 in 
std::__1::__compressed_pair_elem::__compressed_pair_elem, std::__1::allocator >&&, 
arrow::compute::AssumeTimezoneOptions::Ambiguous&, 
arrow::compute::AssumeTimezoneOptions::Nonexistent&, 0ul, 1ul, 
2ul>(std::__1::piecewise_construct_t, 
std::__1::tuple, 
std::__1::allocator >&&, 
arrow::compute::AssumeTimezoneOptions::Ambiguous&, 
arrow::compute::AssumeTimezoneOptions::Nonexistent&>, 
std::__1::__tuple_indices<0ul, 1ul, 2ul>) 
/usr/bin/../include/c++/v1/memory:2113:18
2022-04-24T21:14:42.6526869Z #1 0x7fc8ecdaf027 in 
std::__1::__compressed_pair,
 
arrow::compute::AssumeTimezoneOptions>::__compressed_pair&,
 std::__1::basic_string, 
std::__1::allocator >&&, 
arrow::compute::AssumeTimezoneOptions::Ambiguous&, 
arrow::compute::AssumeTimezoneOptions::Nonexistent&>(std::__1::piecewise_construct_t,
 std::__1::tuple&>, 
std::__1::tuple, 
std::__1::allocator >&&, 
arrow::compute::AssumeTimezoneOptions::Ambiguous&, 
arrow::compute::AssumeTimezoneOptions::Nonexistent&>) 
/usr/bin/../include/c++/v1/memory:2197:9
2022-04-24T21:14:42.6530083Z #2 0x7fc8ecdae891 in 
std::__1::__shared_ptr_emplace 
>::__shared_ptr_emplace, std::__1::allocator >, 
arrow::compute::AssumeTimezoneOptions::Ambiguous&, 
arrow::compute::AssumeTimezoneOptions::Nonexistent&>(std::__1::allocator,
 std::__1::basic_string, 
std::__1::allocator >&&, 
arrow::compute::AssumeTimezoneOptions::Ambiguous&, 
arrow::compute::AssumeTimezoneOptions::Nonexistent&) 
/usr/bin/../include/c++/v1/memory:3470:16
2022-04-24T21:14:42.6533732Z #3 0x7fc8ecd81777 in 
std::__1::enable_if::value), 
std::__1::shared_ptr >::type 
std::__1::make_shared, 
std::__1::allocator >, arrow::compute::AssumeTimezoneOptions::Ambiguous&, 
arrow::compute::AssumeTimezoneOptions::Nonexistent&>(std::__1::basic_string, std::__1::allocator >&&, 
arrow::compute::AssumeTimezoneOptions::Ambiguous&, 
arrow::compute::AssumeTimezoneOptions::Nonexistent&) 
/usr/bin/../include/c++/v1/memory:4291:26
2022-04-24T21:14:42.6535804Z #4 0x7fc8ecd6f6de in 
make_compute_options(std::__1::basic_string, 
std::__1::allocator >, cpp11::r_vector) 
/tmp/RtmpHQX0ba/R.INSTALLb28193b79cc/arrow/src/compute.cpp:406:12
{code}






[jira] [Created] (ARROW-16312) [C++][CI] Install tzdata in the windows verification builds

2022-04-25 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-16312:
---

 Summary: [C++][CI] Install tzdata in the windows verification 
builds
 Key: ARROW-16312
 URL: https://issues.apache.org/jira/browse/ARROW-16312
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, Continuous Integration
Reporter: Krisztian Szucs
 Fix For: 8.0.0


See build log 
https://github.com/ursacomputing/crossbow/runs/614860?check_suite_focus=true





[jira] [Created] (ARROW-16311) [JAVA] FlightSqlExample does not always return correct schema for CommandGetTables

2022-04-25 Thread Tim Van Wassenhove (Jira)
Tim Van Wassenhove created ARROW-16311:
--

 Summary: [JAVA] FlightSqlExample does not always return correct 
schema for CommandGetTables
 Key: ARROW-16311
 URL: https://issues.apache.org/jira/browse/ARROW-16311
 Project: Apache Arrow
  Issue Type: Bug
  Components: Java
Reporter: Tim Van Wassenhove


Currently, getFlightInfoTables does not consider the "include_schema" value in 
CommandGetTables.

 

This means that, when include_schema is set to false, the schema returned by 
getFlightInfoTables still contains a column (table_schema) that is not present 
in the returned data.
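
The expected behavior can be sketched in Python (not the Java of FlightSqlExample). The column names below follow my understanding of the Flight SQL result layout for CommandGetTables and should be treated as an assumption; `get_tables_columns` is a hypothetical helper.

```python
# Columns assumed to be always present in a GetTables result.
BASE_COLUMNS = ["catalog_name", "db_schema_name", "table_name", "table_type"]


def get_tables_columns(include_schema):
    """Return the column names the advertised schema should contain.

    The table_schema column is only advertised when the request asks
    for it, so the schema matches the data that is actually returned.
    """
    columns = list(BASE_COLUMNS)
    if include_schema:
        columns.append("table_schema")
    return columns
```

With `include_schema=False` the advertised schema then has four columns, matching the data that is actually sent.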





[jira] [Created] (ARROW-16310) [R] test-fedora-r-clang-sanitizer job fails - possible tzdb installation issue

2022-04-25 Thread Nicola Crane (Jira)
Nicola Crane created ARROW-16310:


 Summary: [R] test-fedora-r-clang-sanitizer job fails - possible 
tzdb installation issue
 Key: ARROW-16310
 URL: https://issues.apache.org/jira/browse/ARROW-16310
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Nicola Crane


We're seeing an error on a sanitizer build (test-fedora-r-clang-sanitizer):

https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=23988&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181&l=3034

I think it's something to do with tzdb installation:


{code:java}
make: Target 'all' not remade because of errors.
* installing *source* package ‘tzdb’ ...
** package ‘tzdb’ successfully unpacked and MD5 sums checked
** using staged installation
make[1]: *** [/opt/R-devel/lib64/R/etc/Makeconf:178: api.o] Error 1
make[1]: Leaving directory '/tmp/Rtmp0aqclz/R.INSTALL51cc14b8c441/tzdb/src'
ERROR: compilation failed for package ‘tzdb’
* removing ‘/opt/R-devel/lib64/R/library/tzdb’

The downloaded source packages are in
‘/tmp/Rtmpg6gyGy/downloaded_packages’
Updating HTML index of packages in '.Library'
Making 'packages.html' ... done
Warning messages:
1: package ‘’ is not available for this version of R

A version of this package for your version of R might be available elsewhere,
see the ideas at
https://cran.r-project.org/doc/manuals/r-devel/R-admin.html#Installing-packages 
2: In i.p(...) : installation of one or more packages failed,
  probably ‘tzdb’
> 
> 
/
+ popd

{code}






[jira] [Created] (ARROW-16309) [CI] [Go] [Flight] Verify release jobs are failing due to: panic: rpc error: code = NotFound desc = Unknown descriptor

2022-04-25 Thread Jira
Raúl Cumplido created ARROW-16309:
-

 Summary: [CI] [Go] [Flight] Verify release jobs are failing due 
to: panic: rpc error: code = NotFound desc = Unknown descriptor
 Key: ARROW-16309
 URL: https://issues.apache.org/jira/browse/ARROW-16309
 Project: Apache Arrow
  Issue Type: Bug
  Components: Continuous Integration, FlightRPC, Go
Reporter: Raúl Cumplido
 Fix For: 8.0.0


There are two verify release jobs 
(verify-rc-source-integration-linux-almalinux-8-amd64 and 
verify-rc-source-integration-linux-ubuntu-22.04-amd64) that are failing with 
the following error:
 Testing file extension
==
Traceback (most recent call last):
# FAILURES #
  File "/arrow/dev/archery/archery/integration/util.py", line 139, in run_cmd
output = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
  File "/usr/lib/python3.10/subprocess.py", line 420, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.10/subprocess.py", line 524, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 
'['/tmp/arrow-HEAD.oA0f2/go/gopath/bin/arrow-flight-integration-client', 
'-host', 'localhost', '-port=42937', '-path', 
'/tmp/arrow-integration-n3a4l1n_/generated_primitive_no_batches.json']' 
returned non-zero exit status 2.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/arrow/dev/archery/archery/integration/runner.py", line 379, in 
_run_flight_test_case
consumer.flight_request(port, **client_args)
  File "/arrow/dev/archery/archery/integration/tester_go.py", line 121, in 
flight_request
run_cmd(cmd)
  File "/arrow/dev/archery/archery/integration/util.py", line 148, in run_cmd
raise RuntimeError(sio.getvalue())
RuntimeError: Command failed: 
/tmp/arrow-HEAD.oA0f2/go/gopath/bin/arrow-flight-integration-client -host 
localhost -port=42937 -path 
/tmp/arrow-integration-n3a4l1n_/generated_primitive_no_batches.json
With output:
 --
Opening JSON file ' 
/tmp/arrow-integration-n3a4l1n_/generated_primitive_no_batches.json '
Opening JSON file ' 
/tmp/arrow-integration-n3a4l1n_/generated_primitive_no_batches.json '
Opening JSON file ' 
/tmp/arrow-integration-n3a4l1n_/generated_primitive_no_batches.json '
panic: rpc error: code = NotFound desc = Unknown descriptor.

goroutine 1 [running]:
main.main()

/arrow/go/arrow/internal/flight_integration/cmd/arrow-flight-integration-client/main.go:52
 +0x31a
See the job failures:
[https://github.com/ursacomputing/crossbow/runs/6147131844?check_suite_focus=true]

[https://github.com/ursacomputing/crossbow/runs/6147124267?check_suite_focus=true]

 





[jira] [Created] (ARROW-16308) [CI] Upgrade windows runner version as windows-2016 is deprecated.

2022-04-25 Thread Jacob Wujciak-Jens (Jira)
Jacob Wujciak-Jens created ARROW-16308:
--

 Summary: [CI] Upgrade windows runner version as windows-2016 is 
deprecated.
 Key: ARROW-16308
 URL: https://issues.apache.org/jira/browse/ARROW-16308
 Project: Apache Arrow
  Issue Type: Task
Reporter: Jacob Wujciak-Jens
Assignee: Jacob Wujciak-Jens
 Fix For: 8.0.0


"The windows-2016 environment is deprecated and will be removed on April 1st, 
2022"

So we need to upgrade all runners to at least windows-2019 (or 2022/latest ?) 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (ARROW-16307) [CI][Java][Flight] Verify release candidate fails on org.apache.arrow.flight.TestFlightService

2022-04-25 Thread Jira
Raúl Cumplido created ARROW-16307:
-

 Summary: [CI][Java][Flight] Verify release candidate fails on 
org.apache.arrow.flight.TestFlightService
 Key: ARROW-16307
 URL: https://issues.apache.org/jira/browse/ARROW-16307
 Project: Apache Arrow
  Issue Type: Bug
  Components: Continuous Integration, FlightRPC, Java
Reporter: Raúl Cumplido
 Fix For: 8.0.0


Currently our nightly verify release is failing on 
verify-rc-source-java-macos-amd64 with the following error 
(https://github.com/ursacomputing/crossbow/runs/6147103836?check_suite_focus=true):
{code:java}
  Warning:  Tests run: 6, Failures: 0, Errors: 0, Skipped: 4, Time elapsed: 
0.021 s - in org.apache.arrow.flight.TestApplicationMetadata
[INFO] Running org.apache.arrow.flight.TestFlightService
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0 s - in 
org.apache.arrow.flight.TestFlightService
[INFO] 
[INFO] Results:
[INFO] 
Error:  Errors: 
Error:    TestDoExchange.tearDown:74 » IllegalState Memory was leaked by query. 
Memory l...
[INFO] 
Error:  Tests run: 108, Failures: 0, Errors: 1, Skipped: 19
[INFO] 
[INFO] 
[INFO] Reactor Summary for Apache Arrow Java Root POM 8.0.0-SNAPSHOT:
[INFO] 
[INFO] Apache Arrow Java Root POM . SUCCESS [  6.956 s]
[INFO] Arrow Format ... SUCCESS [  1.402 s]
[INFO] Arrow Memory ... SUCCESS [  1.021 s]
[INFO] Arrow Memory - Core  SUCCESS [  5.303 s]
[INFO] Arrow Memory - Unsafe .. SUCCESS [  3.614 s]
[INFO] Arrow Memory - Netty ... SUCCESS [  3.989 s]
[INFO] Arrow Vectors .. SUCCESS [ 29.436 s]
[INFO] Arrow Compression .. SUCCESS [  4.271 s]
[INFO] Arrow Tools  SUCCESS [ 22.850 s]
[INFO] Arrow JDBC Adapter . SUCCESS [  9.101 s]
[INFO] Arrow Plasma Client  SUCCESS [  1.068 s]
[INFO] Arrow Flight ... SUCCESS [  1.072 s]
[INFO] Arrow Flight Core .. FAILURE [ 37.941 s]
[INFO] Arrow Flight GRPC .. SKIPPED
[INFO] Arrow Flight SQL ... SKIPPED
[INFO] Arrow Flight Integration Tests . SKIPPED
[INFO] Arrow AVRO Adapter . SKIPPED
[INFO] Arrow Algorithms ... SKIPPED
[INFO] Arrow Performance Benchmarks ... SKIPPED
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time:  02:08 min
[INFO] Finished at: 2022-04-24T14:17:58Z
[INFO] 
Error:  Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M3:test (default-test) on 
project flight-core: There are test failures.
Error:  
Error:  Please refer to 
/Users/runner/work/crossbow/crossbow/arrow/java/flight/flight-core/target/surefire-reports
 for the individual test results.
Error:  Please refer to dump files (if any exist) [date].dump, 
[date]-jvmRun[N].dump and [date].dumpstream.
Error:  -> [Help 1]
Error:  
Error:  To see the full stack trace of the errors, re-run Maven with the -e 
switch.
Error:  Re-run Maven using the -X switch to enable full debug logging.
Error:  
Error:  For more information about the errors and possible solutions, please 
read the following articles:
Error:  [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
Error:  
Error:  After correcting the problems, you can resume the build with the command
Error:    mvn  -rf :flight-core
Failed to verify release candidate. See 
/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/arrow-HEAD.X.q7EbeuIy for 
details.
{code}





[jira] [Created] (ARROW-16306) [CI] Nightly verify rc on ubuntu is failing due to setuptools scm unable to find version

2022-04-25 Thread Jira
Raúl Cumplido created ARROW-16306:
-

 Summary: [CI] Nightly verify rc on ubuntu is failing due to 
setuptools scm unable to find version
 Key: ARROW-16306
 URL: https://issues.apache.org/jira/browse/ARROW-16306
 Project: Apache Arrow
  Issue Type: Bug
  Components: Continuous Integration
Reporter: Raúl Cumplido
Assignee: Raúl Cumplido
 Fix For: 8.0.0


The following jobs:
- verify-rc-source-python-linux-ubuntu-18.04-amd64
- verify-rc-source-python-linux-ubuntu-20.04-amd64

are failing due to:
{code:java}
Traceback (most recent call last):
  File "setup.py", line 607, in 
    setup(
  File 
"/tmp/arrow-HEAD.7Wo1N/venv-source/lib/python3.8/site-packages/setuptools/__init__.py",
 line 129, in setup
    return distutils.core.setup(**attrs)
  File "/usr/lib/python3.8/distutils/core.py", line 108, in setup
    _setup_distribution = dist = klass(attrs)
  File 
"/tmp/arrow-HEAD.7Wo1N/venv-source/lib/python3.8/site-packages/setuptools/dist.py",
 line 372, in __init__
    _Distribution.__init__(self, attrs)
  File "/usr/lib/python3.8/distutils/dist.py", line 292, in __init__
    self.finalize_options()
  File 
"/tmp/arrow-HEAD.7Wo1N/venv-source/lib/python3.8/site-packages/setuptools/dist.py",
 line 528, in finalize_options
    ep.load()(self, ep.name, value)
  File 
"/tmp/arrow-HEAD.7Wo1N/venv-source/lib/python3.8/site-packages/setuptools_scm/integration.py",
 line 75, in version_keyword
    _assign_version(dist, config)
  File 
"/tmp/arrow-HEAD.7Wo1N/venv-source/lib/python3.8/site-packages/setuptools_scm/integration.py",
 line 51, in _assign_version
    _version_missing(config)
  File 
"/tmp/arrow-HEAD.7Wo1N/venv-source/lib/python3.8/site-packages/setuptools_scm/__init__.py",
 line 106, in _version_missing
    raise LookupError(
LookupError: setuptools-scm was unable to detect version for /arrow.Make sure 
you're either building from a fully intact git repository or PyPI tarballs. 
Most other sources (such as GitHub's tarballs, a git checkout without the .git 
folder) don't contain the necessary metadata and will not work.For example, if 
you're using pip, instead of https://github.com/user/proj/archive/master.zip 
use git+https://github.com/user/proj.git#egg=proj
Failed to verify release candidate. See /tmp/arrow-HEAD.7Wo1N for details.
1
Error: `docker-compose --file 
/home/runner/work/crossbow/crossbow/arrow/docker-compose.yml run --rm -e 
VERIFY_VERSION= -e VERIFY_RC= -e TEST_DEFAULT=0 -e TEST_PYTHON=1 
ubuntu-verify-rc` exited with a non-zero exit code 1, see the process log 
above.The docker-compose command was invoked with the following 
parameters:
{code}
This was fixed for the verify-conda-rc but the ubuntu ones have started failing.


