hemidark opened a new issue, #41090:
URL: https://github.com/apache/arrow/issues/41090
### Describe the bug, including details regarding any error messages, version, and platform.
`RunEndEncodedArray.from_arrays` is documented to accept `Array` instances,
but it does not:
```
>>> import pyarrow as pa
>>> pa.__version__
'15.0.2'
>>> pa.RunEndEncodedArray.from_arrays(pa.array([1]), pa.array([100]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyarrow/array.pxi", line 3382, in pyarrow.lib.RunEndEncodedArray.from_arrays
  File "pyarrow/array.pxi", line 3342, in pyarrow.lib.RunEndEncodedArray._from_arrays
TypeError: an integer is required
```
Also confirmed in main.
The problem is that `from_arrays` extracts the last value from `run_ends`
assuming that it will be a native integer type, just as in the base case for
the empty sequence:
```
logical_length = run_ends[-1] if len(run_ends) > 0 else 0
```
When passed as `logical_length` to `_from_arrays`, this assumed-native value
gets cast directly to an `int64_t`:
```
_logical_length = <int64_t>logical_length
```
This cast fails when `logical_length` is a pyarrow scalar obtained from an
`Array`.
This wasn't caught because the test suite currently only exercises
`from_arrays` with Python lists.
The simplest backwards-compatible fix would be to wrap the final value of
`run_ends` in a pyarrow scalar and then unwrap it as a native Python value,
which will work for both Python lists and `Array` instances (the rest of the
code seems agnostic to the input type):
```
logical_length = scalar(run_ends[-1]).as_py() if len(run_ends) > 0 else 0
```
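For illustration, the same idea can be sketched in pure Python without depending on how `scalar()` treats an existing scalar. The helper name `logical_length_of` is hypothetical and not part of pyarrow. It relies only on the fact that pyarrow scalars expose `as_py()`, while plain Python integers do not:

```python
def logical_length_of(run_ends):
    """Return the last run end as a native int, or 0 for an empty input.

    Hypothetical helper for illustration only; not part of pyarrow.
    """
    if len(run_ends) == 0:
        return 0
    last = run_ends[-1]
    # pyarrow scalars expose as_py(); plain Python ints do not,
    # so fall back to int() when the method is missing.
    as_py = getattr(last, "as_py", None)
    return int(as_py()) if as_py is not None else int(last)


print(logical_length_of([3, 5, 9]))  # 9
print(logical_length_of([]))         # 0
```

With pyarrow installed, `logical_length_of(pa.array([100]))` should likewise give `100`, since indexing an `Array` yields a scalar object rather than a native integer.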
This would imply the documentation needs to be updated to reflect that the
method will accept any array-like, and not only `Array` instances.
Extra question: is `from_arrays` meant to be zero-copy when given `Array`
instances?
### Component(s)
Python