jorisvandenbossche commented on code in PR #40661:
URL: https://github.com/apache/arrow/pull/40661#discussion_r1531617792
##########
python/pyarrow/src/arrow/python/arrow_to_pandas.cc:
##########
@@ -2320,6 +2328,11 @@ class ConsolidatedBlockCreator : public PandasBlockCreator {
if (arrays_[column_index]->type()->id() == Type::EXTENSION) {
arrays_[column_index] = GetStorageChunkedArray(arrays_[column_index]);
}
+ // In case of a RunEndEncodedArray default to the storage type
Review Comment:
```suggestion
// In case of a RunEndEncodedArray default to the values type
```
##########
python/pyarrow/src/arrow/python/arrow_to_pandas.cc:
##########
@@ -2554,6 +2567,14 @@ Status ConvertChunkedArrayToPandas(const PandasOptions& options,
if (arr->type()->id() == Type::EXTENSION) {
arr = GetStorageChunkedArray(arr);
}
+ // In case of a RunEndEncodedArray decode the array
+ else if (arr->type()->id() == Type::RUN_END_ENCODED) {
+ ARROW_ASSIGN_OR_RAISE(arr, GetDecodedChunkedArray(arr));
Review Comment:
We should probably check `options.zero_copy_only` here and raise an error
if it is set to true (as is done above for `strings_to_categorical`),
since decoding the run-end encoded array cannot be done without a copy.
##########
python/pyarrow/tests/test_array.py:
##########
@@ -3580,6 +3580,26 @@ def test_run_end_encoded_from_buffers():
1, offset, children)
+def _generate_ree_array():
+ run_ends = [1, 3, 6]
+ values = [1, 2, 3]
+ ree_type = pa.run_end_encoded(pa.int32(), pa.int64())
+ return pa.RunEndEncodedArray.from_arrays(run_ends, values,
Review Comment:
Since we are not testing the constructor here (and I have now merged your
other PR), I think something simpler such as `pa.array([1, 2, 2, 3, 3, 3],
pa.run_end_encoded(pa.int32(), pa.int64()))` is also fine.