jorisvandenbossche commented on code in PR #40341:
URL: https://github.com/apache/arrow/pull/40341#discussion_r1524478032


##########
python/pyarrow/array.pxi:
##########
@@ -336,11 +338,21 @@ def array(object obj, type=None, mask=None, size=None, from_pandas=None,
             if pandas_api.have_pandas:
                 values, type = pandas_api.compat.get_datetimetz_type(
                     values, obj.dtype, type)
-            result = _ndarray_to_array(values, mask, type, c_from_pandas, safe,
-                                       pool)
+            if type and type.id == _Type_RUN_END_ENCODED:
+                arr = _ndarray_to_array(
+                    values, mask, type.value_type, c_from_pandas, safe, pool)
+                result = run_end_encode(arr, run_end_type=type.run_end_type)

Review Comment:
   You can pass `pool` to `run_end_encode` here as well (`memory_pool=pool`).



##########
python/pyarrow/array.pxi:
##########
@@ -21,6 +21,8 @@ import os
 import warnings
 from cython import sizeof
 
+from pyarrow.compute import run_end_encode

Review Comment:
   Instead of importing this here (which isn't guaranteed to work, since
   pyarrow can be built without compute support), you can call `_pc()` to get
   the compute module at the point where you call `run_end_encode` (see the
   other examples of that in this file).



##########
python/pyarrow/tests/test_array.py:
##########
@@ -3579,6 +3579,42 @@ def test_run_end_encoded_from_buffers():
                                            1, offset, children)
 
 
+def test_run_end_encoded_from_array_with_type():
+    run_ends = pa.array([1, 3, 6], type=pa.int32())
+    values = pa.array([1, 2, 3], type=pa.int64())
+
+    ree_type = pa.run_end_encoded(pa.int32(), pa.int64())
+
+    arr = [1, 2, 2, 3, 3, 3]
+    result = pa.array(arr, type=ree_type)
+
+    assert result.run_ends.equals(run_ends)
+    assert result.values.equals(values)

Review Comment:
   Instead of checking both run ends and values each time, I think you could
   create the expected result once (`expected =
   pa.RunEndEncodedArray.from_arrays(run_ends, values, ree_type)`) and then
   just check `assert result.equals(expected)`.



##########
python/pyarrow/tests/test_array.py:
##########
@@ -3579,6 +3579,42 @@ def test_run_end_encoded_from_buffers():
                                            1, offset, children)
 
 
+def test_run_end_encoded_from_array_with_type():
+    run_ends = pa.array([1, 3, 6], type=pa.int32())
+    values = pa.array([1, 2, 3], type=pa.int64())
+
+    ree_type = pa.run_end_encoded(pa.int32(), pa.int64())
+
+    arr = [1, 2, 2, 3, 3, 3]
+    result = pa.array(arr, type=ree_type)
+
+    assert result.run_ends.equals(run_ends)
+    assert result.values.equals(values)
+
+    result = pa.array(np.array(arr), type=ree_type)

Review Comment:
   Can you also do one more version of this with `pa.array(arr)` as input?


