jorisvandenbossche commented on code in PR #40341:
URL: https://github.com/apache/arrow/pull/40341#discussion_r1515784637


##########
python/pyarrow/array.pxi:
##########
@@ -336,11 +336,27 @@ def array(object obj, type=None, mask=None, size=None, from_pandas=None,
             if pandas_api.have_pandas:
                 values, type = pandas_api.compat.get_datetimetz_type(
                     values, obj.dtype, type)
-            result = _ndarray_to_array(values, mask, type, c_from_pandas, safe,
-                                       pool)
+            if type and type.id == _Type_RUN_END_ENCODED:
+                if mask is not None:
+                    raise ValueError("Cannot pass a mask for Run-End Encoded arrays.")

Review Comment:
   The earlier discussion was mostly about passing _pyarrow_ arrays, whereas 
here we are handling generic Python input. So for this case we should ideally 
handle the `mask` keyword, I think, or otherwise at least raise an error saying 
it is not yet supported (as you did, though maybe that should be a 
NotImplementedError?)



##########
python/pyarrow/array.pxi:
##########
@@ -118,6 +118,13 @@ def _handle_arrow_array_protocol(obj, type, mask, size):
     return res
 
 
+def _handle_run_end_encoded_arrays(obj, type):
+    from pyarrow.compute import run_end_encode
+    ree_arr = run_end_encode(obj)

Review Comment:
   The kernel should take an option specifying which run-end type to use, so 
based on `type.run_end_type` you should be able to specify this up front 
(avoiding another conversion in the next step).



##########
python/pyarrow/array.pxi:
##########
@@ -336,11 +336,27 @@ def array(object obj, type=None, mask=None, size=None, from_pandas=None,
             if pandas_api.have_pandas:
                 values, type = pandas_api.compat.get_datetimetz_type(
                     values, obj.dtype, type)
-            result = _ndarray_to_array(values, mask, type, c_from_pandas, safe,
-                                       pool)
+            if type and type.id == _Type_RUN_END_ENCODED:
+                if mask is not None:
+                    raise ValueError("Cannot pass a mask for Run-End Encoded arrays.")

Review Comment:
   If we keep the error, you can also move it into the helper function



##########
python/pyarrow/array.pxi:
##########
@@ -118,6 +118,13 @@ def _handle_arrow_array_protocol(obj, type, mask, size):
     return res
 
 
+def _handle_run_end_encoded_arrays(obj, type):
+    from pyarrow.compute import run_end_encode
+    ree_arr = run_end_encode(obj)
+    return RunEndEncodedArray.from_arrays(
+        ree_arr.run_ends.to_pylist(), ree_arr.values.to_pylist(), type)

Review Comment:
   Converting to a Python list means we do the "Python object -> Arrow data" 
conversion twice. Does it not work to just pass the pyarrow arrays?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
