Copilot commented on code in PR #50327:
URL: https://github.com/apache/arrow/pull/50327#discussion_r3509608761


##########
python/pyarrow/array.pxi:
##########
@@ -3974,6 +4155,45 @@ cdef class StringArray(Array):
     Concrete class for Arrow arrays of string (or utf8) data type.
     """
 
+    def to_pylist(self, *, maps_as_pydicts=None):
+        """
+        Convert to a list of native Python objects.
+
+        Parameters
+        ----------
+        maps_as_pydicts : str, optional, default `None`
+            Valid values are `None`, 'lossy', or 'strict'.
+            This parameter is ignored for non-nested Arrays.
+
+        Returns
+        -------
+        lst : list
+        """
+        cdef:
+            CStringArray* arr = <CStringArray*> self.ap
+            int64_t i, n
+            int32_t length
+            const uint8_t* data
+        self._assert_cpu()
+        n = arr.length()
+        result = []
+        # Decode values straight from the data buffer instead of creating
+        # a C++ Scalar and a Python Scalar wrapper per value (see GH-28694).
+        if arr.null_count() == 0:
+            for i in range(n):
+                data = arr.GetValue(i, &length)
+                result.append(
+                    cp.PyUnicode_DecodeUTF8(<const char*> data, length, NULL))
+        else:
+            for i in range(n):
+                if arr.IsNull(i):
+                    result.append(None)
+                else:
+                    data = arr.GetValue(i, &length)
+                    result.append(
+                        cp.PyUnicode_DecodeUTF8(<const char*> data, length, NULL))
+        return result

Review Comment:
   Checking `arr.null_count() == 0` can force an extra full scan of the
   validity bitmap when the null count is unknown (e.g., for slices of arrays
   that contain nulls), because `null_count()` computes and caches the count on
   first use. The else-branch already calls `IsNull(i)` for every element, so
   that scan is an avoidable extra pass. Consider branching on
   `arr.null_bitmap_data() == NULL` instead (no bitmap means there are
   definitely no nulls) and otherwise checking `IsNull(i)` per element without
   calling `null_count()`. The same applies to the LargeStringArray and list
   bulk paths.
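
   To illustrate the suggested control flow, here is a minimal pure-Python
   model of it. This is a sketch, not pyarrow or Arrow C++ code: the names
   `is_valid` and `decode_strings` are hypothetical, and plain `bytes` objects
   and lists stand in for the data buffer, offsets buffer, and validity
   bitmap. The point is that the fast path is chosen by checking whether a
   bitmap exists at all, so nulls are never counted up front.

```python
from typing import List, Optional


def is_valid(bitmap: bytes, offset: int, i: int) -> bool:
    """Return True if slot (offset + i) is valid.

    Models an Arrow validity bitmap: one bit per slot, least-significant
    bit first within each byte, 1 meaning "not null".
    """
    pos = offset + i
    return (bitmap[pos >> 3] >> (pos & 7)) & 1 == 1


def decode_strings(
    data: bytes,
    offsets: List[int],
    validity: Optional[bytes],
    offset: int = 0,
    length: Optional[int] = None,
) -> List[Optional[str]]:
    """Decode a (possibly sliced) string array without counting nulls first.

    Hypothetical helper: `offset` and `length` model a slice of the parent
    array, and `validity=None` models a missing validity bitmap.
    """
    if length is None:
        length = len(offsets) - 1 - offset
    starts = offsets[offset:offset + length + 1]

    if validity is None:
        # No bitmap at all: there are definitely no nulls, so take the
        # fast path without any per-element check or null counting.
        return [
            data[starts[i]:starts[i + 1]].decode("utf-8")
            for i in range(length)
        ]

    # Bitmap present: a single pass that tests each slot as it goes,
    # instead of one pass to count nulls plus a second pass to decode.
    result: List[Optional[str]] = []
    for i in range(length):
        if is_valid(validity, offset, i):
            result.append(data[starts[i]:starts[i + 1]].decode("utf-8"))
        else:
            result.append(None)
    return result
```

   In this model a slice with a bitmap but no nulls in range still takes the
   per-element branch. That is the trade-off of the suggestion: it gives up
   the fast path in that case in exchange for never scanning the bitmap twice.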



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
