[GitHub] [arrow] pitrou commented on a change in pull request #8805: ARROW-10725: [Python][Compute] Expose sort options in Python bindings

GitBox Mon, 14 Dec 2020 06:50:13 -0800


pitrou commented on a change in pull request #8805:
URL: https://github.com/apache/arrow/pull/8805#discussion_r542441007




##########
File path: python/pyarrow/tests/test_compute.py
##########
@@ -1017,3 +1017,51 @@ def test_partition_nth():
                for i in range(pivot))
     assert all(data[indices[i]] >= data[indices[pivot]]
                for i in range(pivot, len(data)))
+
+
+def test_array_sort_indices():
+    arr = pa.array([1, 2, None, 0])
+    result = pc.array_sort_indices(arr)
+    assert result.to_pylist() == [3, 0, 1, 2]
+    result = pc.array_sort_indices(arr, order="ascending")
+    assert result.to_pylist() == [3, 0, 1, 2]
+    result = pc.array_sort_indices(arr, order="descending")
+    assert result.to_pylist() == [1, 0, 3, 2]
+
+    with pytest.raises(ValueError, match="not a valid order"):
+        pc.array_sort_indices(arr, order="nonscending")
+
+
+def test_sort_indices_array():
+    arr = pa.array([1, 2, None, 0])
+    result = pc.sort_indices(arr)
+    assert result.to_pylist() == [3, 0, 1, 2]
+    result = pc.sort_indices(arr, sort_keys=[("dummy", "ascending")])

Review comment:
       Just thinking out loud, but another possibility would be 
`sort_keys=["a", "-b"]` for `[("a", "ascending"), ("b", "descending")]`. Not 
sure how that feels...
   
   cc @xhochy for opinions.

##########
File path: python/pyarrow/includes/libarrow.pxd
##########
@@ -1788,6 +1788,25 @@ cdef extern from "arrow/compute/api.h" namespace 
"arrow::compute" nogil:
         CPartitionNthOptions(int64_t pivot)
         int64_t pivot
 
+    enum CSortOrder" arrow::compute::SortOrder":

Review comment:
       I think this should be `ctypedef enum` because 
`arrow::compute::SortOrder` is a scoped num (it's declared as `enum class 
SortOrder`.
   
   Hopefully that'd fix the compile error on CI.

##########
File path: python/pyarrow/tests/test_compute.py
##########
@@ -1017,3 +1017,51 @@ def test_partition_nth():
                for i in range(pivot))
     assert all(data[indices[i]] >= data[indices[pivot]]
                for i in range(pivot, len(data)))
+
+
+def test_array_sort_indices():
+    arr = pa.array([1, 2, None, 0])
+    result = pc.array_sort_indices(arr)
+    assert result.to_pylist() == [3, 0, 1, 2]
+    result = pc.array_sort_indices(arr, order="ascending")
+    assert result.to_pylist() == [3, 0, 1, 2]
+    result = pc.array_sort_indices(arr, order="descending")
+    assert result.to_pylist() == [1, 0, 3, 2]
+
+    with pytest.raises(ValueError, match="not a valid order"):
+        pc.array_sort_indices(arr, order="nonscending")
+
+
+def test_sort_indices_array():
+    arr = pa.array([1, 2, None, 0])
+    result = pc.sort_indices(arr)
+    assert result.to_pylist() == [3, 0, 1, 2]
+    result = pc.sort_indices(arr, sort_keys=[("dummy", "ascending")])

Review comment:
       It's a bit of a pity to have such a verbose convention (having to type 
"ascending" for every sort key). Apparently 
[pandas](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html)
 uses a separate `ascending` argument with is a list of bools, though that's 
not very pretty either...




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [arrow] pitrou commented on a change in pull request #8805: ARROW-10725: [Python][Compute] Expose sort options in Python bindings

Reply via email to