[ https://issues.apache.org/jira/browse/ARROW-9367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302404#comment-17302404 ]
Uwe Korn commented on ARROW-9367:
---------------------------------
This works as of Arrow 3.0, using pyarrow.compute.sort_indices followed by take:
{code}
import pyarrow as pa
import pyarrow.compute as pc
table = pa.table({
    "a": [1, 2, 3, -1],
    "b": [10, 9, 9, 7],
    "c": [10, 10, 8, 11],
})
# Sort by "b" first, then break ties on "c".
indices = pc.sort_indices(
    table, sort_keys=[("b", "ascending"), ("c", "ascending")]
)
table = pc.take(table, indices)
table.to_pydict()
# {'a': [-1, 3, 2, 1], 'b': [7, 9, 9, 10], 'c': [11, 8, 10, 10]}
{code}
> [Python] Sorting on pyarrow data structures ?
> ---------------------------------------------
>
> Key: ARROW-9367
> URL: https://issues.apache.org/jira/browse/ARROW-9367
> Project: Apache Arrow
> Issue Type: Wish
> Components: Python
> Reporter: Athanassios Hatzis
> Priority: Major
> Labels: sort
>
> Hi, I consider sorting a fundamental operation for any in-memory data
> structure, including those of PyArrow.
> It would be nice if pa.array, pa.table, etc. had sorting methods, but I did not
> find any. One has to pass sorting indices calculated by some other library,
> such as numpy, to sort them. The sorting indices could instead be calculated
> directly by PyArrow. Am I missing something here? Having to rely on another
> library significantly increases complexity for the developer.
> Do you have any plans to implement such a feature in the near future?