[
https://issues.apache.org/jira/browse/ARROW-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17360115#comment-17360115
]
Kirill Lykov edited comment on ARROW-10899 at 6/9/21, 2:56 PM:
---------------------------------------------------------------
LSD sort is stable.
Do you want to try with Travis' LSD sort implementation?
If yes, we can discuss what shall be done. Like I can change the style of the
code, make it templated, support negative, etc.
Making perfect adaptive stable generic radix looks like too time consuming task
for me, but adapting LSD is simpler.
Yet about bandwidth.
Travis wrote me that he originally wanted to have two posts -- one about LSD
and another about MSD. And he prefers MSD due to lower bandwidth consumption.
It sounds logical by looking to algorithms themselfs -- LSD will visit all the
radixes anyway while MSD will build a tree of recursive calls which generally
speaking is not balanced so makes less memory operations.
I tried to estimate bandwidth by using LLC-misses in perf, results are in the
table there https://github.com/KirillLykov/int-sort-bmk#msd-vs-lsd-radix-sort
Yet MSD is more complicated to implement and less performant, so if doing
MSD-like approach I would go with adaptive MSD like one from boost
What do you think about this?
was (Author: klykov):
LSD sort is stable.
Do you want to try with Travis' LSD sort implementation?
> [C++] Investigate radix sort for integer arrays
> -----------------------------------------------
>
> Key: ARROW-10899
> URL: https://issues.apache.org/jira/browse/ARROW-10899
> Project: Apache Arrow
> Issue Type: Wish
> Components: C++
> Reporter: Antoine Pitrou
> Priority: Major
> Attachments: Screen Shot 2021-02-09 at 17.48.13.png, Screen Shot
> 2021-02-10 at 10.58.23.png, all_random_wholeRange.png,
> all_random_wholeRange.png, all_random_wholeRange.png, differentdistrib.png
>
>
> For integer arrays with a non-tiny range of values, we currently use a stable
> sort. It may be faster to use a radix sort instead.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)