On 2021-11-08, Otto Moerbeek <[email protected]> wrote:
> On Sun, Nov 07, 2021 at 08:13:36PM -0600, Luke Small wrote:
>
>> https://bugs.llvm.org/show_bug.cgi?id=50026
>>
>> I reported it to the llvm people. it is two slightly different quicksort
>> algorithms which perform radically differently. The one which you could
>> assume would take more time, performs MUCH better.
>>
>> I made a custom quicksort algorithm which outperforms qsort by A LOT for
>> sorting an array of around 300 randomly created unsigned characters, which
>> is what I use it for.
>>
>> if you read the report a guy said there's a 10% difference for sorting 3
>> million characters on freebsd, but there's about 40% performance difference
>> on OpenBSD. maybe it's also how the OpenBSD team modified clang to prevent
>> rop chain stuff or something? I'm using a westmere based intell server.
>>
>> it's the same for clang 11.
>>
>> What's the deal?
>
> 1. Your test is too small. I increased the test size to get less error
> in measurement. I also changed the code to either run quicksort or
> quicksort depending on an argument.
>
> 2. I confirmed your observation using time on amd64, arm64 however
> shows a difference more in line with expectations:
>
> [otto@mc:8]$ time ./a.out A
> quicksort
> time = 0.335777021
> 0m00.35s real 0m00.35s user 0m00.01s system
> [otto@mc:9]$ time ./a.out B
> quicksort1
> time = 0.345703033
> 0m00.36s real 0m00.36s user 0m00.01s system
>
>
> I suspect some non-optimal decision of the optimizer on amd64 for the
> first quicksort.
>
> A next step could be somebody looking at the code produced by the
> compiler. That is not going to be me.
This is another example of code quite badly affected by fixup-gadgets.
With an increased test size and building separate objects for each of
the two tests and with/without CFLAGS=-fno-fixup-gadgets:
$ for i in quicksort*; do echo $i; time ./$i; done
quicksort
0m02.33s real 0m02.32s user 0m00.00s system
quicksort-no-fixup-gadgets
0m01.29s real 0m01.27s user 0m00.00s system
quicksort1
0m01.06s real 0m00.94s user 0m00.02s system
quicksort1-no-fixup-gadgets
0m01.08s real 0m00.87s user 0m00.01s system