On 03/08/2020 13:08, Iain Buclaw via Digitalmars-d-announce wrote:
> 
> 
> On 15/05/2020 12:28, Joseph Rushton Wakeling via Digitalmars-d-announce wrote:
>> On Thursday, 14 May 2020 at 13:26:23 UTC, Mike Parker wrote:
>>> After reading a paper that grabbed his curiosity and wouldn't let go, 
>>> Andrei set out to determine if Lomuto partitioning should still be 
>>> considered inferior to Hoare for quicksort on modern hardware. This blog 
>>> post details his results.
>>>
>>> Blog:
>>> https://dlang.org/blog/2020/05/14/lomutos-comeback/
>>
>> Nice stuff!
>>
>> One curious question -- unless I've misread things horribly, it looks like 
>> the D benchmarks for Lomuto branch-free are consistently slower than for 
>> C++.  Any idea why that is?  I would expect gcc/gdc and clang/ldc to produce 
>> effectively identical results for code like this.
> 
> Sorry for the belated response. As far as I can see, gdc and g++ only differ 
> on one line.
> 
>     auto delta = smaller & (read - first);
> 
> This is lowered as:
> 
>     delta = smaller & (read - first) / 8;
> 
> However, g++ uses an exact divide operator (which assumes the division 
> leaves no remainder, so rounding is ignored), whereas gdc uses a truncating 
> divide operator (which rounds the quotient towards zero).
> 
> I'm willing to bet a beer that tweaking pointer subtraction will get gdc in 
> lockstep with g++.
> 

I doubt Andrei will re-run the benchmarks now, but here's the PR (problem 
report) with the patch attached.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96429
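
For anyone wondering why the choice of divide operator matters, here is a 
rough C++ sketch of the difference for a divisor of 8, as in the lowering 
quoted above. The function names div8_trunc and div8_exact are made up for 
illustration, and this is not the code either compiler actually emits; it only 
shows that a truncating divide needs an extra fix-up for negative values, 
while an exact divide can be a plain shift.

```cpp
#include <cstddef>

// Truncating divide by 8: the quotient rounds towards zero, so a negative
// byte difference needs a bias of 7 before shifting.
// Assumes >> on a negative signed value is an arithmetic shift (guaranteed
// since C++20, and what mainstream compilers did before that).
std::ptrdiff_t div8_trunc(std::ptrdiff_t bytes) {
    std::ptrdiff_t bias = bytes < 0 ? 7 : 0;
    return (bytes + bias) >> 3;
}

// Exact divide by 8: the caller promises that bytes is a multiple of 8
// (true for the distance between two pointers to 8-byte elements), so no
// bias is needed and a single shift is enough.
std::ptrdiff_t div8_exact(std::ptrdiff_t bytes) {
    return bytes >> 3;
}
```

Both functions agree whenever the input is a multiple of 8, which is always 
the case for a pointer difference. They only disagree on inputs that the 
exact form is allowed to assume never occur, such as -9.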
