On Sunday, 21 April 2013 at 13:30:32 UTC, bearophile wrote:
> dsimcha:
>> I abandoned because I was disappointed at how poorly most of
>> it was scaling in practice, probably due to memory bandwidth.
>
> Then do you know why the Java version seems to be advantageous
> (with four cores)?
>
> Bye,
> bearophile

I don't know Java very well, but possibilities include:
1. Sorting using a virtual or otherwise non-inlined comparison
function. This makes the sort use much more CPU time but
not a lot more memory bandwidth. It does raise the question,
though, of why the comparison function isn't inlined, especially
since modern JITs can sometimes inline virtual functions.
2. Different hardware than I tested on, maybe with better memory
bandwidth.
3. Expensive comparison functions. I didn't test this in D
either because I couldn't think of a good use case. I tested the
D parallel sort using small primitive types (ints and floats and
stuff).
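
To make possibilities 1 and 3 concrete, here is a rough Java sketch of the two paths I have in mind. It is only an illustration, not the benchmark from this thread: the class and method names are made up, and it uses Arrays.parallelSort, which only arrived in Java 8, rather than whatever the Java version under discussion actually does. Note that the comparator path also boxes the values, which adds pointer chasing, so it does not isolate the call overhead cleanly.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;

// Hypothetical demo class; none of these names come from the thread.
public class SortComparison {
    // Primitive path: comparisons are plain int compares that the JIT
    // can inline, so memory traffic is a large share of the work.
    static int[] sortPrimitive(int[] data) {
        int[] copy = data.clone();
        Arrays.parallelSort(copy);
        return copy;
    }

    // Comparator path: every comparison goes through Comparator.compare
    // on boxed Integers, so there is more CPU work per element moved.
    static Integer[] sortWithComparator(int[] data) {
        Integer[] boxed = new Integer[data.length];
        for (int i = 0; i < data.length; i++) {
            boxed[i] = data[i];
        }
        Comparator<Integer> cmp = new Comparator<Integer>() {
            @Override
            public int compare(Integer a, Integer b) {
                return Integer.compare(a, b);
            }
        };
        Arrays.parallelSort(boxed, cmp);
        return boxed;
    }

    public static void main(String[] args) {
        // Fixed seed so repeated runs sort the same input.
        int[] data = new Random(42).ints(1_000_000).toArray();

        long t0 = System.nanoTime();
        sortPrimitive(data);
        long t1 = System.nanoTime();
        sortWithComparator(data);
        long t2 = System.nanoTime();

        // Crude single-shot timings; no JIT warm-up, so treat them as a hint only.
        System.out.printf("primitive:  %d ms%n", (t1 - t0) / 1_000_000);
        System.out.printf("comparator: %d ms%n", (t2 - t1) / 1_000_000);
    }
}
```

If the comparator path scales better with core count than the primitive path on the same machine, that would be consistent with the memory bandwidth explanation, though a proper benchmark harness would be needed to say anything firm.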