Re: Slow performance compared to C++, ideas?

Dicebot Fri, 31 May 2013 06:05:44 -0700

On Friday, 31 May 2013 at 11:49:05 UTC, Manu wrote:

I find that using templates actually makes it more likely for the compiler to properly inline. But I think the totally generic expressions produce cases where the compiler is considering too many possibilities that inhibit many optimisations.
It might also be that the optimisations get a lot more complex when the code fragments span across a complex call tree with optimisation dependencies on non-deterministic inlining.
One of the most important jobs for the optimiser is code re-ordering. Generic code is often written in such a way that makes it hard/impossible for the optimiser to reorder the flattened code properly.
Hand written code can have branches and memory accesses carefully placed at the appropriate locations.
Generic code will usually package those sorts of operations behind little templates that often flatten out in a different order.
The optimiser is rarely able to re-order code across if statements, or pointer accesses. __restrict is very important in generic code to allow the optimiser to reorder across any indirection, otherwise compilers typically have to be conservative and presume that something somewhere may have changed the destination of a pointer, and leave the order as the template expanded. Sadly, D doesn't even support __restrict, and nobody ever uses it in C++ anyway.
I've always had better results with writing precisely what I intend the compiler to do, and using __forceinline where it needs a little extra encouragement.

Thanks for the valuable input. I have never had the pleasure of actually trying templates in performance-critical code, and this is good stuff to remember. Have added it to my notes.
