On Thursday, 23 February 2017 at 16:25:34 UTC, Johan Engelen wrote:
On Wednesday, 22 February 2017 at 23:49:43 UTC, Dušan Pavkov wrote:

If the function is outside of the class, the code runs much faster. I'm obviously doing something wrong and would appreciate any help with this.

Interesting test case, thanks :-)
Adding "final" to the class method nullifies the speed difference. Somehow, LDC does not devirtualize the call in your test case. Without the for-loops, the call is nicely devirtualized, so there is no performance difference.
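
For illustration, here is a minimal sketch of the three call forms being compared. It is not the original test case: the class and function names (Accumulator, addVirtual, addFinal, addFree) are made up, and the loop is far too small to measure anything. It only shows where "final" goes and that all three variants compute the same result.

```d
import std.stdio : writeln;

// Hypothetical class for illustration; not the code from the original post.
class Accumulator
{
    // Class methods are virtual by default in D, so this call goes through
    // the vtable unless the optimizer manages to devirtualize it.
    int addVirtual(int a, int b) { return a + b; }

    // `final` forbids overriding, so the compiler can emit a direct call.
    final int addFinal(int a, int b) { return a + b; }
}

// Free function: always a direct call.
int addFree(int a, int b) { return a + b; }

void main()
{
    auto acc = new Accumulator;
    long viaVirtual, viaFinal, viaFree;

    foreach (i; 0 .. 1000)
    {
        viaVirtual += acc.addVirtual(i, 1);
        viaFinal   += acc.addFinal(i, 1);
        viaFree    += addFree(i, 1);
    }

    // All three variants must agree; only the call mechanism differs.
    assert(viaVirtual == viaFinal && viaFinal == viaFree);
    writeln(viaFree);
}
```

Whether the virtual variant is actually slower depends on the compiler, the flags and the shape of the surrounding loops, so it needs to be measured on the real code.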

We're in good company: both clang and gcc also fail to devirtualize the call when the loop count is too large (with a loop count of 4 the indirect calls are gone; with 160 they are back).

Btw, with PGO, the performance is 4 ms (direct call) vs 6 ms (virtual call). Pathological, but still.

I am submitting a DConf talk on optimization and the cost of D idioms. This gave me some new ideas to present, thanks :)

-Johan
