On Sunday, 4 January 2015 at 04:43:17 UTC, Jonathan Marler wrote:
The problem I see is that in almost all cases the opEquals(Object) method will have to perform a cast back to the original type at runtime, and that cast isn't doing any useful work. The current '==' operator passes the class instance as an Object to a generic opEquals function, which eventually hands it to a method that must cast it back to the original type. Why not just have the == operator rewrite the code to call a "typed" opEquals method? Then no casting is necessary.
[...]
run 1 (loopcount 10000000)
  x.x == y.x               : 6629 microseconds
  x.opEquals(y)            : 6680 microseconds
  x.opEquals(cast(Object)y): 89290 microseconds
  x == y                   : 138572 microseconds
run 2 (loopcount 10000000)
  x.x == y.x               : 6124 microseconds
  x.opEquals(y)            : 6263 microseconds
  x.opEquals(cast(Object)y): 90918 microseconds
  x == y                   : 132807 microseconds

I made opEquals(Object) final and tried with ldc. That gives me these times:

run 1 (loopcount 10000000)
  x.x == y.x               : 0 microseconds
  x.opEquals(y)            : 0 microseconds
  x.opEquals(cast(Object)y): 0 microseconds
  x == y                   : 108927 microseconds
run 2 (loopcount 10000000)
  x.x == y.x               : 0 microseconds
  x.opEquals(y)            : 0 microseconds
  x.opEquals(cast(Object)y): 0 microseconds
  x == y                   : 106700 microseconds

Threw some `asm {}`s in there to make it less hyper-optimized:

run 1 (loopcount 10000000)
  x.x == y.x               : 4996 microseconds
  x.opEquals(y)            : 3932 microseconds
  x.opEquals(cast(Object)y): 3924 microseconds
  x == y                   : 109300 microseconds
run 2 (loopcount 10000000)
  x.x == y.x               : 3068 microseconds
  x.opEquals(y)            : 2931 microseconds
  x.opEquals(cast(Object)y): 2963 microseconds
  x == y                   : 108093 microseconds

I think (final) opEquals(Object) itself is ok.

A final opEquals(Object) is faster with dmd, too, but it's still nowhere near the other variants. So apparently dmd misses some optimization there, presumably inlining.
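
For anyone following along without the benchmark source, here is a minimal sketch of the kind of class being discussed. It is not the code that produced the timings above; the names `X`, `x` and `compareAllWays` are made up for illustration. It only shows a typed opEquals next to a final opEquals(Object) that has to cast back, plus the four expressions from the result tables.

```d
// Hypothetical class for illustration; not the benchmark code from this thread.
class X
{
    int x;

    this(int x) { this.x = x; }

    // Typed overload: the argument is already an X, so no cast is needed.
    final bool opEquals(X other)
    {
        return other !is null && x == other.x;
    }

    // Object overload: must cast back to X at runtime.
    // Marked final so a direct call does not need virtual dispatch.
    final override bool opEquals(Object o)
    {
        auto other = cast(X) o; // dynamic cast, null if o is not an X
        return other !is null && x == other.x;
    }
}

// Evaluates the four forms from the tables above, in the same order.
// Both arguments are assumed to be non-null.
bool[4] compareAllWays(X a, X b)
{
    return [
        a.x == b.x,                 // plain field comparison
        a.opEquals(b),              // typed overload, called directly
        a.opEquals(cast(Object) b), // Object overload, called directly
        a == b,                     // lowered to the runtime's object.opEquals(a, b)
    ];
}
```

All four forms give the same answer; the timings above are only about how much work each one does to get there.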
