Re: [PATCH] vformatter cleanups (related to PR 42250)

Jonathan Gilbert Fri, 27 Apr 2007 23:28:49 -0700

At 01:11 AM 4/28/2007 -0500, I wrote:
>At 06:37 AM 4/27/2007 -0500, William A. Rowe wrote:
>>nope - the proposed change is a bit more expensive.  (magnitude % 10 in
>>any case being the unavoidably most expensive bit.)
[snip]
>  /* Eat two digits at a time. */
>  while (magnitude > 9) {
>    *(short *)--p = two_digit_lut[magnitude % 100];
>    --p, magnitude /= 100;
>  }
[snip]


Incidentally, I fixed up this loop (the store should be "*(short *)(p -= 2)"
rather than splitting the subtraction across two statements, and the
alignment comparison should be inverted) and ran a little test. Apparently
Microsoft's compiler does not use an IDIV. Instead, it uses a bizarre
multiplication trick to obtain the values of "magnitude % 100" and
"magnitude / 100":

  ; magnitude in ecx
  mov  eax, 1374389535
  imul ecx
  sar  edx, 5
  mov  eax, edx
  shr  eax, 31
  add  eax, edx           ; eax is now equal to ecx / 100!
  mov  edx, eax
  imul edx, 100
  sub  ecx, edx           ; ecx is now equal to magnitude % 100
                          ; (magnitude - 100 * (magnitude / 100))
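
Read line by line, the listing above corresponds roughly to the C below.
This is only a sketch: the names div100_by_multiply and mod100_by_multiply
are made up for illustration, and the code assumes that ">>" on a negative
signed value is an arithmetic shift, which ISO C leaves
implementation-defined.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the listing, one statement per
 * instruction group.  Assumes ">>" on a negative signed value is an
 * arithmetic shift (implementation-defined in ISO C). */
static int32_t div100_by_multiply(int32_t magnitude)
{
    /* mov eax, 1374389535 / imul ecx: 64-bit signed product in edx:eax */
    int64_t product = (int64_t)magnitude * 1374389535LL;

    /* edx is the high 32 bits; "sar edx, 5" brings the total shift to 37 */
    int32_t quotient = (int32_t)(product >> 37);

    /* shr eax, 31 / add eax, edx: add the sign bit so that negative
     * quotients round toward zero, as C's "/" does */
    quotient += (int32_t)((uint32_t)quotient >> 31);

    return quotient;
}

static int32_t mod100_by_multiply(int32_t magnitude)
{
    /* imul edx, 100 / sub ecx, edx */
    return magnitude - 100 * div100_by_multiply(magnitude);
}
```

For example, div100_by_multiply(12345) yields 123 and
mod100_by_multiply(12345) yields 45, the same as "/" and "%".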

Intuitively, I wouldn't expect this long sequence to be faster, but it must
be or they wouldn't emit it. (I guess I'm significantly underestimating the
cost of a division!) They apparently use it whenever they see both
"value % n" and "value / n" near one another. Also note that, since this is
in a loop where the loop control variable is overwritten, their register
allocator has no qualms about overwriting the original magnitude value in
ecx: it already has (magnitude / 100) for the next iteration in eax. I'm
not sure how the magic number 1374389535 is computed, but I'm sure it's not
rocket science once you know the trick.
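
As far as the arithmetic goes, 1374389535 is 2^37 / 100 rounded up: "imul"
leaves the high 32 bits of the product in edx (a shift by 32), and
"sar edx, 5" shifts by 5 more, so the sequence multiplies by roughly
2^37 / 100 and then divides by 2^37. A minimal sketch of that computation
follows. The name magic_for is hypothetical, and the shift of 5 is taken
from the listing rather than derived here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the reciprocal of d scaled by 2^(32 + shift),
 * rounded up.  "shift" is the extra right shift applied to the high
 * half of the product (5 in the listing). */
static uint64_t magic_for(uint32_t d, unsigned shift)
{
    assert(d != 0 && shift < 32);

    uint64_t scale = (uint64_t)1 << (32 + shift);

    return (scale + d - 1) / d;   /* ceil(scale / d) */
}
```

magic_for(100, 5) evaluates to 1374389535, the constant in the listing.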

If this is significantly faster than an IDIV or two, then the rest of the
loop (the table lookup and the two-byte store) accounts for a larger share
of its cost.
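
For reference, the corrected two-digit loop from the top of this message
might look something like the sketch below. It is not the patch itself: the
function name format_magnitude and the character layout of two_digit_lut
are assumptions, and memcpy stands in for the "*(short *)" store so that
the sketch needs neither the alignment check nor a byte-order-specific
table.

```c
#include <stdint.h>
#include <string.h>

/* Assumed layout: "00", "01", ..., "99" packed back to back. */
static const char two_digit_lut[] =
    "00010203040506070809"
    "10111213141516171819"
    "20212223242526272829"
    "30313233343536373839"
    "40414243444546474849"
    "50515253545556575859"
    "60616263646566676869"
    "70717273747576777879"
    "80818283848586878889"
    "90919293949596979899";

/* Hypothetical function: writes the decimal digits of magnitude so that
 * they end just before buf_end, and returns a pointer to the first
 * digit.  The caller must leave at least 10 bytes before buf_end. */
static char *format_magnitude(uint32_t magnitude, char *buf_end)
{
    char *p = buf_end;

    /* Eat two digits at a time. */
    while (magnitude > 9) {
        p -= 2;   /* a single subtraction, as in the fix */
        memcpy(p, &two_digit_lut[2 * (magnitude % 100)], 2);
        magnitude /= 100;
    }

    /* At most one digit remains; also emit "0" for a zero input. */
    if (magnitude != 0 || p == buf_end)
        *--p = (char)('0' + magnitude);

    return p;
}
```

Per iteration, the only work besides the "% 100" and "/ 100" pair is one
table lookup and one two-byte copy.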

Jonathan Gilbert
