I wrote:

>Example: a typical construct in this part of the
>algorithm is an integer-arithmetic sequence like
>
>x = a + b
>if(x > c) x = x - c
>
>On machines with a conditional move instruction one can use that
>(i.e. calculate both a + b and a + b - c and pick one, based on the
> result of the conditional), but more portable and often faster is
>to use the properties of twos-complement arithmetic (here I assume
>signed 32-bit ints) like so:
>
>x = a + b
>x = x - (-(int)((unsigned)x >> 31)) & c


Sorry, that last part should read

x = a + b - c
x = x + (-(int)((unsigned)x >> 31)) & c

i.e. one subtracts c immediately and then adds it back 
in if the result is negative.

Upshot: one replaces either of

A) add, conditional branch, possibly another add

or

B) 2 add, conditional move (if available)

with

C) 3 add, negate, shift, logical and.

Interestingly, (C) is often faster than (B) even on
hardware which has a CMOV instruction - the Alpha 21264
architecture reference manual makes this point when
discussing code generation involving CMOV.

-Ernst

_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

Reply via email to