I wrote: >Example: a typical construct in this part of the >algorithm is an integer-arithmetic sequence like > >x = a + b >if(x > c) x = x - c > >On machines with a conditional move instruction one can use that >(i.e. calculate both a + b and a + b - c and pick one, based on the > result of the conditional), but more portable and often faster is >to use the properties of twos-complement arithmetic (here I assume >signed 32-bit ints) like so: > >x = a + b >x = x - (-(int)((unsigned)x >> 31)) & c
Sorry, that last part should read x = a + b - c x = x + (-(int)((unsigned)x >> 31)) & c i.e. one subtracts c immediately and then adds it back in if the result is negative. Upshot: one replaces either of A) add, conditional branch, possibly another add or B) 2 add, conditional move (if available) with C) 3 add, negate, shift, logical and. Interestingly, (C) is often faster than (B) even on hardware which has a CMOV instruction - the Alpha 21264 architecture reference manual makes this point when discussing code generation involving CMOV. -Ernst _________________________________________________________________________ Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm Mersenne Prime FAQ -- http://www.tasam.com/~lrwiman/FAQ-mers
