will recognize that the cases are 1,2,3,...N and optimize
with a look up table accordingly.

compiler writers do try hard to stay abreast of actual hardware
behavior. this is a good example: the look-up-table approach is
clearly not the fastest for some (many?) cases.

I checked only once, about 15
years ago, and the resultant code was doing the equivalent of this:
if(i==1){}
else if(i==2){}
etc.

did you compile with feedback-directed optimization? bear in mind that
predicted branches are cheap, probably cheaper than a LUT in L2, perhaps
even one in L1. _anything_ is cheaper than a LUT that's all the way out
in memory...
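
to make the comparison concrete, here is a minimal sketch of the three shapes being discussed: a dense switch (which the compiler is free to lower however it likes), the explicit compare chain, and a hand-written LUT of function pointers. the handler names and return values are hypothetical placeholders, not anything from the original code, and nothing here measures which form is faster on a given machine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handlers standing in for the real case bodies. */
static int handle1(void) { return 10; }
static int handle2(void) { return 20; }
static int handle3(void) { return 30; }

/* Dense switch over 1..N: the compiler may emit a jump table,
 * a compare chain, or a binary search, depending on target and flags. */
int dispatch_switch(int i)
{
    switch (i) {
    case 1:  return handle1();
    case 2:  return handle2();
    case 3:  return handle3();
    default: return -1;
    }
}

/* Explicit compare chain: one conditional branch per case,
 * each of which the branch predictor can learn. */
int dispatch_chain(int i)
{
    if (i == 1)      return handle1();
    else if (i == 2) return handle2();
    else if (i == 3) return handle3();
    return -1;
}

/* Explicit look-up table: one bounds check, one load from the table,
 * then an indirect call. The load is where cache residency matters. */
typedef int (*handler_fn)(void);
static const handler_fn handlers[] = { handle1, handle2, handle3 };

int dispatch_lut(int i)
{
    size_t n = sizeof handlers / sizeof handlers[0];
    if (i < 1 || (size_t)i > n)
        return -1;
    return handlers[i - 1]();
}

/* Sanity check: all three forms agree, including out-of-range inputs. */
void check_dispatch_agree(void)
{
    for (int i = -1; i <= 5; i++) {
        assert(dispatch_switch(i) == dispatch_chain(i));
        assert(dispatch_switch(i) == dispatch_lut(i));
    }
}
```

to see what a particular compiler does with dispatch_switch, compile to assembly (for example gcc -O2 -S) and look for an indirect jump through a table versus a run of compares. with GCC, feedback-directed optimization means building once with -fprofile-generate, running a representative workload, and rebuilding with -fprofile-use; -fno-jump-tables forces the non-table lowering if you want to time both.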
_______________________________________________
Beowulf mailing list, [email protected]