> I guess we should put out a call for anyone of a speed-obsessed
> inclination to let us know if they notice any overhead problems with
> these changes.
Does the code have to be so obscure? Is it recognized that it's
byte-order dependent? Is it intentional?
On little-endian machines it is effectively

	while (loop--)
	{
		*(p++) = cleanse_ctr;
		cleanse_ctr += 17 + ((((int)p) >> 24) & 0xF);
	}

where ((((int)p) >> 24) & 0xF) most likely remains constant for the
whole buffer, since the top byte of the pointer rarely changes within
a single allocation.
On big-endian machines it is effectively

	while (loop--)
	{
		*(p++) = cleanse_ctr;
		cleanse_ctr += 17 + (((int)p) & 0xF);
	}

where the low pointer bits vary with every byte written.
> I seriously doubt it, but it's not impossible if a
> "cleanse" found its way into a really tight loop and the compilers are
> generating slow "cleanse" functions.
But does it have to be slower than necessary? Note that gcc takes
((unsigned char *)&p)[sizeof(unsigned char *)-1] literally, which [on
e.g. SPARC] results in 4 loads and 3 stores per loop iteration, that
is, 4 loads and 2 stores more than the loop actually needs (see
above).
A.
______________________________________________________________________
OpenSSL Project http://www.openssl.org
Development Mailing List [EMAIL PROTECTED]
Automated List Manager [EMAIL PROTECTED]