> I guess we should put out a call for anyone of a speed-obsessed
> inclination to let us know if they notice any overhead problems with
> these changes.

Does the code have to be so obscure? Is it recognized that it's
byte-order dependent? Is it intentional?
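
For reference, the loop in question is presumably the OPENSSL_cleanse
one; reconstructed here from the expression quoted further down, so
take the surrounding declarations as assumptions rather than a
verbatim copy of the source:

        #include <stddef.h>

        /* reconstructed sketch, not a verbatim copy of the source */
        unsigned char cleanse_ctr = 0;

        void OPENSSL_cleanse(void *ptr, size_t len)
                {
                unsigned char *p = ptr;
                size_t loop = len;

                while(loop--)
                        {
                        *(p++) = cleanse_ctr;
                        /* last byte of p's in-memory representation:
                           the most significant byte of the pointer on
                           little-endian machines, the least
                           significant one on big-endian machines */
                        cleanse_ctr += (17 +
                          ((unsigned char *)&p)[sizeof(unsigned char *)-1]);
                        }
                }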

On little-endians it is

        while(loop--)
                {
                *(p++) = cleanse_ctr;
                cleanse_ctr += 17 + (((unsigned int)p >> 24) & 0xFF);
                }

where ((unsigned int)p >> 24) & 0xFF, i.e. the most significant byte of
the pointer, most likely remains constant for the whole loop (the
buffer would have to cross a 16MB boundary for it to change).

On big-endians it is

        while(loop--)
                {
                *(p++) = cleanse_ctr;
                cleanse_ctr += 17 + ((unsigned int)p & 0xFF);
                }

> I seriously doubt it, but it's not impossible if a
> "cleanse" found its way into a really tight loop and the compilers are
> generating slow "cleanse" functions.

But does it have to be slower than necessary? Note that gcc takes
((unsigned char *)&p)[sizeof(unsigned char *)-1] literally: since &p is
taken, p ends up in memory and its last byte is re-read on every pass,
and the global cleanse_ctr likewise gets re-loaded and written back
each time. [On e.g. SPARC] that comes to 4 loads and 3 stores per loop
iteration, which is 4 loads and 2 stores more than necessary (see
above).
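
To be concrete, here is a minimal sketch of the kind of loop I have in
mind: the counter lives in a local, the increment is derived
arithmetically from the pointer value (as in the rewrites above), and
the global is written back once per call. It is an illustration, not a
proposed patch, and the exact increment formula is of course
negotiable:

        #include <stddef.h>

        unsigned char cleanse_ctr = 0;

        void OPENSSL_cleanse(void *ptr, size_t len)
                {
                unsigned char *p = ptr;
                size_t loop = len;
                unsigned char ctr = cleanse_ctr; /* counter stays in a register */

                while(loop--)
                        {
                        *(p++) = ctr;
                        /* derive the increment from the pointer value
                           itself: no spilling of p, and the result
                           does not depend on byte order */
                        ctr += 17 + ((size_t)p & 0xFF);
                        }
                cleanse_ctr = ctr; /* one store of the global per call */
                }

That is one store per byte cleared, plus a single load and store of
cleanse_ctr per call.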

A.
