In crypto/md5/md5_dgst.c, there is a lot of code like the following:

        /* Round 0 */
        R0(A,B,C,D,X[ 0], 7,0xd76aa478L);
        R0(D,A,B,C,X[ 1],12,0xe8c7b756L);
        R0(C,D,A,B,X[ 2],17,0x242070dbL);
        R0(B,C,D,A,X[ 3],22,0xc1bdceeeL);
...

This compiles to the following on an ARM processor (gcc 2.7.2, 2.9.5):

00000588 <.L100>:
     588:       e5970000        ldr     r0, [r7]
     58c:       e028300a        eor     r3, r8, r10
     590:       e003300b        and     r3, r3, r11
     594:       e0892000        add     r2, r9, r0
     598:       e023300a        eor     r3, r3, r10
     59c:       e0822003        add     r2, r2, r3
     5a0:       e24295a2        sub     r9, r2, #679477248      ; 0x28800000
     5a4:       e2499955        sub     r9, r9, #1392640        ; 0x154000
     5a8:       e2499d6e        sub     r9, r9, #7040   ; 0x1b80
     5ac:       e2499008        sub     r9, r9, #8      ; 0x8
     5b0:       e1a09ce9        mov     r9, r9, ror #25


This assembly is for the first R0, with the following defines:

#define ROTATE(a,n)     (((a)<<(n))|(((a)&0xffffffff)>>(32-(n))))

#define F(b,c,d)        ((((c) ^ (d)) & (b)) ^ (d))
#define G(b,c,d)        ((((b) ^ (c)) & (d)) ^ (c))
#define H(b,c,d)        ((b) ^ (c) ^ (d))
#define I(b,c,d)        (((~(d)) | (b)) ^ (c))

#define R0(a,b,c,d,k,s,t) { \
        a+=((k)+(t)+F((b),(c),(d))); \
        a=ROTATE(a,s); \
        a+=b; };\

The rotate comes out well.  The left rotate by 7 has been translated
to a single instruction, a right rotate by 25:
     5b0:       e1a09ce9        mov     r9, r9, ror #25
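
To see why a single "ror #25" is enough: rotating a 32-bit word left by
7 is the same as rotating it right by 32 - 7 = 25.  A minimal C sketch
of that equivalence is below.  The names rotl32 and rotr32 are
illustrative helpers, not functions from OpenSSL.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (0 < n < 32), like ROTATE(a,n). */
static uint32_t rotl32(uint32_t a, unsigned n)
{
    assert(n > 0 && n < 32);
    return (a << n) | (a >> (32 - n));
}

/* Rotate right by n bits, like the ARM "ror #n" shifter operand. */
static uint32_t rotr32(uint32_t a, unsigned n)
{
    assert(n > 0 && n < 32);
    return (a >> n) | (a << (32 - n));
}
```

For any 32-bit x, rotl32(x, 7) and rotr32(x, 25) give the same result,
so the compiler is free to pick whichever form the target supports.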

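The rest of the assembly is quite good as well.  However, the ARM
suffers with constants: an immediate operand can only hold an 8 bit
value (rotated into position), so a full 32 bit constant takes several
instructions.  Adding 0xd76aa478 gets translated to (well, at least
according to me),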

     5a0:       e24295a2        sub     r9, r2, #679477248      ; 0x28800000
     5a4:       e2499955        sub     r9, r9, #1392640        ; 0x154000
     5a8:       e2499d6e        sub     r9, r9, #7040   ; 0x1b80
     5ac:       e2499008        sub     r9, r9, #8      ; 0x8
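
The arithmetic can be checked in C.  The sketch below assumes the usual
ARM rule that a data-processing immediate is an 8 bit value rotated
right by an even amount.  The names arm_imm_encodable,
sub_in_four_steps and check_constant_split are hypothetical and used
only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if v is an 8-bit value rotated right by an even amount,
 * which is the assumed form of an ARM data-processing immediate. */
static int arm_imm_encodable(uint32_t v)
{
    unsigned rot;

    for (rot = 0; rot < 32; rot += 2) {
        /* Rotating left by rot undoes a right rotation by rot. */
        uint32_t undone = rot ? ((v << rot) | (v >> (32 - rot))) : v;
        if (undone <= 0xffu)
            return 1;
    }
    return 0;
}

/* Apply the four subtractions from the listing to x (mod 2^32). */
static uint32_t sub_in_four_steps(uint32_t x)
{
    x -= 0x28800000u;
    x -= 0x00154000u;
    x -= 0x00001b80u;
    x -= 0x00000008u;
    return x;
}

/* Each piece fits one instruction; the whole constant does not. */
static void check_constant_split(void)
{
    assert(arm_imm_encodable(0x28800000u));
    assert(arm_imm_encodable(0x00154000u));
    assert(arm_imm_encodable(0x00001b80u));
    assert(arm_imm_encodable(0x00000008u));
    assert(!arm_imm_encodable(0xd76aa478u));

    /* Subtracting 0x28955b88 equals adding 0xd76aa478 mod 2^32. */
    assert(sub_in_four_steps(0) == 0xd76aa478u);
}
```

The four pieces sum to 0x28955b88, and 0x28955b88 + 0xd76aa478 is
exactly 2^32, which is why four subtractions stand in for one addition.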

I know that gcc would produce better code if the hash constants were
stored in a static const array.  A pointer could then move along and
retrieve the constants.  This would also save space (and possibly
time) on most architectures that I know of.  The same array can be
shared by the two md5 functions:

   void md5_block_host_order (MD5_CTX *c, const void *data, int num);
   void md5_block_data_order (MD5_CTX *c, const void *data_, int num);
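
A rough sketch of the table idea, limited to the first four steps of
round 0, might look like the following.  This is only an illustration
under assumptions: the table name md5_k, the STEP0 macro and the three
functions are hypothetical and are not the existing OpenSSL code.

```c
#include <assert.h>
#include <stdint.h>

#define ROTATE(a,n) (((a) << (n)) | ((a) >> (32 - (n))))
#define F(b,c,d)    ((((c) ^ (d)) & (b)) ^ (d))

/* Same step as R0, written for uint32_t operands. */
#define STEP0(a,b,c,d,k,s,t) do { \
        a += (k) + (t) + F((b),(c),(d)); \
        a = ROTATE(a, s); \
        a += b; } while (0)

/* Hypothetical shared table: the first four round 0 constants. */
static const uint32_t md5_k[4] = {
    0xd76aa478u, 0xe8c7b756u, 0x242070dbu, 0xc1bdceeeu
};

/* First four steps with the constants written inline. */
static void first4_inline(uint32_t st[4], const uint32_t X[4])
{
    uint32_t A = st[0], B = st[1], C = st[2], D = st[3];

    STEP0(A,B,C,D,X[0], 7,0xd76aa478u);
    STEP0(D,A,B,C,X[1],12,0xe8c7b756u);
    STEP0(C,D,A,B,X[2],17,0x242070dbu);
    STEP0(B,C,D,A,X[3],22,0xc1bdceeeu);

    st[0] = A; st[1] = B; st[2] = C; st[3] = D;
}

/* The same steps, fetching each constant through a moving pointer. */
static void first4_table(uint32_t st[4], const uint32_t X[4])
{
    const uint32_t *kp = md5_k;
    uint32_t A = st[0], B = st[1], C = st[2], D = st[3];

    STEP0(A,B,C,D,X[0], 7,*kp++);
    STEP0(D,A,B,C,X[1],12,*kp++);
    STEP0(C,D,A,B,X[2],17,*kp++);
    STEP0(B,C,D,A,X[3],22,*kp++);

    st[0] = A; st[1] = B; st[2] = C; st[3] = D;
}

/* Both versions must leave the state identical. */
static void check_first4(void)
{
    const uint32_t X[4] = { 1, 2, 3, 4 };  /* arbitrary message words */
    uint32_t s1[4] = { 0x67452301u, 0xefcdab89u, 0x98badcfeu, 0x10325476u };
    uint32_t s2[4] = { 0x67452301u, 0xefcdab89u, 0x98badcfeu, 0x10325476u };
    int i;

    first4_inline(s1, X);
    first4_table(s2, X);
    for (i = 0; i < 4; i++)
        assert(s1[i] == s2[i]);
}
```

Whether the pointer version is actually faster depends on the target
and compiler, so it would need to be measured before changing anything.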

... This seems too good when I tell the story.  What harsh part of
reality comes and messes things up?  The other assembler versions of
the same macros?  I can implement an ARM version that uses a constant
load like this "ldr %3,=0xd76aa478".  But this makes the toolchain put
the constants willy-nilly, and cache effects wouldn't work as well as
with an array.

Regards,
Bill Pringlemeir.


______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [EMAIL PROTECTED]
Automated List Manager                           [EMAIL PROTECTED]
