Bill Pringlemeir wrote:
>
> In crypto/md5/md5_dgst.c, there is lots of code as follows,
>
>         /* Round 0 */
>         R0(A,B,C,D,X[ 0], 7,0xd76aa478L);
>         R0(D,A,B,C,X[ 1],12,0xe8c7b756L);
>         R0(C,D,A,B,X[ 2],17,0x242070dbL);
>         R0(B,C,D,A,X[ 3],22,0xc1bdceeeL);
>         ...
>
> This expands to the following on an ARM processor (gcc 2.7.2, 2.9.5)
>
> 00000588 <.L100>:
>  588:   e5970000        ldr     r0, [r7]
>  58c:   e028300a        eor     r3, r8, r10
>  590:   e003300b        and     r3, r3, r11
>  594:   e0892000        add     r2, r9, r0
>  598:   e023300a        eor     r3, r3, r10
>  59c:   e0822003        add     r2, r2, r3
>  5a0:   e24295a2        sub     r9, r2, #679477248      ; 0x28800000
>  5a4:   e2499955        sub     r9, r9, #1392640        ; 0x154000
>  5a8:   e2499d6e        sub     r9, r9, #7040   ; 0x1b80
>  5ac:   e2499008        sub     r9, r9, #8      ; 0x8
>  5b0:   e1a09ce9        mov     r9, r9, ror #25
>
> This assembler is for the first R0 with the following defines,
>
> #define ROTATE(a,n)     (((a)<<(n))|(((a)&0xffffffff)>>(32-(n))))
>
> #define F(b,c,d)        ((((c) ^ (d)) & (b)) ^ (d))
> #define G(b,c,d)        ((((b) ^ (c)) & (d)) ^ (c))
> #define H(b,c,d)        ((b) ^ (c) ^ (d))
> #define I(b,c,d)        (((~(d)) | (b)) ^ (c))
>
> #define R0(a,b,c,d,k,s,t) { \
>         a+=((k)+(t)+F((b),(c),(d))); \
>         a=ROTATE(a,s); \
>         a+=b; };\
>
> Things are going great with the rotate.  It has been translated to this
> line,
>  5b0:   e1a09ce9        mov     r9, r9, ror #25
>
> The other assembler is quite good as well.  However, the ARM suffers
> with 8 bit constants.  The value 0xd76aa478 gets translated to (well,
> at least according to me),
>
>  5a0:   e24295a2        sub     r9, r2, #679477248      ; 0x28800000
>  5a4:   e2499955        sub     r9, r9, #1392640        ; 0x154000
>  5a8:   e2499d6e        sub     r9, r9, #7040   ; 0x1b80
>  5ac:   e2499008        sub     r9, r9, #8      ; 0x8
>
> I know that gcc would produce better code if the hash constants were
> stored in a static const array.  A pointer could then move along and
> retrieve the constants.  This would also save space (and time??) on
> most architectures that I know.  The same array can be shared with the
> two md5 functions.
>
> void md5_block_host_order (MD5_CTX *c, const void *data, int num);
> void md5_block_data_order (MD5_CTX *c, const void *data_, int num);
>
> ... This seems too good when I tell the story.  What harsh part of
> reality comes and messes things up?  The other assembler versions of
> the same macros?  I can implement ARM versions that use a constant load
> like this "ldr %3,=0xd76aa478".  But this makes the compiler put the
> constants willy-nilly and cache effects wouldn't work as well as with
> an array.
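
For readers who want to try the idea, here is a rough sketch of the table
approach described above. It is not OpenSSL code. The names md5_k, R0_TAB,
round0_inline, round0_table and sub_immediates_as_addend are made up for
illustration, and only the four round-0 constants quoted in the post are
included. It also checks that the four SUB immediates in the disassembly
add up to the two's complement of 0xd76aa478, so subtracting them is the
same as adding the constant modulo 2^32.

```c
#include <stdint.h>
#include <assert.h>

/* ROTATE, F and R0 as quoted in the post (stray ";\" after R0 dropped). */
#define ROTATE(a,n)     (((a)<<(n))|(((a)&0xffffffff)>>(32-(n))))
#define F(b,c,d)        ((((c) ^ (d)) & (b)) ^ (d))
#define R0(a,b,c,d,k,s,t) { \
        a+=((k)+(t)+F((b),(c),(d))); \
        a=ROTATE(a,s); \
        a+=b; }

/* Hypothetical shared table: only the four constants shown in the post. */
static const uint32_t md5_k[4] = {
    0xd76aa478u, 0xe8c7b756u, 0x242070dbu, 0xc1bdceeeu
};

/* Hypothetical variant of R0: the constant is read through a pointer
 * that then moves along the table. */
#define R0_TAB(a,b,c,d,k,s,tp) { \
        a+=((k)+(*(tp)++)+F((b),(c),(d))); \
        a=ROTATE(a,s); \
        a+=b; }

/* First four round-0 steps with the constants written inline. */
static void round0_inline(uint32_t st[4], const uint32_t X[4])
{
    uint32_t A = st[0], B = st[1], C = st[2], D = st[3];

    R0(A,B,C,D,X[ 0], 7,0xd76aa478u);
    R0(D,A,B,C,X[ 1],12,0xe8c7b756u);
    R0(C,D,A,B,X[ 2],17,0x242070dbu);
    R0(B,C,D,A,X[ 3],22,0xc1bdceeeu);

    st[0] = A; st[1] = B; st[2] = C; st[3] = D;
}

/* The same four steps with the constants fetched from md5_k. */
static void round0_table(uint32_t st[4], const uint32_t X[4])
{
    uint32_t A = st[0], B = st[1], C = st[2], D = st[3];
    const uint32_t *t = md5_k;

    R0_TAB(A,B,C,D,X[ 0], 7,t);
    R0_TAB(D,A,B,C,X[ 1],12,t);
    R0_TAB(C,D,A,B,X[ 2],17,t);
    R0_TAB(B,C,D,A,X[ 3],22,t);

    st[0] = A; st[1] = B; st[2] = C; st[3] = D;
}

/* Sum of the four SUB immediates from the disassembly, negated mod 2^32.
 * This is the value the SUB sequence effectively adds. */
static uint32_t sub_immediates_as_addend(void)
{
    uint32_t sum = 0x28800000u + 0x154000u + 0x1b80u + 0x8u;

    assert(sum == 0x28955b88u);
    return 0u - sum;
}
```

Both functions should leave the same state for the same input, so any
difference between them is purely in the generated code. Whether the
table form is actually faster on a given ARM core is exactly what a
benchmark has to settle.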

The short answer is: benchmark it. If it works better, then it works
better ;-)

"openssl speed md5" is your friend.

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [EMAIL PROTECTED]
Automated List Manager                           [EMAIL PROTECTED]