BTW, 272MBps at 3.6GHz? I get 262MBps out of the [as just mentioned, virtually identical] 32-bit code on a 2.4GHz P4...

> In fact, your implementation on EM64T isn't that slow if we change the inc and dec to add and sub. :)
>
> With that change the throughput rises from 272MB/s to 396MB/s.

Huh? And what if you replace inc/add with lea 1(%reg),%reg to eliminate even the possibility of contention for %eflags?


> I have not investigated the 32-bit P4 path yet, but you should see a performance gain on P4 with this change.

I see >10% slow-down, even on Prescott core...

A.
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List
