Re: RC4 optimize for em64t

2005-04-06 Thread Zou Nan hai
On Wed, 2005-04-06 at 08:08, Zou Nan hai wrote: On Tue, 2005-04-05 at 18:17, Andy Polyakov wrote: Current OpenSSL (0.9.8-dev) rc4speed throughput on a Nocona (EM64T, 64-bit) 3.6GHz is 272Mb/s, while this version of RC4 code can achieve 536Mb/s in rc4speed. Would you please

Re: RC4 optimize for em64t

2005-04-06 Thread Andy Polyakov
BTW, 272MBps at 3.6GHz? I get 262MBps out of [as just mentioned virtually identical] 32-bit code at 2.4GHz P4... In fact, your implementation on EM64T isn't that slow if we change the inc and dec to add and sub. :) With that change the throughput rises from 272Mb/s to 396Mb/s. Huh? And

OpenSSL use of DCLP may not be thread-safe on multiple processors

2005-04-06 Thread Steven Reddie
Hi All, OpenSSL makes use of the DCLP (double-checked locking pattern) in a number of places (rsa_eay.c and at least one engine; I haven't done an exhaustive search), with code that usually looks like this: if (x == NULL) { CRYPTO_w_lock(CRYPTO_LOCK_XXX); /* Avoid a
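
For readers skimming the thread, the pattern being described can be sketched roughly as follows. This is only an illustration under assumptions: it uses a POSIX mutex in place of OpenSSL's CRYPTO_w_lock/CRYPTO_w_unlock, and the names init_lock, shared_obj and get_shared_dclp are hypothetical, not taken from rsa_eay.c. The first test reads the pointer without holding the lock, which is the step the thread questions on processors that reorder memory accesses.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; OpenSSL uses its own
 * CRYPTO_w_lock()/CRYPTO_w_unlock() wrappers instead. */
static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;
static int *shared_obj = NULL;

/* Double-checked locking as discussed in the thread.
 * NOTE: the first test reads shared_obj with no lock and no memory
 * barrier, so another processor may see the pointer before it sees
 * the initialised contents. Shown to illustrate the hazard only. */
int *get_shared_dclp(void)
{
    if (shared_obj == NULL) {               /* 1st test: unlocked */
        pthread_mutex_lock(&init_lock);
        if (shared_obj == NULL) {           /* 2nd test: under the lock */
            int *tmp = malloc(sizeof *tmp);
            if (tmp != NULL)
                *tmp = 42;                  /* initialise the object */
            shared_obj = tmp;               /* publish the pointer */
        }
        pthread_mutex_unlock(&init_lock);
    }
    return shared_obj;
}
```

Run on a single thread this behaves as expected; the concern raised in the thread only shows up when several processors race through the unlocked first test.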

Re: RC4 optimize for em64t

2005-04-06 Thread Andy Polyakov
Or how about moving movzb (%rdi,%r10),%r8d upwards as movzb (%rdi,%r10),%r14b and make inter-register move between r8 and r14 conditional? I will try it. I have tried it, no performance gain. Does it mean that it's the same or does it mean that it's slower? Was it cmov or was it jump over mov

Re: [openssl.org #1034] bug report (and fix): PKCS12_parse returns incorrect cert

2005-04-06 Thread Paul V Ford-Hutchinson via RT
OK, I'd like to report this as a bug to the IBM ikeyman folks. However, when I look at PKCS#12 v1 (http://www.rsasecurity.com/rsalabs/node.asp?id=2138) I don't see any discussion of this limitation of the localKeyID field. Is there a newer spec I should be looking at? BTW - the link on your

RE: RC4 optimize for em64t

2005-04-06 Thread Zou, Nanhai
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andy Polyakov Sent: Wednesday, April 06, 2005 5:34 PM To: openssl-dev@openssl.org Subject: Re: RC4 optimize for em64t Or how about moving movzb (%rdi,%r10),%r8d upwards as movzb (%rdi,%r10),%r14b

RE: OpenSSL use of DCLP may not be thread-safe on multiple processors

2005-04-06 Thread David C. Partridge
oops ... First test should of course read: Singleton* Singleton::instance() { if (!initialised) // 1st test -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of David C. Partridge Sent: 06 April 2005 14:08 To: openssl-dev@openssl.org Cc: [EMAIL PROTECTED]

RE: OpenSSL use of DCLP may not be thread-safe on multiple processors

2005-04-06 Thread Steven Reddie
Hi David, I know that Scott is very busy at the moment, so he may not respond. I'll drop his address on the next reply. The implementations below suffer from the same general problem as the implementations I've been playing around with recently. On processors that reorder memory accesses it is

RE: OpenSSL use of DCLP may not be thread-safe on multiple processors

2005-04-06 Thread David C. Partridge
ARGH! Are you absolutely sure that this is the case - that's scary - I thought that the whole issue of SMP cache coherency and write order was solved years ago. I mean that if the order of memory write visibility between processors can't be guaranteed, then a whole lot MORE than just DCLP

RE: OpenSSL use of DCLP may not be thread-safe on multiple processors

2005-04-06 Thread Steven Reddie
Check out A Formal Specification of Intel Itanium Processor Family Memory Ordering (http://www.intel.com/design/itanium/downloads/25142901.pdf). It describes in excruciating detail how reordering of memory operations can be observed by other processors. Example A.1 (in Appendix A) is a simple

p2q

2005-04-06 Thread Marius Schilder
Anyone ever contemplated coding openssl support for p2q 'rsa' moduli with Hensel lifting? Or does this already live in the codebase somewhere? marius __ OpenSSL Project http://www.openssl.org

RE: OpenSSL use of DCLP may not be thread-safe on multiple processors

2005-04-06 Thread David Schwartz
I mean that if the order of memory write visibility between processors can't be guaranteed, then a whole lot MORE than just DCLP crashes and burns ... How in that case can anyone write safe MP code? D. The only correct and safe way to do it is with mutexes or their equivalent. DCLP
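
As a rough sketch of the mutex-only approach recommended here, the getter below takes the lock on every call, so no thread ever reads the shared pointer outside the critical section. It assumes POSIX threads; table_lock, table and get_table_locked are hypothetical names chosen for the example, not OpenSSL identifiers.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical names for illustration only. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static int *table = NULL;

/* Lazy initialisation with the lock held for both the test and the
 * read of the pointer. The mutex lock/unlock pair supplies the memory
 * ordering that the unlocked first test of DCLP lacks. */
int *get_table_locked(void)
{
    int *result;

    pthread_mutex_lock(&table_lock);
    if (table == NULL) {
        table = malloc(sizeof *table);
        if (table != NULL)
            *table = 7;                     /* initialise the object */
    }
    result = table;                         /* read while still locked */
    pthread_mutex_unlock(&table_lock);

    return result;
}
```

The cost is one lock acquisition per call, which is the overhead DCLP was trying to avoid; pthread_once() is another standard POSIX facility for one-time initialisation.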

Certificate date validation

2005-04-06 Thread Bommareddy, Satish (Satish)
How do I check to see how many days are left for the validity of a certificate? Is there an openssl command which tells me the days or time left? X509_cmp_current_time returns a positive integer if a certificate is still valid? What does this signify? Is there a way to convert this to

[openssl.org #1039]

2005-04-06 Thread via RT
__ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]

Re: OpenSSL use of DCLP may not be thread-safe on multiple processors

2005-04-06 Thread Andrew Mann
David Schwartz wrote: I mean that if the order of memory write visibility between processors can't be guaranteed, then a whole lot MORE than just DCLP crashes and burns ... How in that case can anyone write safe MP code? D. The only correct and safe way to do it is with mutexes or their

[openssl.org #1040] ctrls of type NO_INPUT don't work

2005-04-06 Thread via RT
Please see proposed patch for crypto/engine/eng_cnf.c.

RE: OpenSSL use of DCLP may not be thread-safe on multiple processors

2005-04-06 Thread David Schwartz
Since acquiring the mutex is already on the 'slow' track, couldn't you just acquire a second (pointless) mutex inside the first around only the 'initialized=1;' assignment? If mutexes resolve the initial situation then they must be implemented with a memory fence (in the Itanium model), and