RE: 0.9.8: cfb_enc.c bug? and AES speed on Win64/x64
Instead of doing what was intended, moving the string up one place, the code has different behaviour.

Yes, it will fill the buffer with H, which is what I would expect to happen: not immediately obvious, but sensible (any 370 assembler guys will recognise MVC as doing this). If you want to copy from one memory location to another even if they overlap *and* preserve the contents, then you should use memmove and pay the overhead of the temporary buffer it probably allocates.

Dave
__
OpenSSL Project http://www.openssl.org
Development Mailing List openssl-dev@openssl.org
Automated List Manager [EMAIL PROTECTED]
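The behaviour described above can be reproduced with a small sketch. The helper names (forward_copy, shift_up_naive, shift_up_memmove) are made up for illustration and are not taken from cfb_enc.c. The naive loop copies byte by byte from low to high addresses, so with an overlap of one byte the first character is propagated through the buffer; memmove is specified to behave as if the source were copied first. As far as I know, many C libraries achieve that by choosing the copy direction rather than allocating a temporary buffer, but that is an implementation detail.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: plain forward byte copy, low address to high. */
static void forward_copy(char *dst, const char *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Shift len bytes up one place with the naive copy.
 * buf must have room for len + 1 bytes. Each byte written is the
 * byte read on the next iteration, so buf[0] is smeared upwards. */
void shift_up_naive(char *buf, size_t len)
{
    forward_copy(buf + 1, buf, len);
}

/* Same shift with memmove, which is defined for overlapping regions. */
void shift_up_memmove(char *buf, size_t len)
{
    memmove(buf + 1, buf, len);
}
```

Starting from an 8-byte buffer holding "HELLO", the naive version leaves six H characters, while the memmove version leaves "HHELLO".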
AES CTR mode implementation
I've been looking at the AES CTR mode implementation in 0.9.7.

The counter increment function blindly assumes that the counter value can be incremented across the whole 128 bits of the counter block.

If you look at (e.g.) RFC3686 or the NIST 800-38A publication, then they both envisage a counter block that incorporates a nonce and a block counter.

e.g. RFC 3686 specifies a counter block like:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Nonce                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Initialization Vector (IV)                   |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Block Counter                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

then when the low order 32 bits overflow, the IV value will be overwritten in the current implementation.

Shouldn't the AES CTR mode operation specify the number of bits to be used for the block counter and keep track to ensure that no more than 2^(block counter bits) blocks are encrypted for this session?

I've not had any chance to look at the 0.9.8 code yet, so apologies if this is fixed in the new release.

Regards,
David C. Partridge
Technical Products Director
Primeur Security Services
Tel: +44 (0)1926 511058
Mobile: +44 (0)7713 880197
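To make the concern concrete, here is a minimal sketch of the two increment styles. Both function names are invented for this illustration and neither is copied from the OpenSSL source; the first mimics the behaviour described above (the whole 16-byte block treated as one big-endian integer), the second confines the increment to the last four bytes as in the RFC 3686 layout.

```c
/* Illustrative only: increment all 128 bits as one big-endian
 * integer. A carry out of the low 32 bits runs into the IV bytes. */
void ctr_inc_full128(unsigned char block[16])
{
    int i;
    for (i = 15; i >= 0; i--) {
        if (++block[i] != 0)
            break;              /* no carry, done */
    }
}

/* Illustrative only: increment just the 32-bit block counter in
 * bytes 12..15. A carry out of byte 12 is dropped, so the nonce
 * and IV (bytes 0..11) are never modified. */
void ctr_inc_low32(unsigned char block[16])
{
    int i;
    for (i = 15; i >= 12; i--) {
        if (++block[i] != 0)
            break;
    }
}
```

With a block counter of 0xFFFFFFFF, the first function changes the last IV byte while the second leaves bytes 0 to 11 alone and simply wraps the counter to zero.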
RE: AES CTR mode implementation
> 800-38A essentially says it's up to the implementer, doesn't it? The
> standard incrementing function can apply either to an entire block or
> to a part of a block.

Hmmm, OK, I do see your point here. I was sure I'd seen a discussion on the net about this saying that it was dangerous to (e.g.) start the counter at zero, and that a nonce should be built in, and that this part should remain constant. But now that I've gone searching for it again I can't find it :-(

I wonder why RFC3686 goes to the lengths it does to specify such a complex counter block with only the low order 32 bits being incremented?

Dave

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Andy Polyakov
Sent: 08 July 2005 13:23
To: openssl-dev@openssl.org
Subject: Re: AES CTR mode implementation

> The counter increment function blindly assumes that the counter value
> can be incremented across the whole 128 bits of the counter block.

Correct, which is why it's called AES_ctr128_*.

> If you look at (e.g.) RFC3686 or the NIST 800-38A publication, then
> they both envisage a counter block that incorporates a nonce and a
> block counter.

800-38A essentially says it's up to the implementer, doesn't it? The standard incrementing function can apply either to an entire block or to a part of a block.

> e.g. RFC 3686 specifies a counter block like:
>
>     0                   1                   2                   3
>     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>    |                            Nonce                              |
>    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>    |                  Initialization Vector (IV)                   |
>    |                                                               |
>    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>    |                         Block Counter                         |
>    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>
> then when the low order 32 bits overflow, the IV value will be
> overwritten in the current implementation.
>
> Shouldn't the AES CTR mode operation specify the number of bits to be
> used for the block counter and keep track to ensure that no more than
> 2^(block counter bits) blocks are encrypted for this session?

One can discuss additional function[s], AES_ctr_ipsec perhaps or AES_ctr_variable, which would provide for this, but it would be inappropriate to modify AES_ctr128_*. In other words it's not a matter of fixing present code, but of extending functionality with new code. Is there broader interest in an ipsec-specific function than in a variable one? BTW, I have an AES_CCM_ipsec implementation pending.

A.
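As a rough idea of what the increment step of a variable-width function could look like, here is a sketch. The name ctr_inc_bits and its return convention are hypothetical; nothing here is an existing or proposed OpenSSL interface. The caller states how many low-order bits form the block counter, and the helper reports wrap-around so the caller can stop before a counter block is reused.

```c
/* Hypothetical helper: increment the low counter_bits bits of a
 * 16-byte big-endian counter block, leaving all higher bits alone.
 * Returns 0 normally, 1 if the counter field wrapped to zero,
 * and -1 if counter_bits is outside 1..128. */
int ctr_inc_bits(unsigned char block[16], unsigned int counter_bits)
{
    unsigned int remaining = counter_bits;
    int i = 15;

    if (counter_bits == 0 || counter_bits > 128)
        return -1;

    while (remaining > 0) {
        /* Number of counter bits that live in byte i. */
        unsigned int take = remaining < 8 ? remaining : 8;
        unsigned char mask = (unsigned char)((1u << take) - 1u);
        unsigned char field = (unsigned char)((block[i] + 1u) & mask);

        /* Keep the non-counter bits of this byte untouched. */
        block[i] = (unsigned char)((block[i] & ~mask) | field);
        if (field != 0)
            return 0;           /* no carry into the next byte */
        remaining -= take;
        i--;
    }
    return 1;                   /* every counter bit rolled over */
}
```

A caller would treat a return value of 1 as "the 2^(block counter bits) limit for this nonce has been reached" and refuse to encrypt further blocks under the same key and nonce.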
RE: [openssl.org #1096] Minor documentation bugs
The problem with pthread_self() is that the value it returns is defined to be opaque, and isn't necessarily (e.g.) an unsigned long (32 bit), though many Unix and Unix-like systems do use a 32 bit value ...

Dave
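Because pthread_t is opaque, the portable way to compare thread identities is pthread_equal() rather than a cast to an integer type. A minimal sketch follows; the wrapper name is_current_thread is made up for this example.

```c
#include <pthread.h>

/* Hypothetical wrapper: report whether 'other' identifies the calling
 * thread. pthread_equal() is the POSIX-defined comparison; using ==
 * or casting pthread_t to unsigned long is not guaranteed to work,
 * since pthread_t may be a pointer or a structure on some systems. */
int is_current_thread(pthread_t other)
{
    return pthread_equal(pthread_self(), other) != 0;
}
```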
RE: OpenSSL use of DCLP may not be thread-safe on multiple processors
Thanks all. It strikes me that the H/W designers have played a bit fast and loose with the cache consistency issue here - I believe I understand the C/C++ optimisation issues, and these CAN be worked around (IMHO) within the rules of the standard by using bool in some cases. However I've notified our dev folks to remove the few cases where we've used this technique as it is certainly dangerous.

Dave
RE: OpenSSL use of DCLP may not be thread-safe on multiple processors
oops ... First test should of course read:

    Singleton* Singleton::instance()
    {
        if (!initialised) // 1st test

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of David C. Partridge
Sent: 06 April 2005 14:08
To: openssl-dev@openssl.org
Cc: [EMAIL PROTECTED]
Subject: RE: OpenSSL use of DCLP may not be thread-safe on multiple processors

I've just read the paper, and I believe that the following variation on the code would work and would avoid the MP unsafe issues raised because bool is defined to be a single byte. Furthermore, I'm pretty certain that it also resolves the issues with the order of construction and setting of the pointer in the singleton case, and probably resolves all the other over smart optimisation issues as well.

    static volatile bool initialised=false;

    if (!initialised)
    {
        CRYPTO_w_lock(CRYPTO_LOCK_XXX);
        /* Avoid a race condition by checking again inside this lock */
        if (!initialised)
        {
            x = ...;
            initialised=true; // Atomic operation
        }
        CRYPTO_w_unlock(CRYPTO_LOCK_XXX);
    }
    /* Now, make use of x */

Or expressed in terms of the Singleton pattern:

in the header for the Singleton class file:

    static volatile bool initialised;

in the Source file:

    static volatile bool Singleton::initialised=false;

    Singleton* Singleton::instance()
    {
        if (!initialised == 0) // 1st test
        {
            Lock lock;
            if (!initialised) // 2nd test
            {
                pInstance = new Singleton;
                initialised=true; // Atomic
            }
        }
        return pInstance;
    }

I've been using this approach for absolutely YEARS, and didn't realise someone had honoured it with a design pattern name!!!

I've copied this to Scott Meyers for him to comment on whether I've got this right ...
Dave Partridge

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Steven Reddie
Sent: 06 April 2005 10:02
To: openssl-dev@openssl.org
Subject: OpenSSL use of DCLP may not be thread-safe on multiple processors

Hi All,

OpenSSL makes use of the DCLP (double-checked locking pattern) in a number of places (rsa_eay.c and at least one engine; I haven't done an exhaustive search), with code that usually looks like this:

    if (x == NULL)
    {
        CRYPTO_w_lock(CRYPTO_LOCK_XXX);
        /* Avoid a race condition by checking again inside this lock */
        if (x == NULL)
        {
            x = ...;
        }
        CRYPTO_w_unlock(CRYPTO_LOCK_XXX);
    }
    /* Now, make use of x */

Some recent research I've done in this area, prompted by Scott Meyers' and Andrei Alexandrescu's article "C++ and the Perils of Double-Checked Locking" at http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf, makes me wonder whether this code is thread-safe on multi-processor machines. As the article points out, DCLP is dangerous in general; however, it is most likely safe if the thing being tested and set is accessed atomically. On most 32-bit machines a 32-bit quantity will generally be accessed in a single bus transaction, making it inherently atomic. However, there may be cases where it is not atomic. An example could be on a machine that allows unaligned accesses, such as the x86. It may be possible for half of the value to be updated in another processor's cache, and used (since the value is therefore not NULL), before the other half is updated. It seems that in fact the race condition that is trying to be avoided may have been reduced rather than eliminated. While it may be true that the code generated by the compiler doesn't typically result in unaligned accesses, it is still a possibility that exists, and there may be other ways for non-atomic access to occur without unalignment being the cause.
I've tried some elaborate workarounds to maintain the optimisation that DCLP provides, but they turn out to be not entirely safe on other processors such as the Itanium. The easiest way to fix this would seem to be always obtaining the lock before using the variables in question, but this could have an impact on performance. A more involved alternative is to use locked instructions, such as the Interlocked... functions on Windows, and some hand-rolled assembler on other platforms, to ensure that the values are updated atomically.

I'm not offering patches at this point in case there is too much resistance to a performance hit, so I'm interested to know thoughts either way. I agree that the margin for error is very, very small, and I don't know how much of an impact on performance the necessary changes would have, so I'm partly sending this so that if nothing is done and a future race-condition is reported it may assist with locating the problem.

Regards,

Steven
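For comparison, the same double-checked shape can be written with explicit memory ordering. The sketch below uses C11 <stdatomic.h> and a pthread mutex, which is a substitution on my part: C11 atomics did not exist when this thread was written, and they stand in here for the Interlocked... functions or hand-rolled assembler mentioned above. The names get_x, shared_x, init_lock and storage are invented for the example. The release store publishes the pointer only after the object is fully initialised, and the acquire load on the fast path ensures a reader that sees the pointer also sees the initialised contents.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* All names below are hypothetical and only illustrate the pattern. */
static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;
static _Atomic(int *) shared_x = NULL;
static int storage;

int *get_x(void)
{
    /* 1st test: acquire pairs with the release store below. */
    int *p = atomic_load_explicit(&shared_x, memory_order_acquire);

    if (p == NULL) {
        pthread_mutex_lock(&init_lock);
        /* 2nd test: the mutex already orders this load. */
        p = atomic_load_explicit(&shared_x, memory_order_relaxed);
        if (p == NULL) {
            storage = 42;       /* finish initialising first ... */
            p = &storage;
            /* ... then publish the pointer. */
            atomic_store_explicit(&shared_x, p, memory_order_release);
        }
        pthread_mutex_unlock(&init_lock);
    }
    return p;                   /* now safe to make use of x */
}
```

The simpler alternative raised in the message, always taking the lock, remains correct everywhere; this version only keeps the lock off the fast path.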
RE: OpenSSL use of DCLP may not be thread-safe on multiple processors
ARGH! Are you absolutely sure that this is the case - that's scary - I thought that the whole issue of SMP cache coherency and write order was solved years ago. I mean that if the order of memory write visibility between processors can't be guaranteed, then a whole lot MORE than just DCLP crashes and burns ... How in that case can anyone write safe MP code?

D.
RE: query: Private Key generation using OpenSSL
Any random data that is shared with the recipient will do as a key for HMAC

Dave
RE: ENGINE issues
IIRC the Luna CA3 is FIPS 140-2 Level 3, which means it won't allow you under any circumstances to extract the private key from the device (non-extractable, sensitive in PKCS#11 parlance). What this means is that you need to send the data to the device to be signed (don't know how to do this using openssl), rather than extracting the key and using openssl to do the crypto in software.

Dave
RE: OS/2 support
Gosh, there's a blast from the past! I remember your name from *way* back when I used to work on OS/2. How are you?

Anyway, to the chase: IIRC you just need to patch to replace strncasecmp with strnicmp, and strcasecmp with stricmp (or do conditional compilation).

Dave Partridge

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of John Poltorak
Sent: 09 January 2005 11:47
To: openssl-dev@openssl.org
Subject: OS/2 support

Is there an OS/2 maintainer involved in developing OpenSSL?

The reason I ask is that up until v0.9.7c came out, it compiled out of the box. Since then it doesn't. The problem seems to have arisen since the introduction (or change) of ./crypto/o_str.c and results in these errors:

    tmp_dll\o_str.obj(o_str.obj) : error L2029: 'strncasecmp' : unresolved external
    tmp_dll\o_str.obj(o_str.obj) : error L2029: 'strcasecmp' : unresolved external

--
John
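The conditional compilation suggested in the reply could look something like the sketch below. The portable_* macro names are invented for this example, and the use of __OS2__ as the platform test, along with stricmp/strnicmp being declared in <string.h> there, is an assumption about the OS/2 toolchain in use rather than something confirmed in this thread. It is not the actual o_str.c change.

```c
#include <string.h>

#if defined(__OS2__)
/* Assumption: the OS/2 compiler defines __OS2__ and its C library
 * offers stricmp/strnicmp instead of the POSIX names. */
# define portable_strcasecmp(a, b)      stricmp((a), (b))
# define portable_strncasecmp(a, b, n)  strnicmp((a), (b), (n))
#else
# include <strings.h>   /* POSIX declares strcasecmp/strncasecmp here */
# define portable_strcasecmp(a, b)      strcasecmp((a), (b))
# define portable_strncasecmp(a, b, n)  strncasecmp((a), (b), (n))
#endif
```

Callers then use portable_strcasecmp() and portable_strncasecmp() everywhere, so only this one block needs to know about the platform difference.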
RE: [openssl.org #502] TXT_DB error number 2
The renaming of the serial file is a known bug. See my recent post to openssl-dev

Dave