The locks aren’t being held for long.

I have noticed that the lock in the OPENSSL_CTX only seems to be locked for 
reading and only in one place.  crypto/context.c in openssl_ctx_get_data().  
Writes seem to go via CRYPTO_alloc_ex_data() which uses a global write lock (on 
a different lock).  Would it be feasible to drop the OPENSSL_CTX lock 
completely?  This would address the issue.  Matt?  I think you know this code 
the best.

CRYPTO_alloc_ex_data() is calling the new_func pointer under the read lock and 
the new function can in turn cause a search of the providers for 
implementations which grabs the provider lock.  Conversely, searching the 
providers for implementations grabs the provider lock and can lead to 
openssl_ctx_get_data() grabbing the context lock.

We don’t appear to have recursion at the moment, but the setup seems a little 
fragile.  We do have two different ordering for grabbing locks which is also 
bad.


Pauli
-- 
Dr Paul Dale | Cryptographer | Network Security & Encryption 
Phone +61 7 3031 7217
Oracle Australia



> On 31 Jul 2019, at 2:10 pm, Viktor Dukhovni <openssl-us...@dukhovni.org> 
> wrote:
> 
>> On Jul 30, 2019, at 10:02 PM, Dr Paul Dale <paul.d...@oracle.com> wrote:
>> 
>> The #9454 description includes thread sanitisizer logs showing different 
>> lock orderings — this has the potential to dead lock.  Agreed with Rich that 
>> giving up the lock would make sense, but I don’t see a way for this to be 
>> easily done.
> 
> My take is that we should never hold any lock long enough to even consider
> acquiring another lock.  No more than one lock should be held at any one
> time, and only long enough to bump a reference count to ensure that the
> object of interest is no deallocated before we (or our caller in a "get1"
> type interface) is done with it.
> 
> I don't know what "long-term" locks we're holding, but it would be great
> if it were possible to never (or never recursively) hold any such locks.
> 
> -- 
>       Viktor.
> 

Reply via email to