The problem with only locking during the assignment is that potentially
mutliple threads will be doing [extensive] work that will be thrown away
when they discover that another thread beat them to it.  The result could be
that the lock is held for less time, but all threads do more work and
therefore the system is slower anyway.

I'm not sure that I understand your alternative to per-object locks.  In the
case of the new locks in rsa_eay.c, it is necessary to make all threads that
are trying to initialize and then use the montgomery values wait until the
values have been initialized.  Therefore competing threads still need to
wait on something, so if it's not a class-lock, it would have to be a
per-object lock.  Or maybe I'm missing something here.

Until per-object locking (or similar) is implemented, I think it's best to
allocate a new lock for this particular operation, say
CRYPTO_LOCK_RSA_LAZY_INIT.  Then at least it wont delay the reference
counting that CRYPTO_LOCK_RSA is used for.

Better to be safe and potentially slower than fast and out of control.

Steven

> -----Original Message-----
> From: Geoff Thorpe [SMTP:[EMAIL PROTECTED]]
> Sent: Tuesday, December 19, 2000 7:56 AM
> To:   [EMAIL PROTECTED]
> Subject:      Re: cvs commit: openssl/crypto/rsa rsa_eay.c
> 
> Hi Bodo (and anyone else interested),
> 
> Just a thought I was having about locking and things. Rather than us
> worrying so much about how to do per-object locking (as opposed to our
> current per-class locking), I wonder if it's worth considering how to
> minimize the number and complexity of operations that happen under this
> global locking. Anyway, "per-object" locking can be manufactured out of
> per-class locking by simply using the class locks to sychronise access to
> integers in the objects that themselves measure on/off mutexing, semaphore
> counters, or whatever. I won't say it's a great idea, but it's possible.
> My idea though is that the same principle could at least gain us some
> better mileage with the per-class locking if we use those locks only to
> flick switches and the like, but otherwise try and put all the meaty
> structural operations *outside* such locking.
> 
> This seems to be one such case;
> 
> On Mon, 18 Dec 2000 [EMAIL PROTECTED] wrote:
> 
> >   Obtain lock CRYPTO_LOCK_RSA before creating BN_MONT_CTX
> >   structures and setting rsa->_method_mod_{n,p,q}.
> 
> Rather than creating the BN_MONT_CTX structures inside a lock, surely it's
> a better idea to compete via a lock for *who* will assign the per-computed
> data to the structures. Doing the actual computations, allocations, etc
> should not need to happen inside a lock. I think ... perhaps you or anyone
> else can put me straight if I'm missing something salient.
> 
> Eg.
> 
> >   Index: rsa_eay.c
> >   ===================================================================
> >   RCS file: /e/openssl/cvs/openssl/crypto/rsa/rsa_eay.c,v
> >   retrieving revision 1.14
> >   retrieving revision 1.14.2.1
> >   diff -u -r1.14 -r1.14.2.1
> >   --- rsa_eay.c     2000/06/01 22:18:44     1.14
> >   +++ rsa_eay.c     2000/12/18 16:36:07     1.14.2.1
> >   @@ -138,9 +138,24 @@
> >     
> >     if ((rsa->_method_mod_n == NULL) && (rsa->flags &
> RSA_FLAG_CACHE_PUBLIC))
> >             {
> >   -         if ((rsa->_method_mod_n=BN_MONT_CTX_new()) != NULL)
> >   -                 if (!BN_MONT_CTX_set(rsa->_method_mod_n,rsa->n,ctx))
> >   -                     goto err;
> >   +         CRYPTO_w_lock(CRYPTO_LOCK_RSA);
> >   +         if (rsa->_method_mod_n == NULL)
> >   +                 {
> >   +                 BN_MONT_CTX* bn_mont_ctx;
> >   +                 if ((bn_mont_ctx=BN_MONT_CTX_new()) == NULL)
> >   +                         {
> >   +                         CRYPTO_w_unlock(CRYPTO_LOCK_RSA);
> >   +                         goto err;
> >   +                         }
> >   +                 if (!BN_MONT_CTX_set(bn_mont_ctx,rsa->n,ctx))
> >   +                         {
> >   +                         BN_MONT_CTX_free(bn_mont_ctx);
> >   +                         CRYPTO_w_unlock(CRYPTO_LOCK_RSA);
> >   +                         goto err;
> >   +                         }
> >   +                 rsa->_method_mod_n = bn_mont_ctx;
> >   +                 }
> >   +         CRYPTO_w_unlock(CRYPTO_LOCK_RSA);
> >             }
> 
> [snip]
> 
> What if the code was structured as follows;
> 
>     if((rsa->_method_mod_n == NULL) && [etc])
>         {
>         BN_MONT_CTX *bn_mont_ctx;
>         int bailout;
>         if((bn_mont_ctx=BN_MONT_CTX_new()) == NULL)
>             [do error stuff]
>         if(!BN_MONT_CTX_set(....))
>             [do error stuff]
>         bailout = 0;
>         /* Only now do we grab the lock to ensure threads don't race to
>          * assign montgomery stuff to the RSA structure. */
>         CRYPTO_w_lock(CRYPTO_LOCK_RSA);
>         /* Now we check the _method_mod_n and stuff *inside* the lock */
>         if((rsa->_method_mod_n == NULL) & [etc])
>              rsa->_method_mod_n = bn_mont_ctx;
>         else
>              bailout = 1;
>         CRYPTO_w_unlock(CRYPTO_LOCK_RSA);
>         if(bailout)
>              /* Release the pre-calculated montgomery stuff, we had
>               * threads race on *this* RSA structure so we wasted some
>               * effort in this case. */
>              BN_MONT_CTX_free(bn_mont_ctx);
>         }
> 
> In this case, (as with your existing code), the initial test doesn't
> require a lock - it will test again inside a lock if it gets that far.
> However, the penalty in my variant is that if threads race then some may
> end up creating montgomery stuff only to find out inside the lock that the
> work has just been done by another thread. The worst that can happen is
> that more than one thread does that same work in the event of a race (the
> race is specific to *this* RSA object). However, that's a risk of wasted
> effort local to just this particular RSA structure - the global lock is
> only being used to seal off the final check & assign of the precalculated
> data to the structure. Conversely, the risk in the existing code seems to
> be that all such operations on *all* RSA objects block.
> 
> In multi-threading apps (where one assumes threading is being used to
> maintain high concurrency), it seems (by thinking of classical SSL
> examples) that there's far less likely to be a race to initialise the same
> RSA structure than there is to be initialising montgomery stuff for
> distinct RSA structures. Eg. ephemeral keys, temporary keys, per-session
> keys, whatever. IMHO it'd be better to take a start-up hit in the form of
> a risk of redundant calculation on an internal key (used across all
> threads)  than to jeopardise concurrency in the long-term becase the
> threads are doing calculations on their own RSA keys inside global locks.
> 
> Thoughts?
> 
> Cheers,
> Geoff
> 
> 
> ______________________________________________________________________
> OpenSSL Project                                 http://www.openssl.org
> Development Mailing List                       [EMAIL PROTECTED]
> Automated List Manager                           [EMAIL PROTECTED]
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [EMAIL PROTECTED]
Automated List Manager                           [EMAIL PROTECTED]

Reply via email to