OK, I've taken a look at this, and scratched my head a little. It's a 
touch complicated by the fact that thread-ids have changed in the head 
of development relative to what you're looking at in 0.9.8. But I'm now 
wondering if you haven't misunderstood the nature of openssl's threading 
support;

On Tuesday 31 March 2009 04:52:19 Marc Haisenko wrote:
> Hi folks,
> last week I described a problem with RSA blinding and its locking[1].
> Coincidentally another user ran into the same problem in a totally
> different scenario, the work-around of disabling the blinding fixed
> the issue for him[2].
>
> So here's a patch to fix this issue.
>
> The issue is that the rsa_get_blinding method checks whether there is
> concurrent access to the blinding and tells its caller whether locking
> needs to be done. The check is done by comparing thread/process IDs.
> If they differ locking needs to be done. The problem is that locking
> is not done if the IDs are same. It is assumed that the creator
> doesn't need to do locking which is correct for a single-threaded
> application but is wrong in a multi- threaded/processed environment
> once another process has accessed the blinding.

We need to separate multi-threading from multi-processing here. Once 
you've forked, there should be no locking relationship whatsoever 
between processes, irrespective of any threads (and thread-ids) within 
those processes. Do you have some reason to believe this isn't the case? 
The fork has essentially duplicated the process and created isolation 
between the two (copy-on-write or otherwise). So our only concern 
*should* be between threads within the same process.

So ... multi-threading. My understanding of the RSA blinding (which I 
didn't write) is that rsa->blinding is lazy-initialised by whichever 
thread gets to it first. The call to RSA_setup_blinding() creates the 
blinding struct and tags it with the current thread-id. Subsequent 
accesses to RSA blinding will use rsa->blinding if and only if the 
caller is the thread that lazy-initialised it (ipso facto, if processing 
of a given RSA struct is always in one thread - this is optimal). 
Accesses to RSA blinding from other threads should use rsa->mt_blinding 
(which is lazy-initialised by the first such thread) and this places 
obligations on the caller - ie. do your own locking, don't use a common 
blinding factor, etc.

This should work. I'm wondering if I understood what you meant by "The 
problem is that locking is not done if the IDs are same." ... If the IDs 
are the same, that means you're in the same thread, period!! If that's 
not the case, then the problem is that your thread-id callback isn't set 
up correctly. For threading support to work correctly, you need to 
provide hooks in order to make openssl compatible with your thread model 
(pthreads or otherwise). For 0.9.8, the APIs include;

void CRYPTO_set_locking_callback(void (*func)(int mode,int type,
                                              const char *file,int line));
void CRYPTO_set_id_callback(unsigned long (*func)(void));

Locking callbacks default to nops, and id callbacks default to different 
things depending on platform, but one possibility is "&errno" (ie. the 
address of the errno var, in the hopes that it's thread-local) and 
another is getpid() (ie. in the hopes that the threads have distinct 
pids too). It may be that the default behaviour is degenerating on your 
platform and not correctly differentiating between threads.

>
> As a fix I've introduced a new flag for struct rsa_st,
> RSA_FLAG_BLINDING_NEEDS_LOCKING. Once it has been noticed that locking
> has to be done the flag is set and from then on we always do locking,
> whether we are the creator or not.
>
> However, I'm scratching my head why this fixes the issue given the
> fact that there are actually two blindings: one when doing no locking
> (rsa->blinding) and another one when doing locking (rsa->mt_blinding).
> Yet our load test indicates that this really does fix the issue:
> without it we get about a dozen Bad Record MAC per day due to the
> blinding. Since I've implemented the attached patch none has occured
> (which is about 1.5 days of load test now).

As before, the thread detection ('id_callback') might be failing you such 
that all threads appear to be "the same thread"? Try calling 
CRYPTO_thread_id() from different threads in the same process and see 
whether they return the same value. This might explain why you "fixed" 
the problem by locking all callers even when they all have the same 
thread ID...

Cheers,
Geoff

-- 
Un terrien, c'est un singe avec des clefs de char...
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           majord...@openssl.org

Reply via email to