Darryl Mile wrote:
> A compiler will not generate a store instruction to put back a
> "cached_copy" into the variable location. Principally because there was
> no assignment operation in the original code and because even a
> non-optimizing compiler knows it can just dump the "cached_copy"
> temporary register on the floor.
That may be true, but it doesn't help. The compiler generating a store
instruction is only one way a cached copy can be written back to memory.
Anything a compiler can do could also be done by the CPU or memory hardware.
Only operations with specific guaranteed semantics solve that problem.
> The conversation stemmed from the code:
>
> if (ssl_comp_methods == NULL)
>     return;
> CRYPTO_w_lock(CRYPTO_LOCK_SSL);
> if (ssl_comp_methods != NULL)
> {
> ...SNIP...
> }
> CRYPTO_w_unlock(CRYPTO_LOCK_SSL);
>
>
> If ssl_comp_methods is NULL, then no store instruction to the memory
> location of ssl_comp_methods will ever happen. So nothing is subject to
> the so called "lost write" concurrency problem.
You are correct that no store *instruction* will happen, but that doesn't mean
the CPU won't generate a store anyway. It's true that current caches track
whether data has been modified and write back only data that has. But a future
CPU might find it more efficient to drop that flag and write back in all cases.
What if the flag gets to be expensive but write operations are cheap?
There is precedent. On the x86, for example, at least some compare-exchange
operations will write back the original data if the compare fails.
In other words, if you ask for:
if(a==7) a=3;
You might get (in *microcode* even though the instructions say what you want):
a = (a==7) ? 3 : a;
This can be an optimization because the code as specified requires a
micro-branch internal to the compare-exchange instruction, which can be
expensive if mispredicted. The unconditional store avoids the branch and can be
implemented mathematically.
If the compiler was designed before anyone knew a CPU might do this, it could
easily generate a compare-exchange instruction for a volatile variable that had
an access of this form. The compiler author "knows" the CPU follows the
requested code precisely. The next CPU, however, makes the store unconditional
as an optimization because a store is cheaper than a mispredicted branch. So
sorry, your new CPU breaks your old executables.
The code we are talking about is of this precise form. It compares the value of
a pointer and changes the value of that pointer only if the pointer was not
NULL to begin with. Now the code we are talking about contains locks inside the
conditional, so it will not fail this exact way. (I believe the x86 will only
do this for bus-locked instructions anyway, but there is no reason why the next
x86 couldn't do the same thing for non-bus-locked instructions too.)
Your argument is of the form "I can't think of any way it could fail". This says
more about your imagination than about the validity of the code. ;)
I agree that none of the ways I can think of for it to fail seems particularly
likely. The problem is the ways it can fail that neither of us can think of --
until it fails. Experience has shown me (and I cited a few examples in this
thread) that there will be ways it can fail that you can't think of.
That is why standards provide guarantees.
> So marking it 'volatile' will not gain the above code anything. But it
> will inhibit valid optimizations the compiler might make to all code
> that uses the variable 'ssl_comp_methods' that are inside the locked
> regions.
I agree with that.
DS
______________________________________________________________________
OpenSSL Project http://www.openssl.org
Development Mailing List