Darryl Mile wrote:

> A compiler will not generate a store instruction to put back a
> "cached_copy" into the variable location. Principally because there was
> no assignment operation in the original code and because even a
> non-optimizing compiler knows it can just dump the "cached_copy"
> temporary register on the floor.

That may be true, but it doesn't help. The compiler generating a store
instruction is only one way a cached copy can be written back to memory.
Anything a compiler can do could also be done by the CPU or memory
hardware. Only operations with specific guaranteed semantics solve that
problem.

> The conversation stemmed from the code:
>
>     if (ssl_comp_methods == NULL)
>         return;
>     CRYPTO_w_lock(CRYPTO_LOCK_SSL);
>     if (ssl_comp_methods != NULL)
>     {
>         ...SNIP...
>     }
>     CRYPTO_w_unlock(CRYPTO_LOCK_SSL);
>
> If ssl_comp_methods is NULL, then no store instruction to the memory
> location of ssl_comp_methods will ever happen. So nothing is subject to
> the so called "lost write" concurrency problem.

You are correct that no store *instruction* will happen, but that doesn't
mean the CPU won't generate a store anyway. While it's true that current
caches keep track of whether data is modified and don't write back data
unless it is, that doesn't mean future CPUs won't find it more efficient
to drop the flag and write back in all cases. What if the flag gets to be
expensive but write operations are cheap?

There is precedent. On the x86, for example, at least some
compare-exchange operations will write back the original data if the
compare fails. In other words, if you ask for:

    if(a==7) a=3;

You might get (in *microcode*, even though the instructions say what you
want):

    a = (a==7) ? 3 : a;

This can be an optimization because the code as specified requires a
micro-branch internal to the compare-exchange instruction, which can be
expensive if mispredicted. The unconditional store avoids the branch and
can be implemented mathematically.

If the compiler was designed before anyone knew a CPU might do this, it
could easily generate a compare-exchange instruction for a volatile
variable that had an access of this form. The compiler author "knows" the
CPU follows the requested code precisely.
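
For concreteness, here is a rough sketch of the "if(a==7) a=3;" case written with C11 atomics. The helper name set_if_seven is made up for illustration and is not from the OpenSSL source. The point is only that the language tells you what value you can observe afterwards; it does not tell you whether the hardware performed a write when the compare failed.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper (not from OpenSSL): "if (*a == 7) *a = 3;"
 * performed as a single atomic compare-exchange.
 * Returns true if the compare succeeded and 3 was stored. */
static bool set_if_seven(atomic_int *a)
{
    int expected = 7;

    /* On failure, C11 guarantees that *a keeps its value and that
     * `expected` is updated to the value actually seen. It makes no
     * promise about whether the hardware issued a write of the old
     * value back to memory while doing so. */
    return atomic_compare_exchange_strong(a, &expected, 3);
}
```

Calling it on a variable holding 7 returns true and leaves 3 behind; calling it on a variable holding anything else returns false and leaves the value as it was, at least as far as the program can tell.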

The next CPU, however, makes the store unconditional as an optimization
because a store is cheaper than a branch. So sorry, your new CPU breaks
your old executables.

The code we are talking about is of this precise form. It compares the
value of a pointer and changes the value of that pointer only if the
pointer was not NULL to begin with. Now the code we are talking about
contains locks inside the conditional, so it will not fail this exact
way. (I believe the x86 will only do this for bus-locked instructions
anyway, but there is no reason why the next x86 couldn't do the same
thing for non-bus-locked operations too.)

Your argument is of the form "I can't think of any way it could fail".
This says more about your imagination than the validity of the code. ;)

I agree none of the ways I can think of that it can fail seem
particularly likely. The problem is the ways it can fail that neither of
us can think of -- until it fails. Experience has shown me (and I cited a
few examples in this thread) that there will be ways it can fail that you
can't think of. That is why standards provide guarantees.

> So marking it 'volatile' will not gain the above code anything. But it
> will inhibit valid optimizations the compiler might make to all code
> that uses the variable 'ssl_comp_methods' that are inside the locked
> regions.

I agree with that.

DS

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           [EMAIL PROTECTED]