> A read of a 'volatile uint64_t', btw, is supposed to make sure that it
> reads from the original memory locations, not cached copies of it in
> register or spread across multiple registers.
Which it doesn't do on any platform I know of. On every platform, 'volatile' reads go through the caches, and 'volatile' writes are writes to a cache, flushed to main memory when it's convenient. The rationale is simple: people use 'volatile' in very important cases, such as longjmp and signals. Anything that is very expensive and isn't needed in those cases simply will not be put into 'volatile'. If that means you don't get the multi-thread semantics you erroneously thought you were guaranteed, that's your problem.

> Even if it can't be
> accessed atomically, the guarantee is that it will fetch from memory,
> so that it can be changed outside the current program flow (signal
> handlers, shm, mmap, and the like).

The problem is that on modern machines "fetch from memory" is ill-defined, because memory is not all in one place. What is a "fetch from memory" on a ccNUMA system?

The 'volatile' keyword comes from the C standard, and the C standard doesn't say anything about sharing memory between processes or threads. There are standards for doing that, and 'volatile' only means anything in those contexts if the standards that let you share memory or create threads say so. A multi-threaded standard could say that 'volatile' has some particular set of semantics. However, as far as I know, *none* *does*.

At one time, people thought ordering was part of those semantics. Now there are significant platforms where it isn't. At one time, people thought some types of atomicity were part of those semantics. Now there are significant platforms where they aren't.

The 'volatile' keyword is only required to work for signals and longjmp. It may or may not work in other cases. If you try to state precisely what semantics you think 'volatile' assures you, I think you'll find that it's impossible. And even if you did, next year it could go on the list of semantics people only thought they had.
In this case, the semantic you need is basically that a concurrent read/write will not get a value other than the one in the variable before the write or the one after. We know for a fact that 'volatile' doesn't provide this guarantee, since it doesn't provide it for 'uint64_t' on platforms that don't have atomic 64-bit writes. And it's not because it's impossible to provide: atomic 64-bit writes can be faked with spinlocks around all 64-bit accesses. You know why those platforms don't provide those spinlocks? Because 'volatile' does not provide the read/write guarantee you need, so there is no reason to put them there.

> Yes, it needs to be locked (all
> possible concurrent access needs to be locked, ideally, even though
> that's a performance hog).

Since a lock is both necessary and sufficient, what help is 'volatile'?

> I'd not doubt that there are platforms where uint32_t can't be
> accessed atomically. Or even uint16_t. The question becomes, "at
> what point does it become not-cost-effective to support platforms that
> cannot guarantee things that can be guaranteed by the platforms used
> by a majority of the user base?"

No, that's not the question. The question is, "are we going to follow the standards and use the guarantees we have, or senselessly write code that relies on behavior that is explicitly undefined by the standards and risk having our code break on new CPUs?"

> So, here's a novel idea: why don't you write a patch to clear the
> compression structs that adheres to your view of appropriate design,
> and submit it?

I suggested either removing the locking entirely or simply removing the unlocked check for NULL. I think cutting and pasting a patch would be more work than simply making the change.

DS

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           [EMAIL PROTECTED]