Hi Hans,

On Thu, Mar 3, 2016 at 4:08 AM, Hans Boehm <hbo...@google.com> wrote:
>
> On Wed, Mar 2, 2016 at 12:09 AM, Thomas Stüfe <thomas.stu...@gmail.com>
> wrote:
> >
> > Hi Hans,
> >
> > thanks for the hint!
> >
> > But how would I do this for my problem:
> >
> > Allocate memory, zero it out and then store the pointer into a variable
> > seen by other threads, while preventing the other threads from seeing . I
> > do not understand how atomics would help: I can make the pointer itself an
> > atomic, but that only guarantees memory ordering in regard to this
> > variable, not to the allocated memory.
> >
> > Kind Regards, Thomas
>
> C11 atomics work essentially like Java volatiles: They order other memory
> accesses as well. If you declare the pointer to be atomic, and store into
> it, then another thread reading the newly assigned value will also see the
> stores preceding the pointer store. Since the pointer is the only value
> that can be accessed concurrently by multiple threads (with not all
> accesses reads), it's the only object that needs to be atomic. In this
> case, it's sufficient to store into the pointer with
>
>     atomic_store_explicit(&ptr, <new_value>, memory_order_release);
>
> and read it with
>
>     atomic_load_explicit(&ptr, memory_order_acquire);
>
> which are a bit cheaper.
>
> However, this is C11 specific, and I don't know whether that's acceptable
> to use in this context.
>
> If you can't assume C11, the least incorrect workaround is generally to
> make the pointer volatile, precede the store with a fence, and follow the
> load with a fence. On x86, both fences just need to prevent compiler
> reordering.
>

Thank you for that excellent explanation!

This may be just my ignorance, but I actually did not know that atomics are now a part of the C standard. I took this occasion to look up all other C11 features and this is quite neat :) Nice to see that C continues to live.

I am very hesitant though about introducing C11 features into the JDK.
We deal with notoriously oldish compilers, especially on AIX, and I do not want to be the first to force C11, especially not on such a side issue.

The more I look at this, the more I think that the costs for a pthread mutex lock are acceptable in this case: we are about to do a blocking IO operation anyway, which is already flanked by two mutex locking calls (in startOp and endOp). I doubt that a third mutex call will be the one making the costs suddenly unacceptable. Especially since they can be avoided altogether for low value mutex numbers (the optimization Roger suggested).

I will do some performance tests and check whether the added locking calls are even measurable.

Thomas
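
P.S. For reference, here is a minimal sketch of how I understand the C11 variant Hans describes: allocate, zero, then publish the pointer with a release store and read it with an acquire load. The names entry_t, shared_table, publish_table and get_table are made up for illustration and are not taken from the JDK sources. The sketch also assumes a single writer and a compiler that ships <stdatomic.h>, which is exactly what we cannot rely on everywhere.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical entry type; the real table layout is not part of this mail. */
typedef struct {
    int in_use;
} entry_t;

/* The pointer is the only object accessed concurrently, so it is the only
 * one that needs to be atomic. Static storage starts out as NULL. */
static _Atomic(entry_t *) shared_table;

/* Writer side (assumed to be a single thread): allocate a zeroed table and
 * publish it. Returns 0 on success, -1 if the allocation fails. */
int publish_table(size_t n)
{
    assert(n > 0);

    entry_t *t = calloc(n, sizeof *t);  /* calloc hands back zeroed memory */
    if (t == NULL) {
        return -1;
    }

    /* Release store: a thread whose acquire load reads this pointer value
     * also sees the zeroing that happened before the store. */
    atomic_store_explicit(&shared_table, t, memory_order_release);
    return 0;
}

/* Reader side: returns NULL until a table has been published. */
entry_t *get_table(void)
{
    return atomic_load_explicit(&shared_table, memory_order_acquire);
}
```

A reader would call get_table(), check for NULL, and only then touch the entries. With more than one potential writer this would additionally need a compare-and-exchange or a lock, which is one more reason the plain mutex looks attractive to me.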