I think I've seen stuff like that too when writing the replication code, and it only seems to happen on OSX.
@Ryan, did you try running the same test a couple of times on a linux box? J-D On Fri, Apr 16, 2010 at 2:29 AM, Ryan Rawson <ryano...@gmail.com> wrote: > At least it's in test code right? > > *sigh* > > On Thu, Apr 15, 2010 at 5:27 PM, Todd Lipcon <t...@cloudera.com> wrote: >> On Thu, Apr 15, 2010 at 5:26 PM, Ryan Rawson <ryano...@gmail.com> wrote: >> >>> Looking at the code to AtomicBoolean it uses an atomic int to >>> accomplish it's task on OSX. >>> >>> So just what the heck is going on here? >>> >>> >> I think you're fooling yourself, and the bug isn't gone, just hiding :) >> >> -Todd >> >> >>> On Thu, Apr 15, 2010 at 5:24 PM, Ryan Rawson <ryano...@gmail.com> wrote: >>> > I doubt it's case #2, there is a lot of complex code that runs between >>> > putThread.start() and putThread.done(). >>> > >>> > In terms of JVMs, I'm using Java 6 on OSX x64. HBase effectively >>> > requires Java 6 (and if we dont explicitly require it, we should) and >>> > it also specifically cannot use certain broken JVM pushes (eg: >>> > jdk6u18) so much that we are adding in code to prevent ourselves from >>> > running on it and warning the user. >>> > >>> > But just for a moment, I think it's inappropriate for the JVM to be >>> > specifying caching or non-caching of variables in the systems cache. >>> > That is way too much abstraction leakage up to the language level. >>> > Most SMP systems have cache coherency control that allow you to read >>> > from cache yet get invalidations when other processors (on other dies) >>> > write to that memory entry. >>> > >>> > But nevertheless, the problem no longer exists with AtomicBoolean :-) >>> > >>> > >>> > >>> > On Thu, Apr 15, 2010 at 5:05 PM, Paul Cowan <co...@aconex.com> wrote: >>> >> On -9/01/37 05:59, Ryan Rawson wrote: >>> >>> >>> >>> So the previous use of volatile for a boolean seems like a textbook >>> >>> case, but the situation i discovered was pretty clear cut. I have no >>> >>> other explanation than a highly delayed volatile read (which are >>> >>> allowed). >>> >> >>> >> I don't see that they are allowed, actually. >>> >> >>> >> Section 17.4.5 of the JLS says that: >>> >> >>> >>> * An unlock on a monitor happens-before every subsequent lock on that >>> >>> monitor. >>> >>> * A write to a volatile field (§8.3.1.4) happens-before every >>> subsequent >>> >>> read of that field. >>> >> >>> >> IOW, the situations (unlock-then-lock) and (volatile-write then >>> >> volatile-read) have the same visibility guarantees. >>> >> >>> >> Section 8.3.1.4 says: >>> >> >>> >>> A field may be declared volatile, in which case the Java memory model >>> >>> (§17) ensures that all threads see a consistent value for the >>> variable. >>> >> >>> >> In your case, the thread calling done() is not seeing the same value as >>> the >>> >> thread calling run(), which is not consistent. >>> >> >>> >> And for good measure Java Concurrency in Practice makes it much more >>> >> explicit (emphasis mine): >>> >> >>> >>> Volatile variables are not cached in registers or in caches where they >>> are >>> >>> hidden from other processors, so *a read of a volatile variable always >>> >>> returns the most recent write by any thread*. >>> >> >>> >> And finally, on changing to an AtomicBoolean fixing the problem, JCIP >>> says: >>> >> >>> >>> Atomic variables offer the same memory semantics as volatile variables >>> >> >>> >> So this doesn't really make sense either. >>> >> >>> >> All that's a long way of saying that the only ways I can see your >>> situation >>> >> happening are: >>> >> >>> >> * pre-Java-1.5 (and hence pre-JSR-133) JVM >>> >> * JVM with a bug >>> >> * ordering is not as you expect, i.e. the actual chronological order is >>> not: >>> >> >>> >> THREAD 1 THREAD 2 >>> >> spawn new thread >>> >> run() >>> >> done() >>> >> join() >>> >> >>> >> but rather: >>> >> >>> >> THREAD 1 THREAD 2 >>> >> spawn new thread >>> >> done() >>> >> run() >>> >> join() >>> >> >>> >> in which case the set of run to false at the start of run() overwrites >>> the >>> >> set of it to true at the start of done(), and you're in for infinite >>> loop >>> >> fun. >>> >> >>> >> Cheers, >>> >> >>> >> Paul >>> >> >>> > >>> >> >> >> >> -- >> Todd Lipcon >> Software Engineer, Cloudera >> >