I think I've seen stuff like that too when writing the replication
code, and it only seems to happen on OSX.

@Ryan, did you try running the same test a couple of times on a Linux box?
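
For reference, here is a minimal sketch of the third case Paul lists further down (done() running before run() resets the flag). It is not the actual test code: the Worker class, the resetInRun switch and the 200 ms bound are all made up for illustration. Calling done() before start() forces that ordering deterministically, and the bounded wait stands in for what would otherwise be an infinite loop.

```java
import java.util.concurrent.TimeUnit;

class DoneBeforeRun {
    /** Hypothetical stand-in for the test's put thread; not the real HBase class. */
    static class Worker extends Thread {
        private final boolean resetInRun;
        private volatile boolean done;     // stop flag, set by done()
        private volatile boolean sawDone;  // value the loop observed when it exited

        Worker(boolean resetInRun) { this.resetInRun = resetInRun; }

        void done() { done = true; }

        boolean sawDone() { return sawDone; }

        @Override
        public void run() {
            if (resetInRun) {
                done = false; // clobbers a done() call that ran before this point
            }
            // Bounded wait so the sketch terminates instead of spinning forever.
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(200);
            while (!done && System.nanoTime() < deadline) {
                Thread.yield();
            }
            sawDone = done;
        }
    }

    /** Calls done() before start(), which forces the "done() then run()" ordering. */
    static boolean observesDone(boolean resetInRun) {
        Worker w = new Worker(resetInRun);
        w.done();
        w.start();
        try {
            w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return w.sawDone();
    }

    public static void main(String[] args) {
        System.out.println("reset in run():  saw done = " + observesDone(true));
        System.out.println("no reset in run: saw done = " + observesDone(false));
    }
}
```

With the reset in run() the worker never sees the stop request even though the flag is volatile, so this would look exactly like a "delayed volatile read" without being one. If the flag is only ever written by done(), the same ordering is harmless.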

J-D

On Fri, Apr 16, 2010 at 2:29 AM, Ryan Rawson <ryano...@gmail.com> wrote:
> At least it's in test code, right?
>
> *sigh*
>
> On Thu, Apr 15, 2010 at 5:27 PM, Todd Lipcon <t...@cloudera.com> wrote:
>> On Thu, Apr 15, 2010 at 5:26 PM, Ryan Rawson <ryano...@gmail.com> wrote:
>>
>>> Looking at the code for AtomicBoolean, it uses an atomic int to
>>> accomplish its task on OSX.
>>>
>>> So just what the heck is going on here?
>>>
>>>
>> I think you're fooling yourself, and the bug isn't gone, just hiding :)
>>
>> -Todd
>>
>>
>>>  On Thu, Apr 15, 2010 at 5:24 PM, Ryan Rawson <ryano...@gmail.com> wrote:
>>> > I doubt it's case #2, there is a lot of complex code that runs between
>>> > putThread.start() and putThread.done().
>>> >
>>> > In terms of JVMs, I'm using Java 6 on OSX x64.  HBase effectively
>>> > requires Java 6 (and if we don't explicitly require it, we should) and
>>> > it also specifically cannot use certain broken JVM pushes (eg:
>>> > jdk6u18) so much that we are adding in code to prevent ourselves from
>>> > running on it and warning the user.
>>> >
>>> > But just for a moment, I think it's inappropriate for the JVM to be
>>> > specifying caching or non-caching of variables in the system's cache.
>>> > That is way too much abstraction leakage up to the language level.
>>> > Most SMP systems have cache coherency control that allows you to read
>>> > from cache yet get invalidations when other processors (on other dies)
>>> > write to that memory entry.
>>> >
>>> > But nevertheless, the problem no longer exists with AtomicBoolean :-)
>>> >
>>> >
>>> >
>>> > On Thu, Apr 15, 2010 at 5:05 PM, Paul Cowan <co...@aconex.com> wrote:
>>> >> On -9/01/37 05:59, Ryan Rawson wrote:
>>> >>>
>>> >>> So the previous use of volatile for a boolean seems like a textbook
>>> >>> case, but the situation I discovered was pretty clear-cut. I have no
>>> >>> other explanation than highly delayed volatile reads (which are
>>> >>> allowed).
>>> >>
>>> >> I don't see that they are allowed, actually.
>>> >>
>>> >> Section 17.4.5 of the JLS says that:
>>> >>
>>> >>> * An unlock on a monitor happens-before every subsequent lock on that
>>> >>> monitor.
>>> >>> * A write to a volatile field (§8.3.1.4) happens-before every subsequent
>>> >>> read of that field.
>>> >>
>>> >> IOW, the situations (unlock-then-lock) and (volatile-write then
>>> >> volatile-read) have the same visibility guarantees.
>>> >>
>>> >> Section 8.3.1.4 says:
>>> >>
>>> >>> A field may be declared volatile, in which case the Java memory model
>>> >>> (§17) ensures that all threads see a consistent value for the variable.
>>> >>
>>> >> In your case, the thread calling done() is not seeing the same value as the
>>> >> thread calling run(), which is not consistent.
>>> >>
>>> >> And for good measure Java Concurrency in Practice makes it much more
>>> >> explicit (emphasis mine):
>>> >>
>>> >>> Volatile variables are not cached in registers or in caches where they are
>>> >>> hidden from other processors, so *a read of a volatile variable always
>>> >>> returns the most recent write by any thread*.
>>> >>
>>> >> And finally, on changing to an AtomicBoolean fixing the problem, JCIP says:
>>> >>
>>> >>> Atomic variables offer the same memory semantics as volatile variables
>>> >>
>>> >> So this doesn't really make sense either.
>>> >>
>>> >> All that's a long way of saying that the only ways I can see your situation
>>> >> happening are:
>>> >>
>>> >> * pre-Java-1.5 (and hence pre-JSR-133) JVM
>>> >> * JVM with a bug
>>> >> * ordering is not as you expect, i.e. the actual chronological order is not:
>>> >>
>>> >>    THREAD 1                 THREAD 2
>>> >>    spawn new thread
>>> >>                             run()
>>> >>    done()
>>> >>    join()
>>> >>
>>> >> but rather:
>>> >>
>>> >>    THREAD 1                 THREAD 2
>>> >>    spawn new thread
>>> >>    done()
>>> >>                             run()
>>> >>    join()
>>> >>
>>> >> in which case the set of run to false at the start of run() overwrites the
>>> >> set of it to true at the start of done(), and you're in for infinite loop
>>> >> fun.
>>> >>
>>> >> Cheers,
>>> >>
>>> >> Paul
>>> >>
>>> >
>>>
>>
>>
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>
