On 2/28/07, Gregory Shimansky <[EMAIL PROTECTED]> wrote:
Weldon Washburn wrote:
> On second thought, the only way I know to implement volatile long (64-bit)
> Java variables on ia32 is:
>
> grab critical section
> mov [ecx], low32bits; // to do a write, the code for doing a read is similar
> mov [ecx+4], hi32bits;
> release critical section

Is it possible for 64-bit atomic load stores to use double load/stores
Hmm... can you tell us which specific instructions you are suggesting? I see quad loads/stores but can't find a double load/store version. I also tried to find the guarantees on bus transactions. Somewhere I recall it is documented that 4-byte aligned loads/stores are guaranteed to be atomic. Maybe there are some new guarantees on 64-bit writes. In any case, we would still have to be compatible with existing Pentium III hardware, so we would probably have to go with some sort of critical section approach.
or SSE4 on the processors that have it?
Good point. I recall the old SSE versions were really only focused on multimedia, and writing multimedia bits to memory is not sensitive to order or atomicity. In other words, if you are writing to a frame buffer, the speed of the writes is important but the order in which the bits hit the buffer is not. Again, I looked but could not find the latest info on SSE4 and atomicity.
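For reference, the critical-section approach quoted at the top can be sketched in Java roughly as follows. This is only an illustration of the idea, not DRLVM code: the class name `LockedLong` and the fields `low32` and `high32` are hypothetical stand-ins for the two 32-bit words at `[ecx]` and `[ecx+4]`, and a `ReentrantLock` stands in for whatever critical section the VM would actually use.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of a 64-bit value stored as two 32-bit words.
final class LockedLong {
    private int low32;   // stands in for the word at [ecx]
    private int high32;  // stands in for the word at [ecx+4]
    private final ReentrantLock lock = new ReentrantLock();

    // Both halves are written inside the critical section, so no reader
    // can observe the new low word together with the old high word.
    void write(long value) {
        lock.lock();
        try {
            low32 = (int) value;
            high32 = (int) (value >>> 32);
        } finally {
            lock.unlock();
        }
    }

    // Both halves are read inside the same critical section.
    long read() {
        lock.lock();
        try {
            return ((long) high32 << 32) | (low32 & 0xFFFFFFFFL);
        } finally {
            lock.unlock();
        }
    }
}
```

Every read and every write pays for acquiring the lock, which is the cost that observation 2 in the quoted message below is concerned with.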
> Some observations:
> 1)
> Fixing the "volatile long" bug (HARMONY-2092) by using a critical section as
> above should, as a side effect, allow DekkerTest.java to run.
> 2)
> Using volatile long sort of, kind of defeats a major reason to use Dekker's
> algorithm in the first place. Why bother if the performance is the same as
> using critical sections?
> 3)
> Using "volatile int" in DekkerTest.java probably still fails because reads
> can pass writes. One way to fix this might be to make the JIT emit a r/w
> memory fence whenever reading/writing the volatile int. While memory fences
> are often cheaper than HW locks, they are not free.
> 4)
> My guess is that there are no old legacy Java apps that use Dekker's
> algorithm. In other words, nobody is dependent on Dekker's algorithm
> working. My guess is that they are, however, dependent on volatile long and
> volatile int working properly (which has the side effect of making Dekker's
> algo work).
>
>
> On 2/21/07, Weldon Washburn <[EMAIL PROTECTED]> wrote:
>>
>> On 2/21/07, Gregory Shimansky <[EMAIL PROTECTED]> wrote:
>> >
>> > On Wednesday 21 February 2007 21:47 Rana Dasgupta wrote:
>> > > Weldon,
>> > > But I am not sure why the behavior would be different from J9 on the
>> > > same hardware. Do we jit volatiles differently?
>>
>> The differences in behavior can be caused by lots of things that are not
>> related to the memory model. For example, the JIT might actually emit
>> slightly different code. Slightly different code can easily open/close race
>> conditions. The important concept is that both J9 and drlvm fail. And the
>> failure appears to be because modern hardware is most likely not designed
>> to run Dekker's algo without memory fences.
>>
>> > There is a bug on DRLVM about volatile variables, HARMONY-2092. It is
>> > about long and double type variable assignments. Is it the same as in
>> > Dekker's algorithm?
>>
>> DekkerTest.java uses "long" variables. Yes, this could change the rate
>> of failure but not eliminate failures completely.
>>
>> > On 2/20/07, Weldon Washburn <[EMAIL PROTECTED]> wrote:
>> > > > It seems Dekker's algorithm is not expected to work on SPARC or IA32
>> > > > SMP boxes unless memory fences are used. DekkerTest.java in
>> > > > HARMONY-2986 does not contain memory fences. The volatile keyword
>> > > > guarantees the compiler will write a given variable to memory.
>> > > > However, the HW may actually have a write buffer and allow reads to
>> > > > pass writes. As far as I know, the Java language does not provide a
>> > > > means to invoke a memory fence. Thus there is no way to fix up
>> > > > DekkerTest.java. I may be misunderstanding something here. Does
>> > > > anyone have comment?
>> > > >
>> > > > An excellent description of the issues involved is in a David Dice
>> > > > presentation at:
>> > > >
>> > > > http://blogs.sun.com/dave/resource/synchronization-public2.pdf
>> > > >
>> > > > --
>> > > > Weldon Washburn
>> > > > Intel Enterprise Solutions Software Division
>> >
>> > --
>> > Gregory
>> >

-- Gregory
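
To make observation 3 concrete, here is a minimal sketch of Dekker's algorithm written with "volatile int" flags. It is not the code from DekkerTest.java (which uses "long" variables); the class name `DekkerSketch`, its fields, and the `runTwoThreads` helper are hypothetical. The sketch assumes a JVM that orders volatile reads and writes as the Java 5 (JSR-133) memory model requires, which on ia32 amounts to the JIT emitting a fence or locked instruction around volatile stores. On a VM that lets a volatile read pass an earlier volatile write, both threads could enter the critical section and updates to `counter` could be lost.

```java
// Hypothetical two-thread Dekker lock built only from volatile ints.
final class DekkerSketch {
    private volatile int want0;  // 1 while thread 0 wants the critical section
    private volatile int want1;  // 1 while thread 1 wants the critical section
    private volatile int turn;   // which thread has priority on contention

    private int counter;         // plain field, guarded by the Dekker lock

    private int want(int id) {
        return id == 0 ? want0 : want1;
    }

    private void setWant(int id, int value) {
        if (id == 0) { want0 = value; } else { want1 = value; }
    }

    void lock(int id) {
        int other = 1 - id;
        setWant(id, 1);                 // the write ...
        while (want(other) == 1) {      // ... that this read must not pass
            if (turn != id) {
                setWant(id, 0);         // back off while the other has priority
                while (turn != id) {
                    Thread.yield();
                }
                setWant(id, 1);
            }
        }
    }

    void unlock(int id) {
        turn = 1 - id;
        setWant(id, 0);
    }

    // Runs two threads that each increment the guarded counter.
    static int runTwoThreads(int iterations) {
        DekkerSketch d = new DekkerSketch();
        Thread[] workers = new Thread[2];
        for (int i = 0; i < 2; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                for (int k = 0; k < iterations; k++) {
                    d.lock(id);
                    d.counter++;
                    d.unlock(id);
                }
            });
        }
        for (Thread t : workers) {
            t.start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return d.counter;
    }
}
```

Calling `DekkerSketch.runTwoThreads(n)` and comparing the result with `2 * n` gives a rough check: a smaller value means two threads were in the critical section at once. A passing run does not prove the VM is correct, since the race may simply not have been hit.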
--
Weldon Washburn
Intel Enterprise Solutions Software Division
