Please keep in mind that this is just an EA of the _first_ Valhalla JEP.  The 
goal is to get feedback on the fundamentals; there are many JEPs coming after 
401, and many optimizations that we understand but have not yet put into 
practice.

There are many ways we might _eventually_ get to 128 bit atomicity, but I 
suspect that day is … not close.  All the possible mechanisms have tradeoffs 
that make it not the slam-dunk it might appear.  STM, TSX, LLSC, CAS, and 
friends all have access cost penalties.  Vector registers have costs to move 
data between the vector and regular registers.  128 bit atomics almost surely 
mean 128 bit alignment, which becomes a tradeoff between flatness and 
density.  (And not just 128 bit alignment for flattened fields; possibly for 
_all_ objects.)
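
To make the density half of that tradeoff concrete, here is a rough sketch in 
Java of the arithmetic only.  The object sizes and the `alignUp` and `padding` 
helpers are illustrative assumptions, not HotSpot's actual layout logic; the 
sketch just rounds a few hypothetical sizes up to 8-byte and 16-byte boundaries.

```java
// Illustrative arithmetic only: hypothetical object sizes, not real HotSpot layouts.
final class AlignmentCost {
    // Round size up to the next multiple of alignment (alignment must be a power of two).
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) & -alignment;
    }

    // Bytes of padding added when an object of the given size is aligned.
    static long padding(long size, long alignment) {
        return alignUp(size, alignment) - size;
    }

    public static void main(String[] args) {
        long[] sizes = {16, 24, 40, 56}; // assumed object sizes in bytes
        for (long size : sizes) {
            System.out.printf("size=%d  8-byte aligned=%d  16-byte aligned=%d  extra=%d%n",
                    size, alignUp(size, 8), alignUp(size, 16),
                    alignUp(size, 16) - alignUp(size, 8));
        }
    }
}
```

Under these assumed sizes, a 24-byte object grows to 32 bytes once every object 
has to be 16-byte aligned, while a 16-byte object is unaffected.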

We are also working on mechanisms to allow relaxed atomicity based on user 
directives, but this dramatically raises the complexity of the programming 
model, because it takes you from “it's just an immutable object” (which anyone 
can reason about) to having to think about concurrency and data races.
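
As a minimal sketch of what that extra reasoning involves, the following Java 
fragment models a 128-bit value stored flat as two 64-bit halves, written one 
after the other with no atomicity across the pair.  The `FlatSlot` class and 
its methods are hypothetical names for illustration; this is plain Java, not 
Valhalla syntax, and the "reader" is interleaved by hand so the outcome is 
deterministic rather than a real race.

```java
// Hypothetical model of a 128-bit value stored flat as two independent 64-bit words.
final class FlatSlot {
    private long lo;
    private long hi;

    FlatSlot(long lo, long hi) { this.lo = lo; this.hi = hi; }

    // A non-atomic 128-bit store is really two separate 64-bit stores.
    void storeLo(long newLo) { lo = newLo; }
    void storeHi(long newHi) { hi = newHi; }

    // Returns whatever is in the two halves at the moment of the call.
    long[] load() { return new long[] {lo, hi}; }

    // Hand-written interleaving: store lo, reader loads, store hi.
    static long[] tornRead() {
        FlatSlot slot = new FlatSlot(1L, 1L);   // old value (1, 1)
        slot.storeLo(2L);                       // writer is halfway to (2, 2)
        long[] seen = slot.load();              // reader observes here
        slot.storeHi(2L);                       // writer finishes
        return seen;
    }

    public static void main(String[] args) {
        long[] seen = tornRead();
        System.out.println("reader saw (" + seen[0] + ", " + seen[1] + ")");
    }
}
```

The reader sees (2, 1), a value that no thread ever wrote, which is exactly the 
kind of outcome the “immutable object” mental model rules out.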

So, yes, there are instructions that on paper have the desired characteristics 
— but this is merely a necessary but not sufficient condition for routinely 
flattening to 128.



On Nov 2, 2025, at 10:52 PM, Danny Thomas 
<[email protected]> wrote:

Hi folks,

I caught Frederic's excellent JVMLS talk over the weekend and was most 
interested in the updates on tearing, and appreciate the pragmatism of not 
allowing tearing of flattened values.

At 13:15[1] it's mentioned that there's no flattening for value types larger 
than 64 bits, as most platforms don't have support for larger atomic 
operations. My understanding is that both Intel and AMD have guaranteed the 
atomicity of 128 bit SSE loads and stores on CPUs with AVX[2] and aarch64 has 
had support for 128 bit atomic instructions for some time (and without any 
performance limitations w/ LSE).

JEP 401 only has this to say:

In the future, 128-bit flattened references may be possible on platforms that 
support atomic reads and writes of that size, or in special cases like final 
fields.

Is there a misunderstanding of the state of platform support for 128-bit 
atomics or have I missed details omitted for brevity?

Cheers,
Danny

1. https://youtu.be/NF4CpL_EWFI?t=795
2. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688
