> On 24 May 2016, at 21:29, Aleksey Shipilev <[email protected]> wrote:
> 
> On 05/24/2016 05:43 AM, John Rose wrote:
>> On May 23, 2016, at 4:20 PM, Martin Buchholz <[email protected]> wrote:
>>> 
>>> As I said in a previous message, you can implement subword CAS using
>>> fullword CAS in a loop.
>>> 
>>> cas8bit(expect, update) {
>>>   for (;;) {
>>>     fullword = atomicRead32();
>>>     if ((fullword & 0xff) != expect) return false;
>>>     if (cas32(fullword, (fullword & ~0xff) | update)) return true;
>>>   }
>>> }
> 
> Yes, stupid me! I was under the impression that loops are a no-no for
> emulating strong CAS. But we already do loops with LL/SC…

Indeed, doh!

Martin, many thanks for persisting with this.
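
For anyone following along, here is a minimal Java sketch of the loop quoted
above. It assumes the 32-bit word lives in an AtomicInteger and that the
target byte is the low 8 bits; the class name SubwordCas and its get() helper
are made up for illustration and are not part of any JDK API or of the patch
under discussion.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration: strong 8-bit CAS emulated with a full-word CAS loop.
final class SubwordCas {
    private final AtomicInteger word;

    SubwordCas(int initial) {
        this.word = new AtomicInteger(initial);
    }

    // CAS on the low byte only; the upper 24 bits are carried over unchanged.
    boolean cas8bit(int expect, int update) {
        for (;;) {
            int fullword = word.get();
            // The low byte does not match: this is a genuine CAS failure.
            if ((fullword & 0xff) != (expect & 0xff)) {
                return false;
            }
            int next = (fullword & ~0xff) | (update & 0xff);
            if (word.compareAndSet(fullword, next)) {
                return true;
            }
            // The full-word CAS failed. Either the low byte changed (the next
            // pass returns false) or only neighbouring bytes changed (retry).
        }
    }

    int get() {
        return word.get();
    }
}
```

The retry on a failed full-word CAS is what makes the emulated operation
strong: it only reports failure when the byte itself differs from the expected
value. It is also why unrelated activity on the neighbouring bytes costs extra
iterations.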


> 
>> Yes, that's the "artisanal" version I would reach for.
>> It doesn't scale well if there is unrelated activity on nearby bytes.
> 
> Okay, we are exploring it here:
> https://bugs.openjdk.java.net/browse/JDK-8157726
> 
> I was able to intrinsify subword accesses on x86_64, and their
> performance is on par with int versions. Plain Martin-style Java loops
> are around 2x slower than direct intrinsics in a few basic tests (I
> expect them to be even slower on contended cases and/or non-x86
> platforms). But first, we need to hook them up to VarHandles (in
> progress now).
> 

Nice work! This is looking very promising on x86.

Paul.
