[Bug target/104688] New: gcc and libatomic can use SSE for 128-bit atomic loads on Intel CPUs with AVX

xry111 at mengyan1223 dot wang via Gcc-bugs Fri, 25 Feb 2022 06:22:30 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688


            Bug ID: 104688
           Summary: gcc and libatomic can use SSE for 128-bit atomic loads
                    on Intel CPUs with AVX
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: xry111 at mengyan1223 dot wang
  Target Milestone: ---

In Dec 2021, Intel updated the SDM and added the following content:

> Processors that enumerate support for Intel® AVX (by setting the feature flag 
> CPUID.01H:ECX.AVX[bit 28]) guarantee that the 16-byte memory operations 
> performed by the following instructions will always be carried out atomically:
> - MOVAPD, MOVAPS, and MOVDQA.
> - VMOVAPD, VMOVAPS, and VMOVDQA when encoded with VEX.128.
> - VMOVAPD, VMOVAPS, VMOVDQA32, and VMOVDQA64 when encoded with EVEX.128 and 
> k0 (masking disabled).
> 
> (Note that these instructions require the linear addresses of their memory 
> operands to be 16-byte aligned.)

(see Change 13, https://cdrdv2.intel.com/v1/dl/getContent/671294)
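
As a rough illustration of the kind of load the quoted text covers, here is a minimal sketch in C. The names `u128_pair` and `load16_sse` are hypothetical and are not taken from gcc or libatomic. Inline asm is used only to pin the instruction to MOVDQA; it assumes GCC-style inline asm with the default AT&T syntax, and that the caller has already checked that the CPU gives the atomicity guarantee.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical container for the two 64-bit halves of a 16-byte value. */
typedef struct { uint64_t lo, hi; } u128_pair;

/* Hypothetical helper: one 16-byte load from src.
   src must be 16-byte aligned, otherwise MOVDQA faults.
   The load is atomic only on CPUs covered by the SDM guarantee. */
static u128_pair load16_sse(const void *src)
{
    u128_pair out;
#if defined(__SSE2__)
    __m128i v;
    /* The "memory" clobber keeps the compiler from moving other
       memory accesses across this load. */
    __asm__ volatile("movdqa %1, %0"
                     : "=x"(v)
                     : "m"(*(const __m128i *)src)
                     : "memory");
    memcpy(&out, &v, sizeof out);
#else
    /* Placeholder for targets without SSE2: this copy is NOT atomic. */
    memcpy(&out, src, sizeof out);
#endif
    return out;
}
```

Whether this is sufficient for a given memory order also depends on how the matching stores are implemented, which this sketch does not address.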

So on Intel CPUs that enumerate AVX, gcc and libatomic can implement 128-bit
atomic loads with a single aligned SSE load (e.g. MOVDQA), instead of a loop
with LOCK CMPXCHG16B.

AMD has no such guarantee (at least for now), so we still need LOCK CMPXCHG16B
on Intel CPUs without AVX and on all AMD CPUs (old or new).
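
A runtime check for this condition could look roughly like the sketch below. The function names are hypothetical and this is not the libatomic implementation. It assumes the `<cpuid.h>` header shipped with GCC and its `__get_cpuid` helper, and it tests the raw CPUID.01H:ECX bit 28 because that is the flag the SDM text names.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* CPUID leaf 0 returns the vendor string in EBX, EDX, ECX (in that order).
   "GenuineIntel" is "Genu" / "ineI" / "ntel". */
static int is_genuine_intel(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
    return ebx == 0x756e6547u && edx == 0x49656e69u && ecx == 0x6c65746eu;
}

/* CPUID.01H:ECX.AVX is bit 28. */
static int has_avx_bit(uint32_t leaf1_ecx)
{
    return (int)((leaf1_ecx >> 28) & 1u);
}

/* Hypothetical predicate: returns 1 if an aligned 16-byte SSE load may be
   treated as atomic under the SDM wording quoted above, 0 otherwise. */
static int sse_load16_is_atomic(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return 0;
    if (!is_genuine_intel(ebx, ecx, edx))
        return 0;               /* AMD and others: no documented guarantee */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return has_avx_bit(ecx);
#else
    return 0;                   /* not an x86 target */
#endif
}
```

A caller would use the SSE path when `sse_load16_is_atomic()` returns 1 and fall back to LOCK CMPXCHG16B otherwise.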
