[Bug libstdc++/71660] [5/6/7/8 regression] alignment of std::atomic<8 byte primitive type> (long long, double) is wrong on x86

peter at cordes dot ca Sat, 20 May 2017 12:33:04 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71660


Peter Cordes <peter at cordes dot ca> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |peter at cordes dot ca

--- Comment #5 from Peter Cordes <peter at cordes dot ca> ---
(In reply to Thiago Macieira from comment #3)
> (In reply to Jakub Jelinek from comment #1)
> > For double-word compare and exchange you need double-word alignment, so I
> > think the current alignment is correct.
> 
> The instruction manual says that CMPXCHG16B requires 128-bit alignment, but
> doesn't say the same for CMPXCHG8B. It says that the AC(0) alignment check
> fault could happen if it's not aligned, but doesn't say what the required
> alignment is.

The more important point is that simple loads and stores are not atomic on
cache-line splits, so requiring natural alignment for atomic objects would
avoid that.  LOCKed read-modify-write ops are also *much* slower on cache-line
splits.
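
To make the cache-line-split point concrete, here is a minimal sketch of the
address arithmetic.  It assumes 64-byte cache lines (typical for current x86
CPUs, but an assumption here); the names kCacheLine, crosses_cache_line,
naturally_aligned and assert_no_split are made up for illustration and are not
part of libstdc++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines, typical for current x86 CPUs.
constexpr std::size_t kCacheLine = 64;

// True if an object of `size` bytes starting at `addr` spans two cache lines.
constexpr bool crosses_cache_line(std::uintptr_t addr, std::size_t size) {
    return (addr % kCacheLine) + size > kCacheLine;
}

// True if `addr` is a multiple of `size` (natural alignment when `size`
// is a power of 2).
constexpr bool naturally_aligned(std::uintptr_t addr, std::size_t size) {
    return addr % size == 0;
}

// An 8-byte object with only 4-byte alignment can straddle a line boundary...
static_assert(crosses_cache_line(60, 8), "4B-aligned 8B object can split");
// ...but a naturally aligned one cannot.
static_assert(naturally_aligned(56, 8), "56 is 8B-aligned");
static_assert(!crosses_cache_line(56, 8), "8B-aligned 8B object never splits");

// Debug-time check that an access of `size` bytes at `p` stays in one line.
inline void assert_no_split(const void* p, std::size_t size) {
    assert(!crosses_cache_line(reinterpret_cast<std::uintptr_t>(p), size));
}
```

With 4-byte alignment, an 8-byte object at offset 60 within a line splits
across two lines; with 8-byte alignment that offset can't occur.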


#AC isn't really relevant here, but I'd assume the check for CMPXCHG8B requires
8B alignment, since the instruction is really a single 8B atomic RMW.

#AC faults only happen when alignment checking is enabled (the AM bit in CR0
and the AC bit in EFLAGS both set, at CPL 3), which will cause *any* unaligned
access to fault.  Code all over the place assumes that unaligned accesses are
safe, e.g. glibc memcpy commonly uses unaligned loads for small
non-power-of-2 sizes or unaligned inputs.  So you can't really enable the AC
flag with normal code.
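
As an illustration of the kind of code that assumes unaligned accesses are
safe, here is a small sketch of the usual memcpy-based unaligned load/store
idiom.  The helper names load_unaligned and store_unaligned are hypothetical
(not from glibc or libstdc++).  On x86, compilers typically turn the
fixed-size memcpy into a single plain load or store, which is the sort of
access that would fault with alignment checking enabled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Read a T from buf + offset, with no alignment requirement on the address.
template <typename T>
T load_unaligned(const unsigned char* buf, std::size_t len, std::size_t offset) {
    assert(offset + sizeof(T) <= len);  // stay inside the buffer
    T value;
    std::memcpy(&value, buf + offset, sizeof(T));
    return value;
}

// Write a T to buf + offset, with no alignment requirement on the address.
template <typename T>
void store_unaligned(unsigned char* buf, std::size_t len, std::size_t offset,
                     T value) {
    assert(offset + sizeof(T) <= len);  // stay inside the buffer
    std::memcpy(buf + offset, &value, sizeof(T));
}
```

For example, storing a std::uint64_t at offset 3 of a byte buffer and loading
it back from the same offset round-trips the value, even though that address
is not 8-byte aligned.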

I assume this is why Intel was lazy about documenting the exact details of #AC
behaviour for this instruction, or figured it was obvious.

[Bug libstdc++/71660] [5/6/7/8 regression] alignment of std::atomic<8 byte primitive type> (long long, double) is wrong on x86
