https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71660
Peter Cordes <peter at cordes dot ca> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |peter at cordes dot ca

--- Comment #5 from Peter Cordes <peter at cordes dot ca> ---
(In reply to Thiago Macieira from comment #3)
> (In reply to Jakub Jelinek from comment #1)
> > For double-word compare and exchange you need double-word alignment, so I
> > think the current alignment is correct.
>
> The instruction manual says that CMPXCHG16B requires 128-bit alignment, but
> doesn't say the same for CMPXCHG8B. It says that the AC(0) alignment check
> fault could happen if it's not aligned, but doesn't say what the required
> alignment is.

The more important point is that simple loads and stores are not atomic on
cache-line splits, so requiring natural alignment for atomic objects would
avoid that. LOCKed read-modify-write ops are also *much* slower on cache-line
splits.

#AC isn't really relevant, but I'd assume it requires 8B alignment, since it's
really a single 8B atomic RMW.

#AC faults only happen if the kernel sets the AC bit in EFLAGS, which will
cause *any* unaligned access to fault. Code all over the place assumes that
unaligned accesses are safe, e.g. glibc memcpy commonly uses unaligned loads
for small non-power-of-2 sizes or unaligned inputs. So you can't really enable
the AC flag with normal code. I assume this is why Intel was lazy about
documenting the exact details of #AC behaviour for this instruction, or
figured it was obvious.
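
To illustrate the cache-line-split point, here is a minimal sketch in C. It is not taken from GCC or from the bug report: the helper names `crosses_cache_line` and `is_naturally_aligned` are hypothetical, and the 64-byte line size is an assumption that holds for current mainstream x86 CPUs but is not guaranteed everywhere. It shows that an 8-byte object with only 4-byte alignment can straddle two lines, while a naturally aligned one cannot.

```c
#include <assert.h>   /* static_assert */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines (typical for current x86 CPUs). */
#define CACHE_LINE_SIZE 64u

static_assert((CACHE_LINE_SIZE & (CACHE_LINE_SIZE - 1)) == 0,
              "cache line size must be a power of two");

/* Hypothetical helper: true if the bytes [addr, addr + size) span two
 * cache lines, i.e. the first and last byte fall in different lines. */
static inline bool crosses_cache_line(uintptr_t addr, size_t size)
{
    return (addr / CACHE_LINE_SIZE) != ((addr + size - 1) / CACHE_LINE_SIZE);
}

/* Hypothetical helper: true if addr is a multiple of size.
 * size must be a power of two (e.g. 8 for a 64-bit object). */
static inline bool is_naturally_aligned(uintptr_t addr, size_t size)
{
    return (addr & (size - 1)) == 0;
}
```

For example, an 8-byte object at offset 60 is 4-byte aligned but covers bytes 60..67, so it crosses the boundary at 64. At offset 56 it is naturally aligned and stays inside one line. Any power-of-two-sized object that is naturally aligned and no larger than a line fits entirely within a single line.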