https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

--- Comment #48 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jonathan Yong <[email protected]>:

https://gcc.gnu.org/g:77b2eaf09c77bf2dddcb8c35ee8bc9cc99e2f93c

commit r17-839-g77b2eaf09c77bf2dddcb8c35ee8bc9cc99e2f93c
Author: oltolm <[email protected]>
Date:   Sun May 24 12:57:49 2026 +0200

    x86: fix under-aligned indirect AVX argument/return stack slots on win64 [PR54412]

    Fix x86 caller/callee handling for over-aligned indirect arguments/returns

    On x86_64-w64-mingw32, TARGET_SEH limits MAX_SUPPORTED_STACK_ALIGNMENT
    to 128 bits, but 256-bit AVX values are still passed and returned
    indirectly.  Some caller/callee stack-slot paths still used generic
    allocators that cap requested alignment to MAX_SUPPORTED_STACK_ALIGNMENT,
    producing slots that are under-aligned for later vmovapd/vmovaps accesses.

    Fix caller-side paths by using dynamically allocated stack space for:

    * over-aligned by-reference argument copies
    * over-aligned hidden return slots

    Fix callee-side paths by overallocating the local stack slot, then
    aligning the effective address within that slot when required alignment
    exceeds MAX_SUPPORTED_STACK_ALIGNMENT.
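
    The overallocate-then-align idea can be sketched in plain C as below.
    This is only an illustration of the arithmetic, not the code added to
    function.cc: MAX_SLOT_ALIGN, align_up, overallocated_size and
    aligned_address_in_slot are hypothetical names, and the 16-byte cap
    stands in for the 128-bit MAX_SUPPORTED_STACK_ALIGNMENT mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the largest alignment the frame can
   guarantee for a slot (128 bits = 16 bytes in this sketch).  */
#define MAX_SLOT_ALIGN 16u

/* Round ADDR up to the next multiple of ALIGN.  ALIGN must be a
   nonzero power of two.  */
static uintptr_t
align_up (uintptr_t addr, uintptr_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (addr + align - 1) & ~(align - 1);
}

/* Number of bytes to reserve so that an object of SIZE bytes aligned
   to ALIGN always fits inside a slot that is only guaranteed to be
   aligned to MAX_SLOT_ALIGN.  No padding is needed when ALIGN does
   not exceed the cap.  */
static size_t
overallocated_size (size_t size, size_t align)
{
  return align > MAX_SLOT_ALIGN ? size + (align - MAX_SLOT_ALIGN) : size;
}

/* Effective address inside SLOT that satisfies ALIGN.  The caller is
   assumed to have reserved overallocated_size () bytes for SLOT.  */
static void *
aligned_address_in_slot (void *slot, size_t align)
{
  return (void *) align_up ((uintptr_t) slot, align);
}
```

    Under these assumptions a 32-byte, 32-byte-aligned value needs a
    48-byte slot: a 16-byte-aligned base is at most 16 bytes short of the
    next 32-byte boundary, so the aligned object still ends inside the slot.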

    This preserves ABI behavior while ensuring alignment-sensitive AVX
    accesses are correctly aligned in both caller and callee paths.

    Use a target hook to control when this over-aligned stack-slot handling is
    required, instead of hardcoding target conditionals in generic code.

    gcc/ChangeLog:

            PR target/54412
            * target.def (overaligned_stack_slot_required): New calls hook.
            * calls.cc (allocate_call_dynamic_stack_space): New helper.
            (initialize_argument_information): Use
            targetm.calls.overaligned_stack_slot_required for over-aligned
            by-reference argument copies.
            (expand_call): Use
            targetm.calls.overaligned_stack_slot_required for over-aligned
            hidden return slots.
            * function.cc (assign_stack_local_aligned): New helper.
            (assign_parm_setup_block): Use
            targetm.calls.overaligned_stack_slot_required for over-aligned
            stack parm slots.
            (assign_parm_setup_reg): Likewise.
            * config/i386/i386.cc (ix86_overaligned_stack_slot_required): New.
            (TARGET_OVERALIGNED_STACK_SLOT_REQUIRED): Define for i386.
            * doc/tm.texi.in: Add hook placement.
            * doc/tm.texi: Regenerate.

    Signed-off-by: oltolm <[email protected]>
    Signed-off-by: Jonathan Yong <[email protected]>
