On Mon, 8 Dec 2025, Uros Bizjak wrote:

> On Mon, Dec 8, 2025 at 2:41 PM Richard Biener <[email protected]> wrote:
> >
> > The following adjusts costing of vector construction from scalars for
> > FP modes which with 387 math can reside in FP regs which need spilling
> > to be reloaded to XMM.  I've played on the safe side with mixed
> > SSE/387 math.
> >
> > Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> >
> > OK?
> >
> > Thanks,
> > Richard.
> >
> >         PR target/121230
> >         * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
> >         With FP mode and 387 math cost spill/reload.
> >
> >         * gcc.target/i386/pr121230.c: New testcase.
> > ---
> >  gcc/config/i386/i386.cc                  | 15 ++++++++++++++-
> >  gcc/testsuite/gcc.target/i386/pr121230.c | 16 ++++++++++++++++
> >  2 files changed, 30 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121230.c
> >
> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > index db43045753b..ad978d7474d 100644
> > --- a/gcc/config/i386/i386.cc
> > +++ b/gcc/config/i386/i386.cc
> > @@ -26397,7 +26397,20 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
> >                                 (TREE_OPERAND (gimple_assign_rhs1 (def), 0))))))
> >             {
> >               if (fp)
> > -               m_num_sse_needed[where]++;
> > +               {
> > +                 /* Scalar FP values residing in x87 registers need to be
> > +                    spilled and reloaded.  */
> > +                 if (ix86_fpmath & FPMATH_387)
> 
> Perhaps you can use the IS_STACK_MODE() macro, it determines more
> precisely which mode is handled in stack registers.

Sure (though in practice only SFmode and DFmode get vectorized?).
Like the following.

Re-testing on x86_64-unknown-linux-gnu, OK?

Thanks,
Richard.

From 69c63b06daf193fcc5fa2e0093db0d4198b75432 Mon Sep 17 00:00:00 2001
From: Richard Biener <[email protected]>
Date: Mon, 8 Dec 2025 14:36:58 +0100
Subject: [PATCH] target/121230 - x86 vector CTOR cost with 387 math
To: [email protected]

The following adjusts the costing of vector construction from scalars
for FP modes.  With 387 math the scalars can reside in x87 registers,
which need to be spilled and reloaded to get them into XMM registers.
I've played it safe with mixed SSE/387 math.

        PR target/121230
        * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
        With FP mode and 387 math cost spill/reload.

        * gcc.target/i386/pr121230.c: New testcase.
---
 gcc/config/i386/i386.cc                  | 15 ++++++++++++++-
 gcc/testsuite/gcc.target/i386/pr121230.c | 16 ++++++++++++++++
 2 files changed, 30 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr121230.c

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index db43045753b..75a9cb6211a 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -26397,7 +26397,20 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
                                (TREE_OPERAND (gimple_assign_rhs1 (def), 0))))))
            {
              if (fp)
-               m_num_sse_needed[where]++;
+               {
+                 /* Scalar FP values residing in x87 registers need to be
+                    spilled and reloaded.  */
+                 auto mode2 = TYPE_MODE (TREE_TYPE (op));
+                 if (IS_STACK_MODE (mode2))
+                   {
+                     int cost
+                       = (ix86_cost->hard_register.fp_store[mode2 == SFmode
+                                                            ? 0 : 1]
+                          + ix86_cost->sse_load[sse_store_index (mode2)]);
+                     stmt_cost += COSTS_N_INSNS (cost) / 2;
+                   }
+                 m_num_sse_needed[where]++;
+               }
              else
                {
                  m_num_gpr_needed[where]++;
diff --git a/gcc/testsuite/gcc.target/i386/pr121230.c b/gcc/testsuite/gcc.target/i386/pr121230.c
new file mode 100644
index 00000000000..67c9c5ccb2d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr121230.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O3 -march=athlon-xp -mfpmath=387 -fexcess-precision=standard" } */
+
+typedef struct {
+    float a;
+    float b;
+} f32_2;
+
+f32_2 add32_2(f32_2 x, f32_2 y) {
+    return (f32_2){ x.a + y.a, x.b + y.b};
+}
+
+/* We do not want the vectorizer to vectorize the store and/or the
+   conversion (with IA32 we do not support V2SF add) given that doing so
+   spills FP regs only to reload them to XMM.  */
+/* { dg-final { scan-assembler-not "movss\[ \\t\]+\[0-9\]*\\\(%esp\\\), %xmm" } } */
-- 
2.51.0
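
For readers following the costing change, here is a minimal standalone
C++ sketch of the arithmetic the patch adds.  It is not GCC code:
ToyCosts, Mode, ctor_elt_extra_cost and the numeric table values are
hypothetical, and the in_x87 flag merely stands in for the
IS_STACK_MODE test.  The one detail mirrored from GCC is that
COSTS_N_INSNS (N) expands to N * 4.

```cpp
// Hypothetical stand-ins for illustration; these are not GCC's types.
enum class Mode { SF, DF };

// Mirrors GCC's COSTS_N_INSNS (N), which is defined as N * 4.
constexpr int costs_n_insns(int n) { return n * 4; }

// Toy cost table (assumption): index 0 is SFmode, index 1 is DFmode.
struct ToyCosts {
  int fp_store[2];  // cost of storing an x87 register to memory
  int sse_load[2];  // cost of loading an SSE register from memory
};

// Extra cost charged for one scalar FP element of a vector constructor.
// in_x87 plays the role of the IS_STACK_MODE check in the patch: when
// the scalar lives in an x87 register it has to be stored to memory and
// then loaded into an XMM register before it can enter the vector.
constexpr int ctor_elt_extra_cost(const ToyCosts& c, Mode m, bool in_x87) {
  if (!in_x87)
    return 0;  // value is already usable from an SSE register
  const int idx = (m == Mode::SF) ? 0 : 1;
  const int cost = c.fp_store[idx] + c.sse_load[idx];
  // Same scaling as the patch: COSTS_N_INSNS (cost) / 2.
  return costs_n_insns(cost) / 2;
}
```

With the made-up table { {4, 6}, {2, 3} } the SFmode element adds
(4 + 2) * 4 / 2 = 12 and the DFmode element adds (6 + 3) * 4 / 2 = 18
to the statement cost; nothing is added when the scalar is not in an
x87 register.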
