Re: [x86 PATCH] Fix FAIL of gcc.target/i386/pr78794.c on ia32.

2023-06-27 Thread Uros Bizjak via Gcc-patches
On Tue, Jun 27, 2023 at 8:40 PM Roger Sayle wrote:
>
>
> This patch fixes the FAIL of gcc.target/i386/pr78794.c on ia32, which
> is caused by minor STV rtx_cost differences with -march=silvermont.
> It turns out that generic tuning results in pandn, but the lack of
> accurate parameterization for COMPARE in compute_convert_gain combined
> with small differences in scalar<->SSE costs on silvermont results in
> this DImode chain not being converted.
>
> The solution is to provide more accurate costs/gains for converting
> (DImode and SImode) comparisons.
>
> I'd been holding off on doing this as I'd thought it would be possible
> to turn pandn;ptestz into ptestc (for an even bigger scalar-to-vector
> win) but I've recently realized that these optimizations (as I've
> implemented them) occur in the wrong order (stv2 occurs after
> combine), so it isn't easy for STV to convert CCZmode into CCCmode.
> Doh!  Perhaps something can be done in peephole2...
>
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  Ok for mainline?
>
>
> 2023-06-27  Roger Sayle  
>
> gcc/ChangeLog
> PR target/78794
> * config/i386/i386-features.cc (compute_convert_gain): Provide
> more accurate gains for conversion of scalar comparisons to
> PTEST.

LGTM.

Thanks,
Uros.

>
> Thanks for your patience.
> Roger
> --
>


[x86 PATCH] Fix FAIL of gcc.target/i386/pr78794.c on ia32.

2023-06-27 Thread Roger Sayle

This patch fixes the FAIL of gcc.target/i386/pr78794.c on ia32, which
is caused by minor STV rtx_cost differences with -march=silvermont.
It turns out that generic tuning results in pandn, but the lack of
accurate parameterization for COMPARE in compute_convert_gain combined
with small differences in scalar<->SSE costs on silvermont results in
this DImode chain not being converted.

The solution is to provide more accurate costs/gains for converting
(DImode and SImode) comparisons.
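
As a rough illustration of the arithmetic, the gain formulas from the
patch can be restated as standalone C++.  This is only a sketch, not the
code in i386-features.cc: cmp_shape, compare_gain and has_bmi are
hypothetical names, costs_n_insns mirrors GCC's COSTS_N_INSNS (N) macro
(defined as (N) * 4), and m is assumed to be the number of scalar
instructions per operation (2 for DImode on ia32, otherwise 1).

```cpp
#include <cassert>

// Mirrors GCC's COSTS_N_INSNS (N), which rtl.h defines as ((N) * 4).
constexpr int costs_n_insns (int n) { return n * 4; }

// Hypothetical classification of the COMPARE source; not a GCC type.
enum class cmp_shape
{
  cmp_nonzero, // second operand is not const0_rtx
  test_plain,  // compare against zero, first operand is not an AND
  and_test,    // (x & y) compared against zero
  andn_test    // (~x & y) compared against zero
};

// m: assumed scalar insns per operation (2 for DImode on ia32, else 1).
// has_bmi: stands in for TARGET_BMI.
constexpr int compare_gain (cmp_shape shape, int m, bool has_bmi)
{
  switch (shape)
    {
    case cmp_shape::cmp_nonzero:
      return costs_n_insns (m - 3);      // cmp vs. pxor;pshufd;ptest
    case cmp_shape::test_plain:
      return costs_n_insns (m - 2);      // test vs. pshufd;ptest
    case cmp_shape::and_test:
      return costs_n_insns (2 * m - 2);  // and;test vs. pshufd;ptest
    case cmp_shape::andn_test:
      return has_bmi
	     ? costs_n_insns (2 * m - 3)  // andn;test vs. pandn;pshufd;ptest
	     : costs_n_insns (3 * m - 3); // not;and;test vs. pandn;pshufd;ptest
    }
  return 0;
}
```

Under these assumptions, a DImode andn-style comparison on ia32 without
BMI (m = 2) contributes 12 cost units, where the old code contributed
nothing for any COMPARE.  A comparison against a non-zero value
contributes -4, so it still counts against conversion.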

I'd been holding off on doing this as I'd thought it would be possible
to turn pandn;ptestz into ptestc (for an even bigger scalar-to-vector
win) but I've recently realized that these optimizations (as I've
implemented them) occur in the wrong order (stv2 occurs after
combine), so it isn't easy for STV to convert CCZmode into CCCmode.
Doh!  Perhaps something can be done in peephole2...
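
For reference, the equivalence behind the pandn;ptestz -> ptestc idea can
be sketched on a single 64-bit lane (the real instructions operate on 128
bits).  The helper names below are hypothetical.  The flag definitions
follow the documented PTEST semantics: ZF is set when (dest AND src) is
all zeros, and CF is set when ((NOT dest) AND src) is all zeros.

```cpp
#include <cassert>
#include <cstdint>

// One-lane model of PTEST's flags; real PTEST tests all 128 bits.
constexpr bool ptest_zf (std::uint64_t dest, std::uint64_t src)
{
  return (dest & src) == 0;
}

constexpr bool ptest_cf (std::uint64_t dest, std::uint64_t src)
{
  return (~dest & src) == 0;
}

// PANDN computes (NOT dest) AND src.
constexpr std::uint64_t pandn (std::uint64_t dest, std::uint64_t src)
{
  return ~dest & src;
}

// Two-instruction form: pandn, then ptest of the result against itself,
// reading ZF (a CCZmode-style result).
constexpr bool pandn_then_ptestz (std::uint64_t x, std::uint64_t y)
{
  const std::uint64_t t = pandn (x, y);
  return ptest_zf (t, t);
}

// One-instruction form: ptest directly, reading CF (a CCCmode-style result).
constexpr bool ptestc_only (std::uint64_t x, std::uint64_t y)
{
  return ptest_cf (x, y);
}
```

Both forms answer the same question (is (~x & y) zero?), but the answer
arrives in a different flag, which is why the conversion needs a change
of condition-code mode and not just a change of instruction.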


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2023-06-27  Roger Sayle  

gcc/ChangeLog
        PR target/78794
        * config/i386/i386-features.cc (compute_convert_gain): Provide
        more accurate gains for conversion of scalar comparisons to
        PTEST.


Thanks for your patience.
Roger
--

diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index 4a3b07a..53bec08 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -631,7 +631,31 @@ general_scalar_chain::compute_convert_gain ()
       break;
 
     case COMPARE:
-      /* Assume comparison cost is the same.  */
+      if (XEXP (src, 1) != const0_rtx)
+        {
+          /* cmp vs. pxor;pshufd;ptest.  */
+          igain += COSTS_N_INSNS (m - 3);
+        }
+      else if (GET_CODE (XEXP (src, 0)) != AND)
+        {
+          /* test vs. pshufd;ptest.  */
+          igain += COSTS_N_INSNS (m - 2);
+        }
+      else if (GET_CODE (XEXP (XEXP (src, 0), 0)) != NOT)
+        {
+          /* and;test vs. pshufd;ptest.  */
+          igain += COSTS_N_INSNS (2 * m - 2);
+        }
+      else if (TARGET_BMI)
+        {
+          /* andn;test vs. pandn;pshufd;ptest.  */
+          igain += COSTS_N_INSNS (2 * m - 3);
+        }
+      else
+        {
+          /* not;and;test vs. pandn;pshufd;ptest.  */
+          igain += COSTS_N_INSNS (3 * m - 3);
+        }
       break;
 
     case CONST_INT: