[Bug target/113133] [14 Regression] ICE: SIGSEGV in mark_label_nuses(rtx_def*) (emit-rtl.cc:3896) with -O -fno-tree-ter -mavx512f -march=barcelona

2024-01-01 Thread haochen.jiang at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113133

--- Comment #11 from Haochen Jiang  ---
I just checked the code and the pattern. I think simply removing the
TARGET_EVEX512 condition is reasonable here. We should only allow x/ymm16+
for scalar instructions, but not for this pattern.

2023-12-29 Thread ubizjak at gmail dot com via Gcc-bugs

Uroš Bizjak  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #10 from Uroš Bizjak  ---
Fixed by a partial revert of
r14-4499-gc1eef66baa8dde706d7ea6921648e6016dc7c93d.

2023-12-29 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

--- Comment #9 from GCC Commits  ---
The master branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:1e7f9abb892443719c82bb17910caa8fb5eeec15

commit r14-6862-g1e7f9abb892443719c82bb17910caa8fb5eeec15
Author: Uros Bizjak 
Date:   Fri Dec 29 09:47:43 2023 +0100

i386: Fix TARGET_USE_VECTOR_FP_CONVERTS SF->DF float_extend splitter [PR113133]

The post-reload splitter currently allows xmm16+ registers with
TARGET_EVEX512.  The splitter changes SFmode of the output operand to
V4SFmode, but the vector mode is currently unsupported in xmm16+ without
TARGET_AVX512VL.  lowpart_subreg returns NULL_RTX in this case and the
compilation fails with invalid RTX.

The patch removes support for x/ymm16+ registers with TARGET_EVEX512.  The
support should be restored once ix86_hard_regno_mode_ok is fixed to allow
16-byte modes in x/ymm16+ with TARGET_EVEX512.

PR target/113133

gcc/ChangeLog:

* config/i386/i386.md
(TARGET_USE_VECTOR_FP_CONVERTS SF->DF float_extend splitter):
Do not handle xmm16+ with TARGET_EVEX512.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr113133-1.c: New test.
* gcc.target/i386/pr113133-2.c: New test.
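
For readers following along, the change to the splitter guard can be illustrated with a small standalone model. This is only a sketch: `struct target_flags`, `is_ext_rex_sse_reg`, `splitter_guard_before` and `splitter_guard_after` are hypothetical names invented here, not GCC's macros or functions, and the model assumes that `-mavx512f` alone leaves EVEX512 enabled and AVX512VL disabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the target flags; not GCC's real macros.  */
struct target_flags
{
  bool avx512vl;
  bool evex512;
};

/* Toy analogue of EXT_REX_SSE_REG_P: true for xmm16..xmm31.  */
static bool
is_ext_rex_sse_reg (int xmm_regno)
{
  assert (xmm_regno >= 0 && xmm_regno <= 31);
  return xmm_regno >= 16;
}

/* Register part of the splitter guard as it read before the fix:
   xmm16+ was accepted with either AVX512VL or EVEX512.  */
static bool
splitter_guard_before (int xmm_regno, struct target_flags t)
{
  return !is_ext_rex_sse_reg (xmm_regno) || t.avx512vl || t.evex512;
}

/* Register part of the guard after the fix: xmm16+ needs AVX512VL.  */
static bool
splitter_guard_after (int xmm_regno, struct target_flags t)
{
  return !is_ext_rex_sse_reg (xmm_regno) || t.avx512vl;
}
```

With `avx512vl = false` and `evex512 = true`, `splitter_guard_before (16, t)` is true while `splitter_guard_after (16, t)` is false, so the split no longer fires for xmm16 in that configuration; for xmm0 both guards stay true.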

2023-12-29 Thread ubizjak at gmail dot com via Gcc-bugs

--- Comment #8 from Uroš Bizjak  ---
(In reply to Haochen Jiang from comment #6)
> Aha, I see what happened. x/ymm16+ are usable for AVX512F w/o AVX512VL and
> that is why I added that to allow them.
> 
> Let me find a way to see if we can fix this.

It looks to me that ix86_hard_regno_mode_ok should be fixed to allow x/ymm16+
also with EVEX512. Currently we have:

  /* TODO check for QI/HI scalars.  */
  /* AVX512VL allows sse regs16+ for 128/256 bit modes.  */
  if (TARGET_AVX512VL
      && (VALID_AVX256_REG_OR_OI_MODE (mode)
          || VALID_AVX512VL_128_REG_MODE (mode)))
    return true;

so the compiler cannot use lowpart_subreg to change the mode of xmm16 from
some scalar modes to a 128-bit vector mode, e.g. DFmode to V4SFmode.
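
To make the failure mode concrete, here is a minimal standalone model of the interaction. It is a sketch under stated assumptions, not GCC code: every `toy_*` name is hypothetical, the mode check covers only the 128-bit case quoted above, and scalar SF/DF modes are assumed to be acceptable in xmm16+ with plain AVX512F.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; none of these names exist in GCC.  */
enum toy_mode { TOY_SF, TOY_DF, TOY_V4SF, TOY_V2DF };

struct toy_reg
{
  int regno;            /* xmm register number, 0..31 */
  enum toy_mode mode;
};

static bool
toy_is_128bit_vector (enum toy_mode m)
{
  return m == TOY_V4SF || m == TOY_V2DF;
}

/* Models only the quoted check: xmm16+ accepts 128-bit vector modes
   solely with AVX512VL.  Scalar modes are assumed to be allowed.  */
static bool
toy_hard_regno_mode_ok (int regno, enum toy_mode m, bool avx512vl)
{
  assert (regno >= 0 && regno <= 31);
  if (regno < 16)
    return true;
  if (toy_is_128bit_vector (m))
    return avx512vl;
  return true;
}

/* Toy analogue of lowpart_subreg: fills *out and returns it, or returns
   NULL when the register cannot hold the requested outer mode.  */
static struct toy_reg *
toy_lowpart_subreg (struct toy_reg *out, enum toy_mode outer,
                    const struct toy_reg *in, bool avx512vl)
{
  if (!toy_hard_regno_mode_ok (in->regno, outer, avx512vl))
    return NULL;
  out->regno = in->regno;
  out->mode = outer;
  return out;
}
```

In this model, asking for `TOY_V4SF` on a `TOY_DF` value in register 16 without AVX512VL yields NULL, which mirrors the NULL_RTX that the splitter then used as an operand. The same request succeeds for register 0, or for register 16 once AVX512VL is set.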

Please also note that your original patch did not add TARGET_EVEX512 to the
splitter that handles float_truncate with TARGET_USE_VECTOR_FP_CONVERTS.

I propose to proceed with the minimal fix from Comment #3 as a hotfix to
unbreak the testcase in this PR. The real, but more involved, fix is to
correct ix86_hard_regno_mode_ok, which I'll leave to you.

2023-12-28 Thread haochen.jiang at intel dot com via Gcc-bugs

--- Comment #7 from Haochen Jiang  ---
(In reply to Uroš Bizjak from comment #1)
> Created attachment 56962 [details]
> Proposed patch
> 
> Patch in testing.
> 
> lowpart_subreg can't handle:
> 
> lowpart_subreg (V4SFmode, operands[0], DFmode);
> 
> and
> 
> lowpart_subreg (V2DFmode, operands[0], SFmode);
> 
> subreg conversions and will return NULL_RTX for these cases.

I suppose the patch here is ok at least from my initial test.

2023-12-28 Thread haochen.jiang at intel dot com via Gcc-bugs

--- Comment #6 from Haochen Jiang  ---
Aha, I see what happened. x/ymm16+ are usable for AVX512F w/o AVX512VL and that
is why I added that to allow them.

Let me find a way to see if we can fix this.

2023-12-28 Thread haochen.jiang at intel dot com via Gcc-bugs

--- Comment #5 from Haochen Jiang  ---
(In reply to Uroš Bizjak from comment #3)
> This patch also fixes the failure:
> 
> --cut here--
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index ca6dbf42a6d..cdb9ddc4eb3 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -5210,7 +5210,7 @@ (define_split
>     && optimize_insn_for_speed_p ()
>     && reload_completed
>     && (!EXT_REX_SSE_REG_P (operands[0])
> -       || TARGET_AVX512VL || TARGET_EVEX512)"
> +       || TARGET_AVX512VL)"
>    [(set (match_dup 2)
>          (float_extend:V2DF
>            (vec_select:V2SF
> --cut here--

Hmm, it looks odd that I added EVEX512 next to AVX512VL; I am checking why I
did that.

2023-12-28 Thread ubizjak at gmail dot com via Gcc-bugs

Uroš Bizjak  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |haochen.jiang at intel dot com

--- Comment #4 from Uroš Bizjak  ---
Caused by r14-4499-gc1eef66baa8dde706d7ea6921648e6016dc7c93d

2023-12-28 Thread ubizjak at gmail dot com via Gcc-bugs

Uroš Bizjak  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |14.0

--- Comment #3 from Uroš Bizjak  ---
This patch also fixes the failure:

--cut here--
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index ca6dbf42a6d..cdb9ddc4eb3 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -5210,7 +5210,7 @@ (define_split
    && optimize_insn_for_speed_p ()
    && reload_completed
    && (!EXT_REX_SSE_REG_P (operands[0])
-       || TARGET_AVX512VL || TARGET_EVEX512)"
+       || TARGET_AVX512VL)"
   [(set (match_dup 2)
         (float_extend:V2DF
           (vec_select:V2SF
--cut here--

2023-12-28 Thread ubizjak at gmail dot com via Gcc-bugs

--- Comment #2 from Uroš Bizjak  ---
Another testcase:

--cut here--
void
foo1 (double *d, float f)
{
  register float x __asm ("xmm16") = f;
  asm volatile ("" : "+v" (x));

  *d = x;
}

void
foo2 (float *f, double d)
{
  register double x __asm ("xmm16") = d;
  asm volatile ("" : "+v" (x));

  *f = x;
}
--cut here--

2023-12-28 Thread ubizjak at gmail dot com via Gcc-bugs

Uroš Bizjak  changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
   Last reconfirmed|                              |2023-12-28
             Status|UNCONFIRMED                   |ASSIGNED
     Ever confirmed|0                             |1
           Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com

--- Comment #1 from Uroš Bizjak  ---
Created attachment 56962
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56962&action=edit
Proposed patch

Patch in testing.

lowpart_subreg can't handle:

lowpart_subreg (V4SFmode, operands[0], DFmode);

and

lowpart_subreg (V2DFmode, operands[0], SFmode);

subreg conversions and will return NULL_RTX for these cases.