[Bug middle-end/111754] [14 Regression] ICE: in decompose, at rtl.h:2313 at -O

2023-11-28 Thread prathamesh3492 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754

--- Comment #15 from prathamesh3492 at gcc dot gnu.org ---
Sorry for the regression, and thanks for the prompt fix!

[Bug middle-end/111754] [14 Regression] ICE: in decompose, at rtl.h:2313 at -O

2023-11-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754

--- Comment #14 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:e6c01334ccfca8bc748c8de90ba2a636d1662490

commit r14-5902-ge6c01334ccfca8bc748c8de90ba2a636d1662490
Author: Jakub Jelinek 
Date:   Tue Nov 28 10:16:47 2023 +0100

testsuite: Fix up pr111754.c test

On Tue, Nov 28, 2023 at 03:56:47PM +0800, juzhe.zh...@rivai.ai wrote:
> Hi, there is a regression in RISC-V caused by this patch:
>
> FAIL: gcc.dg/vect/pr111754.c -flto -ffat-lto-objects  scan-tree-dump
optimized "return { 0.0, 9.0e\\+0, 0.0, 0.0 }"
> FAIL: gcc.dg/vect/pr111754.c scan-tree-dump optimized "return { 0.0,
9.0e\\+0, 0.0, 0.0 }"
>
> I have checked the dump is :
> F foo (F a, F b)
> {
>[local count: 1073741824]:
>= { 0.0, 9.0e+0, 0.0, 0.0 };
>   return ;
>
> }
>
> The dump IR seems reasonable to me.
> I wonder whether we should walk around in RISC-V backend to generate the
same IR as ARM SVE ?
> Or we should adjust the test ?

Note, the test also FAILs on i686-linux (but not e.g. on x86_64-linux):
/home/jakub/src/gcc/obj67/gcc/xgcc -B/home/jakub/src/gcc/obj67/gcc/
/home/jakub/src/gcc/gcc/testsuite/gcc.dg/vect/pr111754.c
-fdiagnostics-plain-output -O2 -fdump-tree-optimized -S
+-o pr111754.s
/home/jakub/src/gcc/gcc/testsuite/gcc.dg/vect/pr111754.c: In function
'foo':
/home/jakub/src/gcc/gcc/testsuite/gcc.dg/vect/pr111754.c:7:1: warning: SSE
vector return without SSE enabled changes the ABI [-Wpsabi]
/home/jakub/src/gcc/gcc/testsuite/gcc.dg/vect/pr111754.c:6:3: note: the ABI
for passing parameters with 16-byte alignment has changed in GCC 4.6
/home/jakub/src/gcc/gcc/testsuite/gcc.dg/vect/pr111754.c:6:3: warning: SSE
vector argument without SSE enabled changes the ABI [-Wpsabi]
FAIL: gcc.dg/vect/pr111754.c (test for excess errors)
Excess errors:
/home/jakub/src/gcc/gcc/testsuite/gcc.dg/vect/pr111754.c:7:1: warning: SSE
vector return without SSE enabled changes the ABI [-Wpsabi]
/home/jakub/src/gcc/gcc/testsuite/gcc.dg/vect/pr111754.c:6:3: warning: SSE
vector argument without SSE enabled changes the ABI [-Wpsabi]

PASS: gcc.dg/vect/pr111754.c scan-tree-dump-not optimized "VEC_PERM_EXPR"
FAIL: gcc.dg/vect/pr111754.c scan-tree-dump optimized "return { 0.0,
9.0e\\+0, 0.0, 0.0 }"

So, I think it is wrong to specify
/* { dg-options "-O2 -fdump-tree-optimized" } */
in the test, should be dg-additional-options instead, so that it gets
the implied vector compilation options e.g. for i686-linux (-msse2 in that
case at least), question is if -Wno-psabi should be added as well or not,
and certainly the scan-tree-dump needs to be guarded by appropriate
vect_* effective target (but dunno which, one which asserts support for
V4SFmode and returning it).
Alternatively, perhaps don't check optimized dump but some earlier one
before generic vector lowering, then hopefully it could match on all
targets?  Maybe with the  = ... vs. return ... variants.

2023-11-28  Jakub Jelinek  

PR middle-end/111754
* gcc.dg/vect/pr111754.c: Use dg-additional-options rather than
dg-options, add -Wno-psabi and use -fdump-tree-forwprop1 rather
than
-fdump-tree-optimized.  Scan forwprop1 dump rather than optimized
and
scan for either direct return or setting of  to the vector.

[Bug middle-end/111754] [14 Regression] ICE: in decompose, at rtl.h:2313 at -O

2023-11-27 Thread prathamesh3492 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754

prathamesh3492 at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #13 from prathamesh3492 at gcc dot gnu.org ---
Fixed.

[Bug middle-end/111754] [14 Regression] ICE: in decompose, at rtl.h:2313 at -O

2023-11-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754

--- Comment #12 from GCC Commits  ---
The master branch has been updated by Prathamesh Kulkarni
:

https://gcc.gnu.org/g:2065438db4ac13af7e0de2f934959413647f74a7

commit r14-5890-g2065438db4ac13af7e0de2f934959413647f74a7
Author: Prathamesh Kulkarni 
Date:   Mon Nov 27 22:40:49 2023 +0530

PR111754: Rework encoding of result for VEC_PERM_EXPR with constant input
vectors.

gcc/ChangeLog:
PR middle-end/111754
* fold-const.cc (fold_vec_perm_cst): Set result's encoding to sel's
encoding, and set res_nelts_per_pattern to 2 if sel contains
stepped
sequence but input vectors do not.
(test_nunits_min_2): New test Case 8.
(test_nunits_min_4): New tests Case 8 and Case 9.

gcc/testsuite/ChangeLog:
PR middle-end/111754
* gcc.target/aarch64/sve/slp_3.c: Adjust code-gen.
* gcc.target/aarch64/sve/slp_4.c: Likewise.
* gcc.dg/vect/pr111754.c: New test.

Co-authored-by: Richard Sandiford 

[Bug middle-end/111754] [14 Regression] ICE: in decompose, at rtl.h:2313 at -O

2023-11-10 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754

Sam James  changed:

   What|Removed |Added

 CC||sjames at gcc dot gnu.org
   Keywords|needs-bisection |

--- Comment #11 from Sam James  ---
(In reply to Richard Sandiford from comment #10)
> Yeah, the problem went latent after the fix for PR111648, [...]

OK, removing needs-bisection then.

[Bug middle-end/111754] [14 Regression] ICE: in decompose, at rtl.h:2313 at -O

2023-11-05 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754

--- Comment #10 from Richard Sandiford  ---
Yeah, the problem went latent after the fix for PR111648, but the underlying
problem is still there.  Prathamesh is working on a fix for that.  See the
thread starting at
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633784.html for
discussion.

[Bug middle-end/111754] [14 Regression] ICE: in decompose, at rtl.h:2313 at -O

2023-11-05 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754

--- Comment #9 from Andrew Pinski  ---
The ICE seems to be gone, and the generated code looks correct.

[Bug middle-end/111754] [14 Regression] ICE: in decompose, at rtl.h:2313 at -O

2023-10-10 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754

--- Comment #8 from rguenther at suse dot de  ---
> Am 10.10.2023 um 16:07 schrieb prathamesh3492 at gcc dot gnu.org 
> :
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754
> 
> --- Comment #6 from prathamesh3492 at gcc dot gnu.org ---
> (In reply to rguent...@suse.de from comment #4)
>>> On Tue, 10 Oct 2023, prathamesh3492 at gcc dot gnu.org wrote:
>>> 
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754
>>> 
>>> --- Comment #3 from prathamesh3492 at gcc dot gnu.org ---
>>> The issue is that we only support integral vector types in 
>>> fold_vec_perm_cst,
>>> but fail to check for the same before calling it from fold_vec_perm.
>>> The following tweak fixes the ICE:
>>> 
>>> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
>>> index 4f8561509ff..a29a8af6d2f 100644
>>> --- a/gcc/fold-const.cc
>>> +++ b/gcc/fold-const.cc
>>> @@ -10801,7 +10801,8 @@ fold_vec_perm (tree type, tree arg0, tree arg1, 
>>> const
>>> vec_perm_indices )
>>> return NULL_TREE;
>>> 
>>>   if (TREE_CODE (arg0) == VECTOR_CST
>>> -  && TREE_CODE (arg1) == VECTOR_CST)
>>> +  && TREE_CODE (arg1) == VECTOR_CST
>>> +  && INTEGRAL_TYPE_P (TREE_TYPE (type)))
>>> return fold_vec_perm_cst (type, arg0, arg1, sel);
>> 
>> Huh, that looks wrong.  I fail to see how the element type matters
>> at all.
> 
> IIUC, the element type matters for VLA folding when sel has a stepped sequence
> because in that case we need to derive elements from the encoding using
> vector_cst_elt / vector_cst_int_elt, and it gets enforced for VLS vectors too
> because they are handled in unified manner in fold_vec_perm_cst.
> 
> Another possible approach is to use "VLS exception" in fold_vec_perm_cst to
> encode all the elements:
> res_npatterns = res_nelts;
> res_nelts_per_patterns = 1
> just like we do if valid_mask_for_fold_vec_perm_cst_p returns false.
> 
> Does the following fix look OK instead ?

I think so, the important part is to fold for VLS types.  I defer to Richard S.
For whether we can do better for VLA here

> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index 4f8561509ff..356eb052fbc 100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -10642,6 +10642,11 @@ valid_mask_for_fold_vec_perm_cst_p (tree arg0, tree
> arg1,
>   if (sel_nelts_per_pattern < 3)
> return true;
> 
> +  /* If SEL contains stepped sequence, ensure that we are dealing with
> + integral vector_cst.  */
> +  if (!INTEGRAL_TYPE_P (TREE_TYPE (arg0)))
> +return false;
> +
>   for (unsigned pattern = 0; pattern < sel_npatterns; pattern++)
> {
>   poly_uint64 a1 = sel[pattern + sel_npatterns];
> 
> -- 
> You are receiving this mail because:
> You are on the CC list for the bug.

[Bug middle-end/111754] [14 Regression] ICE: in decompose, at rtl.h:2313 at -O

2023-10-10 Thread prathamesh3492 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754

--- Comment #7 from prathamesh3492 at gcc dot gnu.org ---
(In reply to Richard Biener from comment #5)
> It seems we have VECTOR_CST_NELTS_PER_PATTERN ({ 9.0e+0, 0.0, 0.0, 0.0 })
> 2 and VECTOR_CST_NPATTERNS == 1.  And the selector { 1, 0, 1, 2 } has
> npatterns == 1 and nelts-per-pattern == 3.
> 
>   /* (1) If SEL is a suitable mask as determined by
>  valid_mask_for_fold_vec_perm_cst_p, then:
>  res_npatterns = max of npatterns between ARG0, ARG1, and SEL
>  res_nelts_per_pattern = max of nelts_per_pattern between
>  ARG0, ARG1 and SEL.
>  (2) If SEL is not a suitable mask, and TYPE is VLS then:
>  res_npatterns = nelts in result vector.
>  res_nelts_per_pattern = 1.
>  This exception is made so that VLS ARG0, ARG1 and SEL work as before. 
> */
>   if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason))
> {
>   res_npatterns
> = std::max (VECTOR_CST_NPATTERNS (arg0),
> std::max (VECTOR_CST_NPATTERNS (arg1),
>   sel.encoding ().npatterns ()));
> 
>   res_nelts_per_pattern
> = std::max (VECTOR_CST_NELTS_PER_PATTERN (arg0),
> std::max (VECTOR_CST_NELTS_PER_PATTERN (arg1),
>   sel.encoding ().nelts_per_pattern ()));
> 
>   res_nelts = res_npatterns * res_nelts_per_pattern;
> 
> this seems to be a case that doesn't fit, so the fix needs to be to
> valid_mask_for_fold_vec_perm_cst_p which really looks a bit
> unwieldly.
valid_mask_for_fold_vec_perm_cst_p returns incorrectly true here,
which is being addressed in PR111648 patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/631926.html

Even if the vectors had integral element type:
arg0 = arg1 = (v4si){ 9, 0, 0, 0 }  // encoded as {9, 0, ...}
and sel = { 1, 0, 1, 2 }  // encoded as {1, 0, 1, ...}

The pattern in sel {1, 0, 1, ...}
would choose elements from arg0, and
res would have incorrect encoding with step = -9:
res = { arg0[1], arg0[0], arg0[1], ... } 
= { 0, 9, 0, ... }
And res[3] will be incorrectly computed as -9 instead of arg0[2].

However, for floating element types, even if encoding is correct,
I assume it will still ICE when trying to derive elements not present in
encoding since poly_int_cst can only deal with integral elements ?
> 
> An assert that res_nelts is power-of-two would be nice to add.
Sorry, I don't understand. res_nelts for VLA need not be power of 2,
since res_nelts_per_pattern can be 3. The encoding for res is chosen
to be max of npatterns and max of nelts_per_pattern between arg0, arg1, and
sel.

Thanks,
Prathamesh

[Bug middle-end/111754] [14 Regression] ICE: in decompose, at rtl.h:2313 at -O

2023-10-10 Thread prathamesh3492 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754

--- Comment #6 from prathamesh3492 at gcc dot gnu.org ---
(In reply to rguent...@suse.de from comment #4)
> On Tue, 10 Oct 2023, prathamesh3492 at gcc dot gnu.org wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754
> > 
> > --- Comment #3 from prathamesh3492 at gcc dot gnu.org ---
> > The issue is that we only support integral vector types in 
> > fold_vec_perm_cst,
> > but fail to check for the same before calling it from fold_vec_perm.
> > The following tweak fixes the ICE:
> > 
> > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> > index 4f8561509ff..a29a8af6d2f 100644
> > --- a/gcc/fold-const.cc
> > +++ b/gcc/fold-const.cc
> > @@ -10801,7 +10801,8 @@ fold_vec_perm (tree type, tree arg0, tree arg1, 
> > const
> > vec_perm_indices )
> >  return NULL_TREE;
> > 
> >if (TREE_CODE (arg0) == VECTOR_CST
> > -  && TREE_CODE (arg1) == VECTOR_CST)
> > +  && TREE_CODE (arg1) == VECTOR_CST
> > +  && INTEGRAL_TYPE_P (TREE_TYPE (type)))
> >  return fold_vec_perm_cst (type, arg0, arg1, sel);
> 
> Huh, that looks wrong.  I fail to see how the element type matters
> at all.

IIUC, the element type matters for VLA folding when sel has a stepped sequence
because in that case we need to derive elements from the encoding using
vector_cst_elt / vector_cst_int_elt, and it gets enforced for VLS vectors too
because they are handled in unified manner in fold_vec_perm_cst.

Another possible approach is to use "VLS exception" in fold_vec_perm_cst to
encode all the elements:
res_npatterns = res_nelts;
res_nelts_per_patterns = 1
just like we do if valid_mask_for_fold_vec_perm_cst_p returns false.

Does the following fix look OK instead ?

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 4f8561509ff..356eb052fbc 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -10642,6 +10642,11 @@ valid_mask_for_fold_vec_perm_cst_p (tree arg0, tree
arg1,
   if (sel_nelts_per_pattern < 3)
 return true;

+  /* If SEL contains stepped sequence, ensure that we are dealing with
+ integral vector_cst.  */
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (arg0)))
+return false;
+
   for (unsigned pattern = 0; pattern < sel_npatterns; pattern++)
 {
   poly_uint64 a1 = sel[pattern + sel_npatterns];

[Bug middle-end/111754] [14 Regression] ICE: in decompose, at rtl.h:2313 at -O

2023-10-10 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754

Richard Biener  changed:

   What|Removed |Added

 CC||rsandifo at gcc dot gnu.org

--- Comment #5 from Richard Biener  ---
It seems we have VECTOR_CST_NELTS_PER_PATTERN ({ 9.0e+0, 0.0, 0.0, 0.0 })
2 and VECTOR_CST_NPATTERNS == 1.  And the selector { 1, 0, 1, 2 } has
npatterns == 1 and nelts-per-pattern == 3.

  /* (1) If SEL is a suitable mask as determined by
 valid_mask_for_fold_vec_perm_cst_p, then:
 res_npatterns = max of npatterns between ARG0, ARG1, and SEL
 res_nelts_per_pattern = max of nelts_per_pattern between
 ARG0, ARG1 and SEL.
 (2) If SEL is not a suitable mask, and TYPE is VLS then:
 res_npatterns = nelts in result vector.
 res_nelts_per_pattern = 1.
 This exception is made so that VLS ARG0, ARG1 and SEL work as before.  */
  if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason))
{
  res_npatterns
= std::max (VECTOR_CST_NPATTERNS (arg0),
std::max (VECTOR_CST_NPATTERNS (arg1),
  sel.encoding ().npatterns ()));

  res_nelts_per_pattern
= std::max (VECTOR_CST_NELTS_PER_PATTERN (arg0),
std::max (VECTOR_CST_NELTS_PER_PATTERN (arg1),
  sel.encoding ().nelts_per_pattern ()));

  res_nelts = res_npatterns * res_nelts_per_pattern;

this seems to be a case that doesn't fit, so the fix needs to be to
valid_mask_for_fold_vec_perm_cst_p which really looks a bit
unwieldly.

An assert that res_nelts is power-of-two would be nice to add.

[Bug middle-end/111754] [14 Regression] ICE: in decompose, at rtl.h:2313 at -O

2023-10-10 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754

--- Comment #4 from rguenther at suse dot de  ---
On Tue, 10 Oct 2023, prathamesh3492 at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754
> 
> --- Comment #3 from prathamesh3492 at gcc dot gnu.org ---
> The issue is that we only support integral vector types in fold_vec_perm_cst,
> but fail to check for the same before calling it from fold_vec_perm.
> The following tweak fixes the ICE:
> 
> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index 4f8561509ff..a29a8af6d2f 100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -10801,7 +10801,8 @@ fold_vec_perm (tree type, tree arg0, tree arg1, const
> vec_perm_indices )
>  return NULL_TREE;
> 
>if (TREE_CODE (arg0) == VECTOR_CST
> -  && TREE_CODE (arg1) == VECTOR_CST)
> +  && TREE_CODE (arg1) == VECTOR_CST
> +  && INTEGRAL_TYPE_P (TREE_TYPE (type)))
>  return fold_vec_perm_cst (type, arg0, arg1, sel);

Huh, that looks wrong.  I fail to see how the element type matters
at all.

[Bug middle-end/111754] [14 Regression] ICE: in decompose, at rtl.h:2313 at -O

2023-10-10 Thread prathamesh3492 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754

--- Comment #3 from prathamesh3492 at gcc dot gnu.org ---
The issue is that we only support integral vector types in fold_vec_perm_cst,
but fail to check for the same before calling it from fold_vec_perm.
The following tweak fixes the ICE:

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 4f8561509ff..a29a8af6d2f 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -10801,7 +10801,8 @@ fold_vec_perm (tree type, tree arg0, tree arg1, const
vec_perm_indices )
 return NULL_TREE;

   if (TREE_CODE (arg0) == VECTOR_CST
-  && TREE_CODE (arg1) == VECTOR_CST)
+  && TREE_CODE (arg1) == VECTOR_CST
+  && INTEGRAL_TYPE_P (TREE_TYPE (type)))
 return fold_vec_perm_cst (type, arg0, arg1, sel);

   /* For fall back case, we want to ensure we have VLS vectors

and results in the following .optimized dump:
F bar (F a, F b)
{
  F c;

   [local count: 1073741824]:
  c_2 = VEC_PERM_EXPR ;
  __builtin_logbl (0.0);
  return c_2;

}

F foo ()
{
   [local count: 1073741824]:
  __builtin_logbl (0.0);
  return { 0.0, 9.0e+0, 0.0, 0.0 };

}

[Bug middle-end/111754] [14 Regression] ICE: in decompose, at rtl.h:2313 at -O

2023-10-10 Thread prathamesh3492 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754

--- Comment #2 from prathamesh3492 at gcc dot gnu.org ---
Hi,
Sorry for the breakage, will take a look.

Thanks,
Prathamesh

[Bug middle-end/111754] [14 Regression] ICE: in decompose, at rtl.h:2313 at -O

2023-10-10 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-10-10
  Component|rtl-optimization|middle-end
 Ever confirmed|0   |1
   Target Milestone|--- |14.0
   Priority|P3  |P1
   Keywords||needs-bisection

--- Comment #1 from Richard Biener  ---
(gdb) p debug_tree (exp)
 
unit-size 
align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x76d672a0 precision:32
pointer_to_this >
sizes-gimplified V4SF
size 
unit-size 
align:128 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x76e99b28 nunits:4>
constant npatterns:1 nelts-per-pattern:3
elt:0:  
constant 0.0>
elt:1:  
constant 9.0e+0> elt:2:  >

Confirmed.  That's an odd vector mode 'F'!?  And we have three elements?

Happens after inlining bar () into foo ():

F bar (F a, F b)
{
  F c;

   [local count: 1073741824]:
  c_2 = VEC_PERM_EXPR ; 
  __builtin_logbl (0.0);
  return c_2;

F foo ()
{
  F _3;

   [local count: 1073741824]:
  _3 = bar ({ 9.0e+0, 0.0, 0.0, 0.0 }, { 0.0, 0.0, 0.0, 0.0 });

and the constant folding of the VEC_PERM_EXPR causes this.

ICEs way earlier (during inlining) when you do -fdump-ipa-inline