[Bug tree-optimization/94401] [10 Regression] pr92420.c fails on aarch64 since r10-7415

2020-04-02 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94401

Kewen Lin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #7 from Kewen Lin  ---
Should be fixed now.

[Bug tree-optimization/94401] [10 Regression] pr92420.c fails on aarch64 since r10-7415

2020-04-02 Thread cvs-commit at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94401

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Kewen Lin :

https://gcc.gnu.org/g:81ce375d1fdd99f9d93b00f4895eab74c3d8b54a

commit r10-7519-g81ce375d1fdd99f9d93b00f4895eab74c3d8b54a
Author: Kewen Lin 
Date:   Thu Apr 2 08:48:03 2020 -0500

Fix PR94401 by considering reverse overrun

The commit r10-7415 brings scalar type consideration
to eliminate epilogue peeling for gaps, but it exposed
one problem that the current handling doesn't consider
the memory access type VMAT_CONTIGUOUS_REVERSE, for
which the overrun happens on low address side.  This
patch is to make the code take care of it by updating
the offset and construction element order accordingly.

Bootstrapped/regtested on powerpc64le-linux-gnu P8
and aarch64-linux-gnu.

2020-04-02  Kewen Lin  

gcc/ChangeLog

PR tree-optimization/94401
* tree-vect-loop.c (vectorizable_load): Handle VMAT_CONTIGUOUS_REVERSE
access type when loading halves of vector to avoid peeling for gaps.

[Bug tree-optimization/94401] [10 Regression] pr92420.c fails on aarch64 since r10-7415

2020-03-30 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94401

--- Comment #5 from Kewen Lin  ---
Created attachment 48150
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48150&action=edit
untested patch

This can fix the regression failures on aarch64.

[Bug tree-optimization/94401] [10 Regression] pr92420.c fails on aarch64 since r10-7415

2020-03-30 Thread linkw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94401

Kewen Lin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org,
                   |                            |wschmidt at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #4 from Kewen Lin  ---
My commit extends the existing elimination of scalar epilogue peeling for
gaps so that this case can use int for the vector construction. But it
reveals that the existing handling misses the VMAT_CONTIGUOUS_REVERSE case:
it currently assumes the overrun happens on the high address end. That is
true for almost all cases, but in this case the overrun is on the low address
end. So for VMAT_CONTIGUOUS_REVERSE we have to load the high part and put it
in the latter part of the constructed vector.

The IR before and after the commit looks like this:

good:
  vect__9.16_80 = MEM  [(int *)vectp_y.14_78];
  vect__9.17_81 = VEC_PERM_EXPR ;
  vect__9.18_82 = VEC_PERM_EXPR ;

bad:
  _30 = MEM[(int *)vectp_y.12_34];
  _20 = {_30, 0};
  vect__9.14_19 = VIEW_CONVERT_EXPR(_20);
  vect__9.15_61 = VEC_PERM_EXPR ;
  vect__9.16_54 = VEC_PERM_EXPR ;
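
To make the high-end versus low-end overrun concrete, here is a minimal C
sketch. It is only a model under stated assumptions, not the code in
vectorizable_load: it assumes a 4-lane int vector of which only half the
lanes are really needed, and the names LANES, HALF, load_half_forward,
load_half_reverse and reverse_lanes are hypothetical. On this reading, the
"bad" IR above corresponds to applying the forward construction {half, 0}
to a reverse access.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed model: a vector has 4 int lanes, only half of them are needed.  */
enum { LANES = 4, HALF = LANES / 2 };

/* Forward contiguous access: the wanted elements are base[0..HALF-1].
   Reading base[HALF..LANES-1] would overrun on the high address side,
   so those lanes are zero-filled instead: {half, 0}.  */
static void
load_half_forward (const int32_t *base, int32_t out[LANES])
{
  assert (base != NULL && out != NULL);
  memcpy (out, base, HALF * sizeof (int32_t));
  memset (out + HALF, 0, HALF * sizeof (int32_t));
}

/* Reverse contiguous access: base is where a full vector load would
   start, but the wanted elements are base[HALF..LANES-1].  Reading
   base[0..HALF-1] would overrun on the low address side, so the load
   offset moves up by HALF and the construction order flips: {0, half}.  */
static void
load_half_reverse (const int32_t *base, int32_t out[LANES])
{
  assert (base != NULL && out != NULL);
  memset (out, 0, HALF * sizeof (int32_t));
  memcpy (out + HALF, base + HALF, HALF * sizeof (int32_t));
}

/* Models the lane-reversing permute that follows a reverse load.  */
static void
reverse_lanes (int32_t v[LANES])
{
  for (int i = 0; i < LANES / 2; i++)
    {
      int32_t tmp = v[i];
      v[i] = v[LANES - 1 - i];
      v[LANES - 1 - i] = tmp;
    }
}
```

With y = {10, 20, 30, 40}, load_half_reverse followed by reverse_lanes
leaves the wanted elements 40 and 30 in the first two lanes, whereas
load_half_forward would have picked up 10 and 20 from the overrun side.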

[Bug tree-optimization/94401] [10 Regression] pr92420.c fails on aarch64 since r10-7415

2020-03-30 Thread law at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94401

Jeffrey A. Law  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2020-03-30
                 CC|                            |law at redhat dot com
             Status|UNCONFIRMED                 |NEW

--- Comment #3 from Jeffrey A. Law  ---
Confirmed.  My tester tripped over this as well.

No special options are needed at configure time.

[Bug tree-optimization/94401] [10 Regression] pr92420.c fails on aarch64 since r10-7415

2020-03-30 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94401

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |aarch64
            Version|unknown                     |10.0
   Target Milestone|---                         |10.0
           Keywords|                            |wrong-code
           Priority|P3                          |P1
            Summary|pr92420.c fails on aarch64  |[10 Regression] pr92420.c
                   |since r10-7415              |fails on aarch64 since
                   |                            |r10-7415