[Bug tree-optimization/89653] Missing vectorization of loop containing std::min/std::max and temporary

2022-11-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89653

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |10.0

2022-11-01 Thread moritz.kreutzer at siemens dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89653

--- Comment #11 from Moritz Kreutzer  ---
I am currently out of the office, with limited to no email access. I will be
returning on November 28. For urgent questions regarding ARM64 support please
contact Julian Hornich, for GPGPU-related issues please contact Michael Kuron,
and for compiler- and build-related issues please contact Tom James. For
anything else (which is urgent), please reach out to Joel Daniels.


Thanks,
Moritz


2019-05-02 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89653

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
      Known to work|                            |10.0
         Resolution|---                         |FIXED

--- Comment #10 from Richard Biener  ---
Fixed on trunk.

2019-05-02 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89653

--- Comment #9 from Richard Biener  ---
Author: rguenth
Date: Thu May  2 14:08:08 2019
New Revision: 270800

URL: https://gcc.gnu.org/viewcvs?rev=270800&root=gcc&view=rev
Log:
2019-05-02  Richard Biener  

	PR tree-optimization/89653
	* tree-ssa-loop.c (pass_data_tree_loop_init): Execute
	update-address-taken before the pass.
	* passes.def (pass_tree_loop_init): Put comment before it.

	* g++.dg/vect/pr89653.cc: New testcase.

Added:
trunk/gcc/testsuite/g++.dg/vect/pr89653.cc
Modified:
trunk/gcc/ChangeLog
trunk/gcc/passes.def
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-loop.c

2019-03-25 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89653

--- Comment #8 from Richard Biener  ---
(In reply to Moritz Kreutzer from comment #7)
> Thanks for taking this up Richard! I just want to check back: Do you need
> any assistance with testing or more information from my side?

Not at this point - this is an enhancement queued for next stage1 and GCC 10
only, so it has to wait at this moment.

2019-03-25 Thread moritz.kreutzer at siemens dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89653

--- Comment #7 from Moritz Kreutzer  ---
Thanks for taking this up Richard! I just want to check back: Do you need any
assistance with testing or more information from my side?

2019-03-11 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89653

--- Comment #6 from Richard Biener  ---
Created attachment 45934
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45934&action=edit
patch I am testing

And this is the one I am testing: it executes update-address-taken from
loop_init (thus only one extra time).

2019-03-11 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89653

--- Comment #5 from Richard Biener  ---
Created attachment 45932
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45932&action=edit
untested DCE patch

Patch cleaning up after PRE via TODO_update_address_taken from (each) DCE.

2019-03-11 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89653

--- Comment #4 from Richard Biener  ---
Created attachment 45931
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45931&action=edit
untested phiprop patch

Patch making phiprop hoist the load through the PHI.

2019-03-11 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89653

--- Comment #3 from Richard Biener  ---
Ah, OK.  So our special pass to deal with std::min taking arguments by
reference is "confused" enough by

  _4 = *_3;  // vec[i]
  D.17247 = _5;
  if (_4 > _5)
    goto <bb 5>; [34.00%]
  else
    goto <bb 6>; [66.00%]

  <bb 5> [local count: 324914276]:

  <bb 6> [local count: 955630225]:
  # _17 = PHI <_3(4), &D.17247(5)>
  _6 = *_17;

The pass tries to replace the dereference in BB6 but fails on the 4->6
edge, where it thinks placing *_3 on it isn't profitable.
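
To make the two IL shapes concrete, here is a minimal C++ sketch. It is an
illustration only, not GCC code: the function names and the `scaled`
parameter are hypothetical, and the comments map each statement to the SSA
names in the dump above. The first function selects an address and loads
after the merge, which is roughly what the pass sees; the second merges
values instead, so the temporary no longer has to live in memory.

```cpp
#include <cassert>

// Hypothetical model of the problematic shape: a PHI of addresses
// followed by a single load after the merge point.
double min_via_pointer_phi(const double *elem, double scaled)
{
    double tmp = scaled;        // D.17247 = _5; (address-taken temporary)
    const double *p = elem;     // PHI argument on the 4->6 edge: _3
    if (*elem > scaled)         // if (_4 > _5)
        p = &tmp;               // PHI argument on the 5->6 edge: &D.17247
    return *p;                  // _6 = *_17;
}

// Hypothetical model of the desired shape: the loads happen before the
// merge and the PHI combines plain values.
double min_via_value_phi(const double *elem, double scaled)
{
    const double loaded = *elem;    // _4 = *_3; (already available)
    double result = loaded;
    if (loaded > scaled)
        result = scaled;
    return result;
}

// Small self-check: both shapes compute the same minimum.
void check_same_result()
{
    const double samples[] = {4.0, -2.0, 0.5};
    for (double s : samples)
        assert(min_via_pointer_phi(&s, 1.0) == min_via_value_phi(&s, 1.0));
}
```

Both functions return the same value for the same inputs; the difference is
only whether a stack slot must exist for `tmp`.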

Only the if-conversion phase in the vectorizer will then recognize the
MIN_EXPR operation but it is too late to get rid of the "memory" temporary
involved here.

Note that PRE performs what the special phiprop could have done, but it
leaves the dead memory operation around, which the DCE that's there cannot
remove because D.17094 still appears to have its address taken.

  _5 = reciptmp_8 * _4;
  D.17094 = _5;
  if (_4 > _5)
    goto ; [34.00%]
  else
    goto ; [66.00%]

   [local count: 630715944]:

   [local count: 955630224]:
  # prephitmp_29 = PHI <_5(5), _4(6)>
  *_3 = prephitmp_29;
  D.17094 ={v} {CLOBBER};

There are plenty of DCE passes between this and vectorization, but no DSE
which would fix it up and also no update-address-taken.
Doing this after PRE is too early (the PHI with the address-taking is
still in the IL); doing it after each DCE might be a (somewhat costly) option.

I also have a patch to get the above caught in phiprop.
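
For reference, a loop of roughly the following shape produces the pattern
discussed in this comment. This is a sketch under assumptions: only the
signature void loop1(double*, double, int) and the names vec and end are
visible in the dumps in this thread. The parameter name `divisor`, the
division itself (inferred from reciptmp), and the by-value variant are
hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Assumed shape of the reported loop. std::min takes both arguments by
// const reference, so the quotient is materialized in a temporary whose
// address is taken.
void loop1(double *vec, double divisor, int end)
{
    for (int i = 0; i < end; i++)
        vec[i] = std::min(vec[i], vec[i] / divisor);
}

// Hypothetical by-value helper with the same comparison as std::min
// (returns b only when b < a).
inline double min_by_value(double a, double b)
{
    return b < a ? b : a;
}

// Hypothetical variant of the loop that never binds a reference to the
// quotient at the source level.
void loop1_by_value(double *vec, double divisor, int end)
{
    for (int i = 0; i < end; i++) {
        const double scaled = vec[i] / divisor;
        vec[i] = min_by_value(vec[i], scaled);
    }
}

// Small self-check: both loops leave the same data behind.
void check_loops_agree()
{
    std::vector<double> a = {4.0, -2.0, 0.5};
    std::vector<double> b = a;
    loop1(a.data(), 2.0, static_cast<int>(a.size()));
    loop1_by_value(b.data(), 2.0, static_cast<int>(b.size()));
    assert(a == b);
}
```

The sketch makes no claim about which form a given GCC version vectorizes.
One way to check is to compile with -fopt-info-vec and compare the reports;
the exact options used by the reporter are not shown in this thread.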

2019-03-11 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89653

--- Comment #2 from Richard Biener  ---
Oh, no.  It is present even in .original and we somehow fail to elide it (maybe
because of the clobber?!)

;; Function void loop1(double*, double, int) (null)
;; enabled by -tree-original


{
  {
int i = 0;

<>;
while (1)
  {
if (i >= end) goto ;
< ((const double &) vec + (sizetype) ((long unsigned
int) i * 8), (const double &) _EXPR )) >;
<>;
  }
:;
  }
}

2019-03-11 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89653

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2019-03-11
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener  ---
The issue is D.17094, apparently introduced by the recip pass.

   [local count: 955630224]:
  # i_20 = PHI 
  _1 = (long unsigned int) i_20;
  _2 = _1 * 8;
  _3 = vec_11(D) + _2;
  _4 = *_3;
  _5 = reciptmp_8 * _4;
  D.17094 = _5;
  prephitmp_29 = MIN_EXPR <_4, _5>;
  *_3 = prephitmp_29;
  D.17094 ={v} {CLOBBER};
  i_16 = i_20 + 1;
  if (end_10(D) <= i_16)