[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2017-01-13 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848
Bug 77848 depends on bug 78411, which changed state.

Bug 78411 Summary: [7 Regression] FAIL: gcc.target/i386/pr45685.c scan-assembler-times cmov 6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78411

            What|Removed     |Added
--------------------------------------
          Status|NEW         |RESOLVED
      Resolution|---         |FIXED

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-17 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #27 from Bill Schmidt  ---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78396 is open to track that
failure.

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-17 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

Bill Schmidt  changed:

            What|Removed     |Added
--------------------------------------
          Status|NEW         |RESOLVED
      Resolution|---         |FIXED

--- Comment #26 from Bill Schmidt  ---
Fixed.  I will open a bug for the bb-slp-cond-1.c regression.

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-17 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #25 from Bill Schmidt  ---
Author: wschmidt
Date: Thu Nov 17 14:22:17 2016
New Revision: 242550

URL: https://gcc.gnu.org/viewcvs?rev=242550&root=gcc&view=rev
Log:
[gcc]

2016-11-17  Bill Schmidt  
Richard Biener  

PR tree-optimization/77848
* tree-if-conv.c (tree_if_conversion): Always version loops unless
the user specified -ftree-loop-if-convert.

[gcc/testsuite]

2016-11-17  Bill Schmidt  
Richard Biener  

PR tree-optimization/77848
* gfortran.dg/vect/pr77848.f: New test.


Added:
trunk/gcc/testsuite/gfortran.dg/vect/pr77848.f
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-if-conv.c

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-16 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #24 from Bill Schmidt  ---
The above commit doesn't yet solve the problem, but enables more outer-loop
vectorization in preparation for the fix.

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-16 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #23 from Bill Schmidt  ---
Author: wschmidt
Date: Wed Nov 16 22:17:10 2016
New Revision: 242520

URL: https://gcc.gnu.org/viewcvs?rev=242520&root=gcc&view=rev
Log:
2016-11-16  Bill Schmidt  

PR tree-optimization/77848
* tree-if-conv.c (version_loop_for_if_conversion): When versioning
an outer loop, only save basic block aux information for the inner
loop.
(versionable_outer_loop_p): New function.
(tree_if_conversion): Version the outer loop instead of the inner
one if the pattern will be recognized for outer-loop
vectorization.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-if-conv.c

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-15 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #22 from Bill Schmidt  ---
Proposed patch: https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01541.html

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-15 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #21 from Bill Schmidt  ---
Great, thanks.  Just realized I need to add a test case yet -- should have this
on the list later today.

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-15 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #20 from rguenther at suse dot de  ---
On Tue, 15 Nov 2016, wschmidt at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848
> 
> --- Comment #19 from Bill Schmidt  ---
> I have a patch that solves this problem by always versioning loops when
> vectorization is enabled, and also sets up if-conversion for outer loops so
> that outer-loop vectorization can succeed as before.  Surprisingly, this didn't
> require any changes to the vectorization code to pass the test suite, and I
> verified a couple of examples to see that the expected vectorization was
> occurring.

Heh, that's surprising but all the better!

> We still have these regressions:
> 
> > FAIL: gcc.dg/vect/bb-slp-cond-1.c -flto -ffat-lto-objects  scan-tree-dump-times slp1 "basic block vectorized" 1
> > FAIL: gcc.dg/vect/bb-slp-cond-1.c scan-tree-dump-times slp1 "basic block vectorized" 1
> 
> Richard, what are your current thoughts on this re: comment #6?

I think we should go this route despite this particular regression.  The
loop should be vectorizable in loop vectorization and I promise
to look at the regression.  Note that in theory we can also dispatch
to BB vectorization from loop vectorization when that failed to catch
the vectorization of if-converted code.

> I think I should probably submit the current patch for review despite the
> regression, while we talk about the effect on SLP vectorization.  I'll do this
> tomorrow unless I hear otherwise.

Yes, I think that's a good idea.

Richard.

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-14 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #19 from Bill Schmidt  ---
I have a patch that solves this problem by always versioning loops when
vectorization is enabled, and also sets up if-conversion for outer loops so
that outer-loop vectorization can succeed as before.  Surprisingly, this didn't
require any changes to the vectorization code to pass the test suite, and I
verified a couple of examples to see that the expected vectorization was
occurring.

We still have these regressions:

> FAIL: gcc.dg/vect/bb-slp-cond-1.c -flto -ffat-lto-objects  scan-tree-dump-times slp1 "basic block vectorized" 1
> FAIL: gcc.dg/vect/bb-slp-cond-1.c scan-tree-dump-times slp1 "basic block vectorized" 1

Richard, what are your current thoughts on this re: comment #6?

I think I should probably submit the current patch for review despite the
regression, while we talk about the effect on SLP vectorization.  I'll do this
tomorrow unless I hear otherwise.

Thanks for your help on the approach!

Bill

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-07 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #18 from Bill Schmidt  ---
Oh, I see.  Makes sense.  I'll look into it soonish after handling a
high-priority interrupt that came in over the weekend...

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-07 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #17 from Richard Biener  ---
Nono, you misunderstood -- I meant for the if-converter to produce

 if (LOOP_VECTORIZED)
   {
     for (;;) // outer loop
       for (;;)
         // if-converted inner loop
   }
 else
   {
     for (;;) // outer loop
       for (;;) // inner loop
   }

if (and only if) the outer loop CFG shape matches what the vectorizer will
later eventually handle.

You still need adjustments to the vectorizer but only in its handling of
eliminating the LOOP_VECTORIZED conditional.
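
To make the shape above concrete, here is a small hand-written C analogue. It is only a sketch under assumptions: `threshold_rows` and its `loop_vectorized` parameter are hypothetical names, and the real LOOP_VECTORIZED guard is an internal call that the compiler folds away at compile time rather than a runtime flag. Both arms must compute the same result; only the form of the inner loop body differs.

```c
#include <stddef.h>

/* Hypothetical illustration of versioning the OUTER loop.
   `loop_vectorized` stands in for the compiler-internal
   LOOP_VECTORIZED guard, which is folded at compile time.  */
void threshold_rows(int *m, size_t rows, size_t cols, int limit,
                    int loop_vectorized)
{
  if (loop_vectorized)
    {
      for (size_t i = 0; i < rows; i++)      /* outer loop */
        for (size_t j = 0; j < cols; j++)    /* if-converted inner loop */
          {
            int v = m[i * cols + j];
            m[i * cols + j] = v > limit ? limit : v;
          }
    }
  else
    {
      for (size_t i = 0; i < rows; i++)      /* outer loop */
        for (size_t j = 0; j < cols; j++)    /* original inner loop */
          if (m[i * cols + j] > limit)
            m[i * cols + j] = limit;
    }
}
```

Because the guard sits outside the whole nest, each arm still contains exactly one inner loop, which is the shape outer-loop vectorization expects.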

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-06 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #16 from Bill Schmidt  ---
Created attachment 39975
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39975&action=edit
WIP patch for outer-loop vectorization

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-06 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #15 from Bill Schmidt  ---
(In reply to rguent...@suse.de from comment #14)
> On November 5, 2016 4:31:54 PM GMT+01:00, "wschmidt at gcc dot gnu.org"
>  wrote:
> 
> >Notable degradations:
> >  403.gcc: -1.8%
> >
> >Other results in noise (+/- 1.0%)
> >
> >Thus, not too bad for a simple patch, though we don't like seeing
> >403.gcc
> >degrade.
> 
> Analyzing would be useful, I suspect missed RTL if-conversion opportunities.

Agreed; however, I may not be able to get to that this week, unfortunately.

> 
> >I've made some progress on the outer loop vectorization issue, but
> >haven't
> >completely solved it yet.  I'll try to have something to look at there
> >in the
> >next couple of days.
> 
> I'd try detecting the outer loop vect CFG shape the vectorizer handles in
> if-conversion and then simply version the outer loop...  Of course needs
> handling of the VECTORIZED conditional on an outer loop in the vectorizer.

Right, that's the direction I've been exploring.  Unfortunately the loop
analysis seems pretty dependent on the shape, and things get ugly inside
vect_mark_stmts_to_be_vectorized.  Eventually I run into the following that
defeats outer loop vectorization:

/home/wschmidt/gcc/gcc-mainline-test2/gcc/testsuite/gcc.dg/vect/vect-cond-1.c:20:3: note: def_stmt: curr_a_20 = PHI 
/home/wschmidt/gcc/gcc-mainline-test2/gcc/testsuite/gcc.dg/vect/vect-cond-1.c:20:3: note: type of def: unknown
/home/wschmidt/gcc/gcc-mainline-test2/gcc/testsuite/gcc.dg/vect/vect-cond-1.c:20:3: note: Unsupported pattern.
/home/wschmidt/gcc/gcc-mainline-test2/gcc/testsuite/gcc.dg/vect/vect-cond-1.c:20:3: note: not vectorized: unsupported use in stmt.
/home/wschmidt/gcc/gcc-mainline-test2/gcc/testsuite/gcc.dg/vect/vect-cond-1.c:20:3: note: unexpected pattern.

Here the PHI in question is in the loop header for the inner unoptimized loop,
which is unexpected by the analysis and becomes vect_unknown_def_type.  Thus we
hit the "Unsupported pattern" in vect_is_simple_use and we're done.

I tried an egregious hack to just ignore such PHIs in nested loops to see what
would happen.  We end up blowing up deep in the analyze_overlapping_iterations
code during data-ref analysis.

I'm somewhat at a dead end here, so I'll attach the patch thus far and call it
a night.  I'd appreciate any thoughts on any better direction.  The hunk in
vect_is_simple_use is the egregious hack; it needs to be removed before the
test case will compile.

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-06 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #14 from rguenther at suse dot de  ---
On November 5, 2016 4:31:54 PM GMT+01:00, "wschmidt at gcc dot gnu.org"
 wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848
>
>--- Comment #13 from Bill Schmidt  ---
>SPEC cpu2006 on powerpc64le-unknown-linux-gnu with the simple patch:
>
>Geomean, integer:  +0.2%
>Geomean, float:+0.5%
>Geomean, overall:  +0.4%
>
>Notable improvements:
>  454.calculix:+3.7%
>  453.povray:  +3.4%
>  458.sjeng:   +1.4%
>  429.mcf: +1.2%
>  445.gobmk:   +1.2%
>  471.omnetpp: +1.2%
>
>Notable degradations:
>  403.gcc: -1.8%
>
>Other results in noise (+/- 1.0%)
>
>Thus, not too bad for a simple patch, though we don't like seeing
>403.gcc
>degrade.

Analyzing would be useful, I suspect missed RTL if-conversion opportunities.

>I've made some progress on the outer loop vectorization issue, but
>haven't
>completely solved it yet.  I'll try to have something to look at there
>in the
>next couple of days.

I'd try detecting the outer loop vect CFG shape the vectorizer handles in
if-conversion and then simply version the outer loop...  Of course needs
handling of the VECTORIZED conditional on an outer loop in the vectorizer.

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-05 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #13 from Bill Schmidt  ---
SPEC cpu2006 on powerpc64le-unknown-linux-gnu with the simple patch:

Geomean, integer:  +0.2%
Geomean, float:+0.5%
Geomean, overall:  +0.4%

Notable improvements:
  454.calculix:+3.7%
  453.povray:  +3.4%
  458.sjeng:   +1.4%
  429.mcf: +1.2%
  445.gobmk:   +1.2%
  471.omnetpp: +1.2%

Notable degradations:
  403.gcc: -1.8%

Other results in noise (+/- 1.0%)

Thus, not too bad for a simple patch, though we don't like seeing 403.gcc
degrade.

I've made some progress on the outer loop vectorization issue, but haven't
completely solved it yet.  I'll try to have something to look at there in the
next couple of days.

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-04 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #12 from Bill Schmidt  ---
So I'll now test

Index: gcc/tree-if-conv.c
===
--- gcc/tree-if-conv.c  (revision 241802)
+++ gcc/tree-if-conv.c  (working copy)
@@ -2767,7 +2767,9 @@ tree_if_conversion (struct loop *loop)
  || loop->dont_vectorize))
 goto cleanup;

-  if ((any_pred_load_store || any_complicated_phi)
+  /* FIXME: When SLP vectorization can handle if-conversion on its own,
+ predicate all of if-conversion on flag_tree_loop_vectorize.  */
+  if ((any_pred_load_store || any_complicated_phi || flag_tree_loop_vectorize)
   && !version_loop_for_if_conversion (loop))
 goto cleanup;
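
The guards tried so far differ only in when version_loop_for_if_conversion gets called. As a rough sketch, the three conditions can be written as stand-alone predicates. The helper names below are hypothetical; the real code tests these values inline in tree_if_conversion.

```c
#include <stdbool.h>

/* Hypothetical helpers: each returns true when the corresponding
   form of the guard would attempt to version the loop.  */

/* Guard before any patch.  */
bool versions_before(bool any_pred_load_store, bool any_complicated_phi)
{
  return any_pred_load_store || any_complicated_phi;
}

/* Guard from the patch in comment #8.  */
bool versions_comment8(bool flag_tree_loop_vectorize)
{
  return flag_tree_loop_vectorize;
}

/* Guard from the patch in comment #12.  */
bool versions_comment12(bool any_pred_load_store, bool any_complicated_phi,
                        bool flag_tree_loop_vectorize)
{
  return any_pred_load_store || any_complicated_phi
         || flag_tree_loop_vectorize;
}
```

The case behind the question in comment #10 is any_pred_load_store or any_complicated_phi being true while flag_tree_loop_vectorize is false: the comment #8 guard would skip versioning there, while the comment #12 guard keeps the old behavior.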

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-04 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #11 from Bill Schmidt  ---
(In reply to rguent...@suse.de from comment #10)
> On Fri, 4 Nov 2016, wschmidt at gcc dot gnu.org wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848
> > 
> > --- Comment #8 from Bill Schmidt  ---
> > FYI, the patch I am testing is:
> > 
> > Index: gcc/tree-if-conv.c
> > ===
> > --- gcc/tree-if-conv.c  (revision 241802)
> > +++ gcc/tree-if-conv.c  (working copy)
> > @@ -2767,8 +2767,7 @@ tree_if_conversion (struct loop *loop)
> >   || loop->dont_vectorize))
> >  goto cleanup;
> > 
> > -  if ((any_pred_load_store || any_complicated_phi)
> > -  && !version_loop_for_if_conversion (loop))
> > +  if (flag_tree_loop_vectorize && !version_loop_for_if_conversion (loop))
> >  goto cleanup;
> 
> can any_pred_load_store or any_complicated_phi never be true without
> flag_tree_loop_vectorize?

It's quite possible, I'm not sure.  I was trying to remove all preconditions
for doing if-conversion, but without the test for flag_tree_loop_vectorize I
ran into failures in the test suite for -O2 -ftree-if-conversion or whatever it
is.

> 
> Btw, I think we should simply guard if-conversion with
> flag_tree_loop_vectorize... (given it has no cost model)

Seems that can work if we can make SLP vectorization handle the
PHI-convertibles on its own without if-conversion, as you suggested, but until
then it will lose some SLP opportunities.

By the way, outer loop vectorization is failing because of

  if ((loop->inner)->inner || (loop->inner)->next)
    {
      if (dump_enabled_p ())
        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                         "not vectorized: multiple nested loops.\n");
      return false;
    }

The versioned inner loop has two loops, so we don't even consider it further.
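
A toy model may help show why the check fires. This is a sketch, not GCC code: `struct toy_loop` is a hypothetical stand-in that keeps only the two links the quoted condition inspects, and the NULL guard on `inner` is an addition for safety in the toy.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical loop-tree node with only the links the check uses.  */
struct toy_loop
{
  struct toy_loop *inner;  /* first child loop */
  struct toy_loop *next;   /* next sibling at the same depth */
};

/* Mirrors the quoted condition: the outer loop is rejected if its
   inner loop has a child of its own or has a sibling.  */
bool has_multiple_nested_loops(const struct toy_loop *loop)
{
  return loop->inner != NULL
         && (loop->inner->inner != NULL || loop->inner->next != NULL);
}
```

Versioning only the inner loop leaves the outer loop with two children (the if-converted copy and the original), so one of them shows up as a sibling through `next` and the outer loop is rejected.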

Bill

> 
> Richard.
> 
> >/* Now all statements are if-convertible.  Combine all the basic
> > 
> >

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-04 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #10 from rguenther at suse dot de  ---
On Fri, 4 Nov 2016, wschmidt at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848
> 
> --- Comment #8 from Bill Schmidt  ---
> FYI, the patch I am testing is:
> 
> Index: gcc/tree-if-conv.c
> ===
> --- gcc/tree-if-conv.c  (revision 241802)
> +++ gcc/tree-if-conv.c  (working copy)
> @@ -2767,8 +2767,7 @@ tree_if_conversion (struct loop *loop)
>   || loop->dont_vectorize))
>  goto cleanup;
> 
> -  if ((any_pred_load_store || any_complicated_phi)
> -  && !version_loop_for_if_conversion (loop))
> +  if (flag_tree_loop_vectorize && !version_loop_for_if_conversion (loop))
>  goto cleanup;

can any_pred_load_store or any_complicated_phi never be true without
flag_tree_loop_vectorize?

Btw, I think we should simply guard if-conversion with
flag_tree_loop_vectorize... (given it has no cost model)

Richard.

>/* Now all statements are if-convertible.  Combine all the basic
> 
>

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-04 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #9 from rguenther at suse dot de  ---
On Fri, 4 Nov 2016, wschmidt at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848
> 
> --- Comment #7 from Bill Schmidt  ---
> OK, I will try to get some machine time to do performance testing of the
> existing patch as soon as possible.
> 
> Here is the list of failures:
> 
> > FAIL: gcc.dg/vect/bb-slp-cond-1.c -flto -ffat-lto-objects  scan-tree-dump-times slp1 "basic block vectorized" 1
> > FAIL: gcc.dg/vect/bb-slp-cond-1.c scan-tree-dump-times slp1 "basic block vectorized" 1
> 66a69,76
> > FAIL: gcc.dg/vect/vect-cond-1.c -flto -ffat-lto-objects  scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> > FAIL: gcc.dg/vect/vect-cond-1.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> > FAIL: gcc.dg/vect/vect-cond-3.c -flto -ffat-lto-objects  scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> > FAIL: gcc.dg/vect/vect-cond-3.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> > FAIL: gcc.dg/vect/vect-cond-4.c -flto -ffat-lto-objects  scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> > FAIL: gcc.dg/vect/vect-cond-4.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> > FAIL: gcc.dg/vect/vect-cond-6.c -flto -ffat-lto-objects  scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> > FAIL: gcc.dg/vect/vect-cond-6.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1

Ok, so that's outer loop vect not handling the if (LOOP_VECTORIZED ()) 
case -- if-conversion only handles innermost loops but the vectorizer
handles a single outer loop.  This means the if (LOOP_VECTORIZED ())
would need to be put on the outer loop or outer loop vectorization
would need to handle it in some way.  I think we already disable
outer loop vectorization when sth forces LOOP_VECTORIZED () for
if-conversion.  I think "ignoring" LOOP_VECTORIZED and its associated
CFG in outer loop vectorization and folding it away before transform
might work...  (we also fail to if-convert the "outer" loop btw).
See vect_analyze_loop_form_1 for what loop form we expect for outer
loop vectorization.

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-04 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #8 from Bill Schmidt  ---
FYI, the patch I am testing is:

Index: gcc/tree-if-conv.c
===
--- gcc/tree-if-conv.c  (revision 241802)
+++ gcc/tree-if-conv.c  (working copy)
@@ -2767,8 +2767,7 @@ tree_if_conversion (struct loop *loop)
  || loop->dont_vectorize))
 goto cleanup;

-  if ((any_pred_load_store || any_complicated_phi)
-  && !version_loop_for_if_conversion (loop))
+  if (flag_tree_loop_vectorize && !version_loop_for_if_conversion (loop))
 goto cleanup;

   /* Now all statements are if-convertible.  Combine all the basic

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-04 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #7 from Bill Schmidt  ---
OK, I will try to get some machine time to do performance testing of the
existing patch as soon as possible.

Here is the list of failures:

> FAIL: gcc.dg/vect/bb-slp-cond-1.c -flto -ffat-lto-objects  scan-tree-dump-times slp1 "basic block vectorized" 1
> FAIL: gcc.dg/vect/bb-slp-cond-1.c scan-tree-dump-times slp1 "basic block vectorized" 1
66a69,76
> FAIL: gcc.dg/vect/vect-cond-1.c -flto -ffat-lto-objects  scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> FAIL: gcc.dg/vect/vect-cond-1.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> FAIL: gcc.dg/vect/vect-cond-3.c -flto -ffat-lto-objects  scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> FAIL: gcc.dg/vect/vect-cond-3.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> FAIL: gcc.dg/vect/vect-cond-4.c -flto -ffat-lto-objects  scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> FAIL: gcc.dg/vect/vect-cond-4.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> FAIL: gcc.dg/vect/vect-cond-6.c -flto -ffat-lto-objects  scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> FAIL: gcc.dg/vect/vect-cond-6.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-04 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #6 from Richard Biener  ---
Note that bb-slp-cond-1.c is a particularly bad example as it shows a defect in
the loop vectorizing data dependence analysis.

But yes, BB vectorization also benefits from if-conversion (but not only in
loops) though technically it should "simply" be re-written to support
multiple BBs and vectorizing PHIs (this shouldn't be too difficult).

Delaying the folding of LOOP_VECTORIZED is not going to work very well
I think and it will at least disturb complete unrolling.  After all you
don't know whether the complete BB will be vectorized.

Can you please show the complete list of FAILs you get?

I'd say we should go forward with the original idea and do some real-world
benchmarking (SPEC?) to look for unwanted fallout.

Then for cases like bb-slp-cond-1.c the fix is to fix loop vectorization
and for others it might be trying to teach BB vectorization to handle
control-flow (I can give this a shot though it's now somewhat late in
stage1...)

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-03 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #5 from Bill Schmidt  ---
I suppose that's not sensible as stated, as the SLP vectorizer doesn't really
think in terms of loops.

But this is an existing problem independent of whether we force loop-versioning
on in all cases.  Right now, a versioned loop that is not loop vectorized will
not allow SLP vectorization on the if-converted version.  We just don't happen
to have test coverage that notices that.

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-03 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #4 from Bill Schmidt  ---
Ah, never mind, I see what's happening.

The order of events is
 * if-conversion
 * loop vectorization
 * DCE
 * cunroll
 * slp vectorization

If we force versioning on with if-conversion, then the loop vectorizer sees
that it can't do anything with the loop, so it folds away the if-converted
version.  Thus SLP never gets a chance to look at the if-converted loop.  So it
seems that always forcing versioning on has an unintended consequence of
missing SLP opportunities.

Should the folding of LOOP_VECTORIZED (1, 2) be deferred to the SLP vectorizer?
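
The reasoning above can be condensed into a tiny truth-table model. This is a toy sketch, not GCC code; the function name and its two parameters are hypothetical, and it only encodes the pass ordering as described in this comment.

```c
#include <stdbool.h>

/* Toy model of the pass ordering described above; not GCC code.
   Returns true if the if-converted body is still present when the
   SLP vectorizer runs.  */
bool if_converted_body_reaches_slp(bool versioned, bool loop_vectorized)
{
  /* Without versioning the loop is if-converted in place, so the
     straight-line body is all that later passes ever see.  */
  if (!versioned)
    return true;

  /* With versioning, the loop vectorizer folds the guard.  If it
     vectorized the loop, the if-converted copy is kept; otherwise
     the guard folds to false and only the original copy remains.  */
  return loop_vectorized;
}
```

The problematic case is `versioned` true with `loop_vectorized` false: SLP is left with only the branchy original.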

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-11-03 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #3 from Bill Schmidt  ---
Just got back to looking at this.  I've implemented this suggestion and it
seems to work well for the most part; it solves the poor code generation we
were seeing on this test case, and most of the test suite passes.  I am seeing
a couple of failures, though.

The one I'm investigating now is gcc.dg/vect/bb-slp-cond-1.c.  When not
versioning the loop, SLP vectorization succeeds.  When versioning the loop, an
identical loop under the vectorization check is not vectorized for SLP.

Looking at the vectorization detail dump, it appears that the blocks in the
if-converted loop are not visited when the loop is versioned.  Only the blocks
in the original fallback loop are visited, which of course cannot be vectorized
since only if-conversion allows vectorization to take place.

I'll dig into it further, but I'm wondering if anyone has seen this sort of
behavior before, and can give me an idea what to look for.  I wonder if
versioning can leave the loop with a stale DFS numbering or something similar.

Any help appreciated!

Bill

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-10-05 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

--- Comment #2 from Bill Schmidt  ---
Thanks, Richard!  I appreciate the analysis, as I wasn't really sure what the
proper fix should be here.

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-10-05 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

Richard Biener  changed:

            What|Removed     |Added
------------------------------------------
          Status|UNCONFIRMED |NEW
Last reconfirmed|            |2016-10-05
Target Milestone|7.0         |---
  Ever confirmed|0           |1

--- Comment #1 from Richard Biener  ---
if-conversion was added purely as vectorization enabler and has no costing
model when applied to loops that do not end up being vectorized.

if-conversion has code to emit the if-converted body in a loop copy that is
thrown away when not vectorized -- I've argued this should be the only and
the default behavior which would also solve this PR.

I'd happily approve a patch implementing it ;)  Look for
version_loop_for_if_conversion () and always apply it.
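
For readers unfamiliar with the transformation under discussion, here is a hand-written C analogue of what if-conversion does to a loop body. It is only a sketch: the function names `clamp_branchy` and `clamp_if_converted` are made up for illustration, and this is not output from GCC.

```c
#include <stddef.h>

/* Original form: the store is guarded by a branch.  */
void clamp_branchy(int *a, size_t n, int limit)
{
  for (size_t i = 0; i < n; i++)
    if (a[i] > limit)
      a[i] = limit;
}

/* If-converted form: the branch becomes a select, leaving a
   straight-line loop body with an unconditional store.  */
void clamp_if_converted(int *a, size_t n, int limit)
{
  for (size_t i = 0; i < n; i++)
    {
      int v = a[i];
      a[i] = v > limit ? limit : v;
    }
}
```

The second form has no control flow in the body, which is what the vectorizer wants. If the loop is not vectorized afterwards, though, the select and the store are executed unconditionally on every iteration, which is the cost that keeping the if-converted body only in a throwaway loop copy is meant to avoid.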

[Bug tree-optimization/77848] Gimple if-conversion results in redundant comparisons

2016-10-04 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848

Bill Schmidt  changed:

            What|Removed     |Added
------------------------------------------------------
              CC|            |dje at gcc dot gnu.org,
                |            |segher at gcc dot gnu.org
Target Milestone|---         |7.0