[Bug tree-optimization/78604] [7 regression] test case gcc.target/powerpc/p8vector-vectorize-1.c fails starting with r242750

2017-01-23 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78604

--- Comment #7 from amker at gcc dot gnu.org ---
(In reply to Michael Meissner from comment #6)
> Unless -ffast-math or -fno-honor-nans is used, you cannot invert < to >=,
> because you will get a different result if either operand is a NaN. 
> However, the basic code for vector compares hasn't changed much since the
> early power7 days.

I don't know the details on Power.  Although this is a vectorized instruction,
its inner type is integer.  That's why "<" is converted to ">=" in tree-ifcvt
just before the vectorizer.  Is the transformation safe in this case?  Thanks.
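
To make the integer versus floating-point distinction concrete, here is a minimal C sketch. The helper names are hypothetical and only model the select at source level; they are not taken from tree-ifcvt. It assumes IEEE NaN semantics, so it should be built without -ffast-math.

```c
#include <math.h>
#include <stdbool.h>
#include <stdint.h>

/* Original form of the select: a < b ? x : y. */
static int select_lt_int(int64_t a, int64_t b, int x, int y)
{
    return a < b ? x : y;
}

/* Inverted comparison with swapped arms: a >= b ? y : x. */
static int select_ge_int(int64_t a, int64_t b, int x, int y)
{
    return a >= b ? y : x;
}

/* The same two forms with a floating-point inner type. */
static int select_lt_fp(double a, double b, int x, int y)
{
    return a < b ? x : y;
}

static int select_ge_fp(double a, double b, int x, int y)
{
    return a >= b ? y : x;
}

/* Returns true if both integer forms agree for every pair in [-n, n]. */
static bool int_forms_agree(int64_t n)
{
    for (int64_t a = -n; a <= n; a++)
        for (int64_t b = -n; b <= n; b++)
            if (select_lt_int(a, b, 1, 2) != select_ge_int(a, b, 1, 2))
                return false;
    return true;
}
```

With integer operands exactly one of "a < b" and "a >= b" holds, so the two forms always pick the same arm.  With a NaN operand both comparisons are false, so select_lt_fp (NAN, 1.0, 1, 2) picks y while select_ge_fp (NAN, 1.0, 1, 2) picks x.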

2017-01-20 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78604

--- Comment #6 from Michael Meissner  ---
Unless -ffast-math or -fno-honor-nans is used, you cannot invert < to >=,
because you will get a different result if either operand is a NaN.  However,
the basic code for vector compares hasn't changed much since the early power7
days.

2017-01-13 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78604

Bill Schmidt  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |meissner at gcc dot gnu.org

--- Comment #5 from Bill Schmidt  ---
Mike, can you respond to c#4?

2017-01-13 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78604

--- Comment #4 from amker at gcc dot gnu.org ---
(In reply to amker from comment #3)
> For function sign_lt and uns_lt, the change causes worse code generation
> unfortunately.  Take uns_lt as example, the difference in optimized dump is

> So powerpc has to compute ">=" with "> || =="?  Apart from this, the change
> itself only introduces additional canonicalization during tree-ifcvt,
> transforming "<" to ">=".  If the former form is more efficient on powerpc,
> shall we take this into consideration when canonicalization?

In the powerpc backend, the function rs6000_emit_vector_cond_expr only inverts
unordered comparisons.  If I understand correctly, it could be improved to
handle more cases like this one.  So is this a backend issue?
Thanks.
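
As a rough illustration of the kind of improvement meant here, the following C sketch models a two-lane unsigned select and computes "a >= b ? x : y" as "b > a ? y : x", which needs only one greater-than compare.  The type v2di and the helpers v2di_gtu, v2di_sel and sel_geu_via_gtu are hypothetical names for a lane-wise model; this is not the code in rs6000_emit_vector_cond_expr.

```c
#include <stdint.h>

/* Hypothetical model of a V2DI vector: two 64-bit unsigned lanes. */
typedef struct { uint64_t lane[2]; } v2di;

/* Per-lane unsigned greater-than; yields an all-ones mask where a > b. */
static v2di v2di_gtu(v2di a, v2di b)
{
    v2di m;
    for (int i = 0; i < 2; i++)
        m.lane[i] = a.lane[i] > b.lane[i] ? UINT64_MAX : 0;
    return m;
}

/* Per-lane select: picks t where the mask is set, f elsewhere. */
static v2di v2di_sel(v2di mask, v2di t, v2di f)
{
    v2di r;
    for (int i = 0; i < 2; i++)
        r.lane[i] = (mask.lane[i] & t.lane[i]) | (~mask.lane[i] & f.lane[i]);
    return r;
}

/* a >= b ? x : y, computed as b > a ? y : x.
   Valid for integer lanes because "a >= b" is exactly "!(b > a)". */
static v2di sel_geu_via_gtu(v2di a, v2di b, v2di x, v2di y)
{
    return v2di_sel(v2di_gtu(b, a), y, x);
}
```

The rewrite relies on the lanes being integers; for floating-point lanes it would only be valid when NaNs need not be honored.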

2017-01-12 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78604

--- Comment #3 from amker at gcc dot gnu.org ---
Unfortunately, the change causes worse code generation for the functions
sign_lt and uns_lt.  Taking uns_lt as an example, the difference in the
optimized dump looks like this:

529,530c529,530
<   vect_cst__32 = {val1_8(D), val1_8(D)};
<   vect_cst__33 = {val2_7(D), val2_7(D)};
---
>   vect_cst__32 = {val2_7(D), val2_7(D)};
>   vect_cst__33 = {val1_8(D), val1_8(D)};
536c536
<   vect_iftmp.412_34 = VEC_COND_EXPR ;
---
>   vect_iftmp.412_34 = VEC_COND_EXPR = vect__2.411_31, 
> vect_cst__32, vect_cst__33>;


Condition "<" is transformed into ">=" with the change, which results in more
insns after expanding:
***************
*** 1014,1024 ****
REG_EQUAL `uc'
 49: r199:V2DI=[r195:SI+r167:SI]
 50: r200:V2DI=[r197:SI+r167:SI]
!51: r202:V2DI=gtu(r200:V2DI,r199:V2DI)
!52: r201:V2DI={(r202:V2DI!=const_vector)?r170:V2DI:r171:V2DI}
!53: [r193:SI+r167:SI]=r201:V2DI
!54: r167:SI=r167:SI+0x10
!56: r203:CCUNS=cmp(r167:SI,0x2000)
!57: pc={(r203:CCUNS!=0)?L55:pc}
REG_BR_PROB 9899
!58: NOTE_INSN_BASIC_BLOCK 5
--- 1016,1028 ----
REG_EQUAL `uc'
 49: r199:V2DI=[r195:SI+r167:SI]
 50: r200:V2DI=[r197:SI+r167:SI]
!51: r202:V2DI=gtu(r199:V2DI,r200:V2DI)
!52: r203:V2DI=r199:V2DI==r200:V2DI
!53: r204:V2DI=r202:V2DI|r203:V2DI
!54: r201:V2DI={(r204:V2DI!=const_vector)?r170:V2DI:r171:V2DI}
!55: [r193:SI+r167:SI]=r201:V2DI
!56: r167:SI=r167:SI+0x10
!58: r205:CCUNS=cmp(r167:SI,0x2000)
!59: pc={(r205:CCUNS!=0)?L57:pc}
REG_BR_PROB 9899
!60: NOTE_INSN_BASIC_BLOCK 5

So "<" is expanded into instruction (51), but ">=" is expanded into
instructions (51~53).
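
The three-instruction sequence can be modelled per lane in C as below.  This is only a sketch of what the RTL above computes; lane_gtu, lane_eq and lane_geu are hypothetical names, and each lane is represented as a 64-bit mask (all ones for true, zero for false).

```c
#include <stdint.h>

/* Insn 51 in the new dump: unsigned greater-than as a lane mask. */
static uint64_t lane_gtu(uint64_t a, uint64_t b)
{
    return a > b ? UINT64_MAX : 0;
}

/* Insn 52 in the new dump: equality as a lane mask. */
static uint64_t lane_eq(uint64_t a, uint64_t b)
{
    return a == b ? UINT64_MAX : 0;
}

/* Insn 53 in the new dump: ">=" built by OR-ing the two masks. */
static uint64_t lane_geu(uint64_t a, uint64_t b)
{
    return lane_gtu(a, b) | lane_eq(a, b);
}
```

The old dump needs only the single gtu compare with swapped operands, because the select arms were already in the order that matches "<".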

So powerpc has to compute ">=" with "> || =="?  Apart from this, the change
itself only introduces additional canonicalization during tree-ifcvt,
transforming "<" to ">=".  If the former form is more efficient on powerpc,
shall we take this into consideration when canonicalizing?

Thanks.

2017-01-11 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78604

--- Comment #2 from amker at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #1)
> So, is what gcc trunk generates less efficient than what it used to generate
> before, or is just different?  If the latter, surely the test should be
> adjusted.

I will check it.  Sorry for missing this.  I suppose that revision is mine.

Thanks.

2017-01-11 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78604

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
So, is what gcc trunk generates less efficient than what it used to generate
before, or is just different?  If the latter, surely the test should be
adjusted.

2016-11-30 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78604

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |7.0