[Bug rtl-optimization/87678] Redundant vmovss with -fPIC

2018-11-03 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87678 --- Comment #12 from Uroš Bizjak ---
(In reply to Segher Boessenkool from comment #11)
> Should LRA do this? Shouldn't it be done earlier? Or later, in a peephole
> for example?
I think that combine should do this propagation, if the

[Bug rtl-optimization/87678] Redundant vmovss with -fPIC

2018-11-03 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87678 --- Comment #11 from Segher Boessenkool --- Should LRA do this? Shouldn't it be done earlier? Or later, in a peephole for example?

[Bug rtl-optimization/87678] Redundant vmovss with -fPIC

2018-11-03 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87678 --- Comment #10 from Uroš Bizjak ---
(In reply to Segher Boessenkool from comment #9)
> Ah. So you want this optimisation (which is currently done by LRA) to be done
> by combine as well; it's not that the resulting assembler code for this

[Bug rtl-optimization/87678] Redundant vmovss with -fPIC

2018-11-02 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87678 --- Comment #9 from Segher Boessenkool --- Ah. So you want this optimisation (which is currently done by LRA) to be done by combine as well; it's not that the resulting assembler code for this testcase is worse than what you'd like to see. And

[Bug rtl-optimization/87678] Redundant vmovss with -fPIC

2018-11-02 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87678 --- Comment #8 from Uroš Bizjak ---
(In reply to Segher Boessenkool from comment #7)
> It's not clear to me what you would have liked it to do instead?
The loads from constant memory pools always have REG_EQUAL of a relevant constant attached

[Bug rtl-optimization/87678] Redundant vmovss with -fPIC

2018-11-02 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87678 --- Comment #7 from Segher Boessenkool --- Hi Uros, It's not clear to me what you would have liked it to do instead?

[Bug rtl-optimization/87678] Redundant vmovss with -fPIC

2018-11-02 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87678 --- Comment #6 from Uroš Bizjak ---
Here is a bit simpler testcase:
--cut here--
typedef float __v4sf __attribute__((__vector_size__ (16)));
__v4sf foo (__v4sf x)
{
  return x + (__v4sf){ 2.3f, 2.3f, 2.3f, 2.3f };
}
--cut here--
"cc1 -O2" on
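For readers who want to try the comment #6 testcase locally, here is a minimal sketch that wraps it in a small correctness check. Only `foo` and the `__v4sf` typedef come from the report; `lanes_all_equal` and `self_check` are hypothetical helpers added for illustration. The check covers only the arithmetic result and says nothing about which instructions the compiler emits; inspecting the assembly (for example with `gcc -O2 -S`) is still needed to see the load that the thread discusses.

```c
#include <assert.h>

/* Vector type from the testcase: four packed floats (GCC/Clang vector extension). */
typedef float __v4sf __attribute__((__vector_size__ (16)));

/* The function from comment #6: adds the constant 2.3f to every lane. */
__v4sf foo (__v4sf x)
{
  return x + (__v4sf){ 2.3f, 2.3f, 2.3f, 2.3f };
}

/* Hypothetical helper (not in the report): returns 1 if all four lanes
   of v compare equal to expected, 0 otherwise. */
int lanes_all_equal (__v4sf v, float expected)
{
  for (int i = 0; i < 4; i++)
    if (v[i] != expected)
      return 0;
  return 1;
}

/* Hypothetical self-check (not in the report): adding the constant to an
   all-zero vector must give back the constant itself in every lane. */
void self_check (void)
{
  __v4sf zeros = { 0.0f, 0.0f, 0.0f, 0.0f };
  assert (lanes_all_equal (foo (zeros), 2.3f));
}
```

Calling `self_check ()` from a small `main` is enough to exercise it; the all-zero input is chosen so that the expected value is exact and the comparison does not depend on rounding.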

[Bug rtl-optimization/87678] Redundant vmovss with -fPIC

2018-11-02 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87678 --- Comment #5 from Uroš Bizjak ---
(In reply to Segher Boessenkool from comment #4)
> It tries twice, first just the substitution, and then that modified with
> the REG_EQUAL. You know a mem is not often valid in the resulting insn,
> but

[Bug rtl-optimization/87678] Redundant vmovss with -fPIC

2018-11-02 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87678 --- Comment #4 from Segher Boessenkool --- It tries twice, first just the substitution, and then that modified with the REG_EQUAL. You know a mem is not often valid in the resulting insn, but combine doesn't, and that is not the same thing as

[Bug rtl-optimization/87678] Redundant vmovss with -fPIC

2018-11-02 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87678 --- Comment #3 from Uroš Bizjak ---
Another similar problem:
__m128 bar (__m128 x)
{
  return x + _mm_set1_ps (2.3f);
}
gcc -O2 -msse2 creates following _combine dump:
--cut here--
Trying 6 -> 7:
    6: r85:V4SF=[`*.LC0']
      REG_EQUAL

[Bug rtl-optimization/87678] Redundant vmovss with -fPIC

2018-10-22 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87678 --- Comment #2 from Segher Boessenkool --- This is a much more general problem in combine. In general it only tries once, and it only tries the fully simplified form, including known bit values etc.