[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2021-05-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 Bug 89071 depends on bug 87007, which changed state. Bug 87007 Summary: [8 Regression] 10% slowdown with -march=skylake-avx512 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87007 What|Removed |Added

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-02-22 Thread peter at cordes dot ca
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 --- Comment #22 from Peter Cordes --- Nice, that's exactly the kind of thing I suggested in bug 80571. If this covers * vsqrtss/sd (mem),%merge_into, %xmm * vpcmpeqd%same,%same, %dest# false dep on KNL / Silvermont * vcmptrueps

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-02-22 Thread hjl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 --- Comment #21 from hjl at gcc dot gnu.org --- Author: hjl Date: Fri Feb 22 15:54:08 2019 New Revision: 269119 URL: https://gcc.gnu.org/viewcvs?rev=269119=gcc=rev Log: i386: Add pass_remove_partial_avx_dependency With -mavx, for $ cat foo.i

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-02-03 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 Uroš Bizjak changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-02-03 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 Uroš Bizjak changed: What|Removed |Added Target Milestone|--- |9.0

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-02-03 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 --- Comment #19 from H.J. Lu --- (In reply to Uroš Bizjak from comment #18) > The only remaining question is on cvtsd2ss mem->xmm, where ICC goes with the > same strategy as with other non-conversion SSE unops: > >vmovsdd(%rip), %xmm0 >

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-02-03 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 --- Comment #18 from Uroš Bizjak --- The only remaining question is on cvtsd2ss mem->xmm, where ICC goes with the same strategy as with other non-conversion SSE unops: vmovsdd(%rip), %xmm0 vcvtsd2ss %xmm0, %xmm0, %xmm0 but with

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-02-03 Thread uros at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 --- Comment #17 from uros at gcc dot gnu.org --- Author: uros Date: Sun Feb 3 16:48:41 2019 New Revision: 268496 URL: https://gcc.gnu.org/viewcvs?rev=268496=gcc=rev Log: PR target/89071 * config/i386/i386.md (*sqrt2_sse): Add

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-02-03 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 --- Comment #16 from Uroš Bizjak --- (In reply to Peter Cordes from comment #15) > (In reply to Uroš Bizjak from comment #13) > > I assume that memory inputs are not problematic for SSE/AVX {R,}SQRT, RCP > > and ROUND instructions. Contrary to

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-02-01 Thread peter at cordes dot ca
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 --- Comment #15 from Peter Cordes --- (In reply to Uroš Bizjak from comment #13) > I assume that memory inputs are not problematic for SSE/AVX {R,}SQRT, RCP > and ROUND instructions. Contrary to CVTSI2S{S,D}, CVTSS2SD and CVTSD2SS, we >

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-02-01 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 Uroš Bizjak changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed|

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-02-01 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 --- Comment #13 from Uroš Bizjak --- I assume that memory inputs are not problematic for SSE/AVX {R,}SQRT, RCP and ROUND instructions. Contrary to CVTSI2S{S,D}, CVTSS2SD and CVTSD2SS, we currently don't emit XOR clear in front of these

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-01-31 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 --- Comment #12 from Uroš Bizjak --- (In reply to Peter Cordes from comment #10) > It also bizarrely uses it for VMOVSS, which gcc should only emit if it > actually wants to merge (right?). *If* this part of the patch isn't a bug > > -

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-01-31 Thread uros at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 --- Comment #11 from uros at gcc dot gnu.org --- Author: uros Date: Thu Jan 31 20:06:42 2019 New Revision: 268427 URL: https://gcc.gnu.org/viewcvs?rev=268427=gcc=rev Log: PR target/89071 * config/i386/i386.md (*extendsfdf2):

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-01-29 Thread peter at cordes dot ca
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 --- Comment #10 from Peter Cordes --- (In reply to Uroš Bizjak from comment #9) > There was similar patch for sqrt [1], I think that the approach is > straightforward, and could be applied to other reg->reg scalar insns as > well, independently

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-01-28 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 --- Comment #9 from Uroš Bizjak --- There was similar patch for sqrt [1], I think that the approach is straightforward, and could be applied to other reg->reg scalar insns as well, independently of PR87007 patch. [1]

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-01-28 Thread peter at cordes dot ca
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 --- Comment #8 from Peter Cordes --- Created attachment 45544 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45544=edit testloop-cvtss2sd.asm (In reply to H.J. Lu from comment #7) > I fixed assembly codes and run it on different AVX

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-01-28 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 --- Comment #7 from H.J. Lu --- I fixed assembly codes and run it on different AVX machines. I got similar results: ./test sse : 28346518 sse_clear: 28046302 avx : 28214775 avx2 : 28251195 avx_clear: 28092687 avx_clear:

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-01-28 Thread peter at cordes dot ca
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 --- Comment #6 from Peter Cordes --- (In reply to Peter Cordes from comment #5) > But whatever the effect is, it's totally unrelated to what you were *trying* > to test. :/ After adding a `ret` to each AVX function, all 5 are basically the same

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-01-28 Thread peter at cordes dot ca
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 --- Comment #5 from Peter Cordes --- (In reply to H.J. Lu from comment #4) > (In reply to Peter Cordes from comment #2) > > Can you show some > > asm where this performs better? > > Please try cvtsd2ss branch at: > >

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-01-28 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 --- Comment #4 from H.J. Lu --- (In reply to Peter Cordes from comment #2) > (In reply to H.J. Lu from comment #1) > > But > > > > vxorps %xmm0, %xmm0, %xmm0 > > vcvtsd2ss %xmm1, %xmm0, %xmm0 > > > > are faster than both. > >

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-01-28 Thread peter at cordes dot ca
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 --- Comment #3 from Peter Cordes --- (In reply to H.J. Lu from comment #1) I have a patch for PR 87007: > > https://gcc.gnu.org/ml/gcc-patches/2019-01/msg00298.html > > which inserts a vxorps at the last possible position. vxorps > will be

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-01-28 Thread peter at cordes dot ca
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 --- Comment #2 from Peter Cordes --- (In reply to H.J. Lu from comment #1) > But > > vxorps %xmm0, %xmm0, %xmm0 > vcvtsd2ss %xmm1, %xmm0, %xmm0 > > are faster than both. On Skylake-client (i7-6700k), I can't reproduce this

[Bug target/89071] AVX vcvtsd2ss lets us avoid PXOR dependency breaking for scalar float<->double and other scalar xmm,xmm instructions

2019-01-28 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071 H.J. Lu changed: What|Removed |Added Depends on||87007 --- Comment #1 from H.J. Lu ---