[Bug target/109499] Unnecessary zeroing in SVE loops
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109499

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2024-02-22
                 CC|                            |pinskia at gcc dot gnu.org

--- Comment #5 from Andrew Pinski ---
Confirmed.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109499

--- Comment #4 from rsandifo at gcc dot gnu.org ---
(In reply to rguent...@suse.de from comment #3)
> AVX512 masking allows merge and zero modes, zero being cheaper
> (obviously). I think "zero" is what all targets support so we could
> define GIMPLE to be that way - inactive lanes become zero. That's
> then also less of a "partial definition" and "undefined" should be
> avoided at best?

Thanks, sounds good to me. If direct support for merging turns out to be
useful in future, maybe we could add the value of inactive lanes as an
extra parameter at that point. Would be quite an invasive change, but it
would just be work.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109499

--- Comment #3 from rguenther at suse dot de ---
On Thu, 13 Apr 2023, rsandifo at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109499
>
> --- Comment #2 from rsandifo at gcc dot gnu.org ---
> (In reply to Richard Biener from comment #1)
> > Is there not enough info to catch this on the RTL level with a peephole?
> That works for simple cases like the first loop. But in general, I think we
> want the full power of gimple to push the information down. The second loop is
> one example of that, but in general, there could be a chain of operations that
> naturally do the right thing for inactive lanes.

AVX512 masking allows merge and zero modes, zero being cheaper
(obviously). I think "zero" is what all targets support so we could
define GIMPLE to be that way - inactive lanes become zero. That's
then also less of a "partial definition" and "undefined" should be
avoided at best?
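For readers less familiar with the two masking modes mentioned in comment #3, the following is a minimal scalar sketch in C of what "zero" and "merge" mean for the inactive lanes of a masked add. It is only a model of the semantics, not GCC code and not an intrinsic API: the names masked_add_zero, masked_add_merge, LANES and the fallback parameter are hypothetical and chosen for illustration. The fallback argument loosely corresponds to the "value of inactive lanes as an extra parameter" idea floated in comment #4.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 4 /* hypothetical vector length, for illustration only */

/* "Zero" mode: active lanes get a[i] + b[i], inactive lanes become 0. */
static void masked_add_zero(int32_t *dst, const int32_t *a,
                            const int32_t *b, const bool *mask)
{
    for (size_t i = 0; i < LANES; i++)
        dst[i] = mask[i] ? a[i] + b[i] : 0;
}

/* "Merge" mode: active lanes get a[i] + b[i], inactive lanes are taken
   from a caller-supplied fallback vector (hypothetical extra operand). */
static void masked_add_merge(int32_t *dst, const int32_t *a,
                             const int32_t *b, const bool *mask,
                             const int32_t *fallback)
{
    for (size_t i = 0; i < LANES; i++)
        dst[i] = mask[i] ? a[i] + b[i] : fallback[i];
}
```

Under this model the zero form needs no extra input, whereas the merge form carries a dependency on a further vector, which is one plausible reading of why zeroing is described as the cheaper mode.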
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109499

--- Comment #2 from rsandifo at gcc dot gnu.org ---
(In reply to Richard Biener from comment #1)
> Is there not enough info to catch this on the RTL level with a peephole?

That works for simple cases like the first loop. But in general, I think we
want the full power of gimple to push the information down. The second loop is
one example of that, but in general, there could be a chain of operations that
naturally do the right thing for inactive lanes.
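The loops from the original report are not reproduced in this thread, so the following is a hypothetical scalar sketch in C of the kind of chain comment #2 seems to describe: a predicated load that already writes zero to inactive lanes, followed by a multiply and a sum reduction. All names (masked_load_zero, sum_with_select, sum_without_select, LANES) are invented for illustration. The point is that once the load has zeroed the inactive lanes, the multiply keeps them at zero and the reduction ignores them, so an explicit select-to-zero before the reduction changes nothing.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 4 /* hypothetical vector length, for illustration only */

/* Model of a predicated load that writes 0 to inactive lanes. */
static void masked_load_zero(uint32_t *dst, const uint32_t *src,
                             const bool *mask)
{
    for (size_t i = 0; i < LANES; i++)
        dst[i] = mask[i] ? src[i] : 0;
}

/* Reduction with an explicit (redundant) zeroing of inactive lanes. */
static uint32_t sum_with_select(const uint32_t *src, const bool *mask)
{
    uint32_t v[LANES];
    uint32_t acc = 0;
    masked_load_zero(v, src, mask);
    for (size_t i = 0; i < LANES; i++) {
        uint32_t lane = mask[i] ? v[i] * 3u : 0u; /* explicit zeroing */
        acc += lane;
    }
    return acc;
}

/* Same reduction relying on the zeros the load already produced. */
static uint32_t sum_without_select(const uint32_t *src, const bool *mask)
{
    uint32_t v[LANES];
    uint32_t acc = 0;
    masked_load_zero(v, src, mask);
    for (size_t i = 0; i < LANES; i++)
        acc += v[i] * 3u; /* inactive lanes are 0, and 0 * 3 == 0 */
    return acc;
}
```

Proving that the select is redundant requires knowing what each operation in the chain does to inactive lanes, which is presumably easier to track across several gimple statements than in an RTL peephole that only sees adjacent instructions.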
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109499

--- Comment #1 from Richard Biener ---
Is there not enough info to catch this on the RTL level with a peephole?