https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119702
--- Comment #16 from Surya Kumari Jangala <jskumari at gcc dot gnu.org> ---
(In reply to Segher Boessenkool from comment #13)
> (In reply to Surya Kumari Jangala from comment #12)
> > Ok. We also need to tackle the original issue, which is that a shift left
> > can be optimized by generating a vector add. Perhaps tackle this issue
> > first?
>
> Yup.
>
> Most of it is handled by generic things already: shifts are always better
> than mults. And in most cases additions are faster than shifts (or you can
> do more of them concurrently or similar), so in many cases they are
> preferred, but that is not so super obvious already. You might be able to
> do four adds concurrently, but you might be able to do two shifts
> concurrently additionally, so it all depends on what other code there is
> what works best there, and for what works best usually you have to look at
> the usual instruction mix.
>
> There is no canonical representation of this in RTL, either: both x+x and
> x<<1 are fine.
>
> So, if we really care, we should have patterns for both representations
> in our backend, and generate whatever is the best code (for the selected
> CPU!)
>
> In practice both are acceptable, so as long as we get the best code in the
> common cases, we should be happy already :-)

With the testcase in the "Description", we are seeing both a splat and a shift
being generated. Instead, a single add instruction is more efficient.
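
For illustration only, here is a minimal sketch of the two equivalent source forms discussed above. It is not the testcase from the "Description", which is not reproduced in this comment. The type name `v4su` and the function names are hypothetical, and the sketch assumes GCC's generic vector extensions with unsigned lanes so that wraparound is well defined. Which instructions a given compiler version emits for each form depends on the target and the selected CPU.

```c
#include <assert.h>

/* Assumption: GCC/Clang generic vector extension, four 32-bit unsigned
   lanes in a 16-byte vector.  All names here are illustrative only.  */
typedef unsigned int v4su __attribute__ ((vector_size (16)));

/* Shift form: every lane shifted left by one.  The scalar shift count
   is applied to each lane.  */
v4su
shl1 (v4su x)
{
  return x << 1;
}

/* Add form: every lane added to itself.  For unsigned lanes this
   computes the same value as the shift, modulo 2^32.  */
v4su
add_self (v4su x)
{
  return x + x;
}

/* Return 1 if all four lanes of A and B are equal, else 0.  */
int
lanes_equal (v4su a, v4su b)
{
  for (int i = 0; i < 4; i++)
    if (a[i] != b[i])
      return 0;
  return 1;
}

/* Check that the two forms agree for the given input.  */
void
check_shift_is_add (v4su x)
{
  assert (lanes_equal (shl1 (x), add_self (x)));
}
```

Since both functions compute the same value, comparing the assembly generated for `shl1` and `add_self` (for example with `-O2 -S`) is one way to see whether the shift form ends up as a splat plus a shift or as a single add on a particular target.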