"Andre Vieira (lists)" <andre.simoesdiasvie...@arm.com> writes:
> [...] The question at hand
> here is, what can the vectorizer use for a specific loop. If we are
> using Advanced SIMD modes then it needs to call an Advanced SIMD clone,
> and if we are using SVE modes then it needs to call an SVE clone. At
> least until we support the ABI conversion, because like I said for an
> unpacked argument they behave differently.

Probably also worth noting that multi-byte elements are laid out
differently for big-endian.  E.g. V4SI is loaded as a 128-bit integer
whereas VNx4SI is loaded as an array of 4 32-bit integers, with the
first 32-bit integer going in the least significant bits of the
register.  So it would only be possible to use Advanced SIMD clones
for SVE modes and vice versa for little-endian, or if the elements
are all bytes, or if we add some reverses to the inputs and outputs.

Richard
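
To make the big-endian layout difference for 32-bit elements concrete, here is a rough C model. It is only a sketch, not GCC code: the `reg128` type and all function names are hypothetical, and a 128-bit register is simplified to four 32-bit lanes with `lane[0]` standing for the least significant 32 bits. It assumes that a big-endian load of the whole vector as one 128-bit integer puts the first memory element in the most significant lane, while an array-style load puts it in the least significant lane, as described above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NELTS 4

/* Hypothetical model of a 128-bit vector register:
   lane[0] is the least significant 32 bits, lane[3] the most significant.  */
typedef struct { uint32_t lane[NELTS]; } reg128;

/* Model of a V4SI-style load on big-endian: the vector is read as a single
   128-bit integer, so the element at the lowest address ends up in the
   most significant lane.  */
reg128 load_as_int128_be(const uint32_t mem[NELTS])
{
    reg128 r;
    for (int i = 0; i < NELTS; i++)
        r.lane[NELTS - 1 - i] = mem[i];
    return r;
}

/* Model of a VNx4SI-style load: the vector is read as an array of 32-bit
   integers, so the first element goes in the least significant lane.  */
reg128 load_as_array(const uint32_t mem[NELTS])
{
    reg128 r;
    for (int i = 0; i < NELTS; i++)
        r.lane[i] = mem[i];
    return r;
}

/* The kind of "reverse" that would be needed on inputs and outputs to
   convert between the two layouts in this model.  */
reg128 reverse_lanes(reg128 r)
{
    reg128 out;
    for (int i = 0; i < NELTS; i++)
        out.lane[i] = r.lane[NELTS - 1 - i];
    return out;
}

/* Return nonzero if both modelled registers hold the same lanes.  */
int same_reg(reg128 a, reg128 b)
{
    return memcmp(a.lane, b.lane, sizeof a.lane) == 0;
}
```

In this model, loading the same four integers both ways gives registers that differ until one of them is passed through `reverse_lanes`, which is why a clone built for one layout could not be called directly with values held in the other.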