"Andre Vieira (lists)" <andre.simoesdiasvie...@arm.com> writes:
> [...] The question at hand 
> here is, what can the vectorizer use for a specific loop. If we are 
> using Advanced SIMD modes then it needs to call an Advanced SIMD clone, 
> and if we are using SVE modes then it needs to call an SVE clone. At 
> least until we support the ABI conversion, because like I said for an 
> unpacked argument they behave differently.

Probably also worth noting that multi-byte elements are laid out
differently for big-endian.  E.g. V4SI is loaded as a 128-bit integer
whereas VNx4SI is loaded as an array of 4 32-bit integers, with the
first 32-bit integer going in the least significant bits of the register.

So it would only be possible to use Advanced SIMD clones for SVE modes
(and vice versa) on little-endian, or if the elements are all bytes,
or if we add some reverses to the inputs and outputs.
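
As a rough illustration of the 32-bit case described above, here is a
host-side C model. It is only a sketch, not GCC or ABI code: the function
names and the "chunk" view of a register (chunk j = bits [32j, 32j+31])
are assumptions made up for this example. It stores four 32-bit elements
to memory in big-endian byte order and then forms the register contents
both ways: as one 128-bit big-endian integer, and as an array of four
32-bit integers with the first element in the least significant bits.

```c
#include <stdint.h>

#define NELTS 4

/* Read one big-endian 32-bit integer from P. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
           | ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Store ELTS to MEM as an array of big-endian 32-bit integers. */
static void store_be32_array(const uint32_t *elts, uint8_t *mem)
{
    for (int i = 0; i < NELTS; i++)
        for (int b = 0; b < 4; b++)
            mem[4 * i + b] = (uint8_t)(elts[i] >> (8 * (3 - b)));
}

/* Model of the "128-bit integer" load: the byte at the lowest address
   is the most significant byte of the register, so the least significant
   32-bit chunk comes from the last four bytes of MEM.  */
static void load_as_int128(const uint8_t *mem, uint32_t *chunk)
{
    for (int j = 0; j < NELTS; j++)
        chunk[j] = read_be32(mem + 12 - 4 * j);
}

/* Model of the "array of 4 32-bit integers" load: element 0 goes in
   the least significant 32 bits, element 1 in the next 32, and so on.  */
static void load_as_array(const uint8_t *mem, uint32_t *chunk)
{
    for (int j = 0; j < NELTS; j++)
        chunk[j] = read_be32(mem + 4 * j);
}

/* Return 1 if, for ELTS, the two models differ exactly by a reversal
   of the 32-bit chunks.  */
static int lanes_are_reversed(const uint32_t *elts)
{
    uint8_t mem[4 * NELTS];
    uint32_t q[NELTS], z[NELTS];

    store_be32_array(elts, mem);
    load_as_int128(mem, q);
    load_as_array(mem, z);
    for (int j = 0; j < NELTS; j++)
        if (q[j] != z[NELTS - 1 - j])
            return 0;
    return 1;
}
```

In this model each 32-bit value is intact in both layouts but sits in the
opposite chunk, which is the kind of mismatch that a reverse on the
inputs and outputs would have to undo.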

Richard
