Hi Richard,
On 22/06/2026 18:36, Christopher Bazley wrote:
Hi Richard,
On 22/06/2026 13:00, Christopher Bazley wrote:
Hi Richard,
Thanks for reviewing my patch again.
On 09/06/2026 10:33, Richard Biener wrote:
Which raises the question - we agreed on how to handle VLA vector
CONSTRUCTORs, but the VLA VECTOR_CST representation does not
have sth equivalent here?
In this patch, the equivalent text is simply "Omitted elements are
implicitly zero" in the description of gimple_build_vector_from_elems.
I previously updated the description of the pattern vec_init in md.texi:
https://forge.sourceware.org/gcc/gcc-TEST/pulls/177/files
"If @var{m} specifies a scalable vector mode, then operand 1 only
specifies the minimum number of elements implied by @var{m} and
elements beyond are zero initialized."
I also previously added this to the description of the
store_constructor function:
"If the constructor EXP has a vector type then elements of TARGET for
which there is no corresponding element in EXP are zero'd. For a
variable-length vector type, only elements up to the minimum number of
subparts of the type are explicitly zero'd; any elements beyond that
are implicitly zero."
Maybe something similar should be added to the description of
make_vector(), since that seems to be the sole origin of VECTOR_CST.
The description of CONSTRUCTOR in gcc/doc/generic.texi already says
"You should not assume that all fields will be represented.
Unrepresented fields will be cleared (zeroed), unless the
CONSTRUCTOR_NO_CLEARING flag is set, in which case their value becomes
undefined."
What I propose is to extend the description of @item VECTOR_CST in
gcc/doc/generic.texi. Something like this:
"Only the minimum number of elements required for a scalable vector
constant need be represented.
Unrepresented elements of a scalable vector constant will be cleared
(zeroed)."
The more I think about it whilst working on possible wording, the more I
think it would be a mistake to overconstrain the specification of
VECTOR_CST.
As far as I can tell, the existing description of VECTOR_CST in
gcc/doc/generic.texi already fully specifies the VECTOR_CST representation. My
understanding is that "an arbitrary-length sequence" is meant to convey
that there is no length information implicit in a VECTOR_CST.
It's tempting to add constraints such as "at least the minimum number of
elements required for a scalable vector must be represented" or "no more
than the minimum number of elements... must be represented", but both
statements seem meaningless to me because (if I understand correctly)
any encoding selected by choosing the values of VECTOR_CST_NPATTERNS and
VECTOR_CST_NELTS_PER_PATTERN inevitably represents the values of all
possible elements of a vector of VLA type.
Let's suppose we want to encode a constant
{ 13, 2, 156, 212, 1, 21, 0, 0 }, so we use the following interleaved
sequences to represent the minimum number of subparts of a vector type
such as VNx8HI:
Correction: the following interleaved sequences actually explicitly
represent *twice* the minimum number of subparts of VNx8HI, i.e. 16.
It would be possible to use fewer encoded elements only if implicit
values could be decoded from the patterns (for example, because the
implicit values form a zero tail when VECTOR_CST_NELTS_PER_PATTERN == 2,
or because they repeat an earlier sequence when
VECTOR_CST_NELTS_PER_PATTERN == 1).
{ 13, 0 }
{ 2, 0 }
{ 156, 0 }
{ 212, 0 }
{ 1, 0 }
{ 21, 0 }
{ 0, 0 }
{ 0, 0 }
where the sequences are represented by the following patterns:
base0 == 13, base1 == 0, step == 0
base0 == 2, base1 == 0, step == 0
base0 == 156, base1 == 0, step == 0
base0 == 212, base1 == 0, step == 0
base0 == 1, base1 == 0, step == 0
base0 == 21, base1 == 0, step == 0
base0 == 0, base1 == 0, step == 0
base0 == 0, base1 == 0, step == 0
VECTOR_CST_NPATTERNS == 8
VECTOR_CST_NELTS_PER_PATTERN == 2
The vector is encoded using the first 32 elements with any remaining
elements the vector might have being implicit extensions of them.
Sorry, I meant the first 16 elements, not the first 32 elements.
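To make the decoding rule concrete, here is a minimal standalone sketch of how I understand elements to be recovered from the interleaved encoding. It is a toy model, not GCC code: decode_element is a hypothetical helper, plain integers stand in for trees, and it does not model any constraint GCC itself places on the pattern count.

```python
def decode_element(encoded, npatterns, nelts_per_pattern, i):
    """Return element i of the vector described by an interleaved encoding.

    Hypothetical model: 'encoded' holds npatterns * nelts_per_pattern values,
    with pattern p's k-th element stored at index k * npatterns + p.
    """
    assert len(encoded) == npatterns * nelts_per_pattern
    pattern = i % npatterns
    count = i // npatterns
    if count < nelts_per_pattern:
        # The element is explicitly encoded.
        return encoded[count * npatterns + pattern]
    final = encoded[(nelts_per_pattern - 1) * npatterns + pattern]
    if nelts_per_pattern <= 2:
        # One or two elements per pattern: the last encoded value repeats.
        return final
    # Three elements per pattern: continue the series with a constant step.
    prev = encoded[(nelts_per_pattern - 2) * npatterns + pattern]
    return final + (count - (nelts_per_pattern - 1)) * (final - prev)


# Eight patterns of two elements each, i.e. 16 encoded elements.
first_eight = [13, 2, 156, 212, 1, 21, 0, 0]
encoded = first_eight + [0] * 8

decoded = [decode_element(encoded, 8, 2, i) for i in range(32)]
print(decoded[:8])       # the explicitly chosen leading values
print(set(decoded[8:]))  # everything after them
```

In this model, elements 16 to 31 are not stored anywhere; they fall out of repeating the second element of each pattern.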
But now, let's imagine the programmer makes a mistake and accidentally
omits the last pattern, so that VECTOR_CST_NPATTERNS == 7:
{ 13, 0 }
{ 2, 0 }
{ 156, 0 }
{ 212, 0 }
{ 1, 0 }
{ 21, 0 }
{ 0, 0 }
Now, the minimum number of elements required for a scalable vector is
still represented. The elements don't even have different values,
because the second element of even a single pattern such as { 13, 0 }
is repeated indefinitely.
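A quick way to check that claim is to decode both encodings with a toy model and compare the results. This is only a sketch under my own assumptions: decode_element is a hypothetical helper restricted to at most two elements per pattern, integers stand in for trees, and it deliberately ignores whether GCC would accept a pattern count of 7.

```python
def decode_element(encoded, npatterns, nelts_per_pattern, i):
    """Hypothetical decoder for encodings with 1 or 2 elements per pattern."""
    assert nelts_per_pattern in (1, 2)
    assert len(encoded) == npatterns * nelts_per_pattern
    pattern = i % npatterns
    # Beyond the encoded elements, the last element of the pattern repeats.
    count = min(i // npatterns, nelts_per_pattern - 1)
    return encoded[count * npatterns + pattern]


# Intended encoding: 8 patterns of { value, 0 }.
eight = [13, 2, 156, 212, 1, 21, 0, 0] + [0] * 8
# Encoding with the last pattern accidentally omitted: 7 patterns.
seven = [13, 2, 156, 212, 1, 21, 0] + [0] * 7

from_eight = [decode_element(eight, 8, 2, i) for i in range(64)]
from_seven = [decode_element(seven, 7, 2, i) for i in range(64)]
print(from_eight == from_seven)
```

Element 7 comes from the second element of the first pattern in the 7-pattern case, which is zero, so the decoded values agree even though the interleaving differs.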
It follows that, in cases like this, there can be more efficient
encodings of a VECTOR_CST than those currently created by
build_vector_from_ctor. Currently, build_vector_from_ctor always
encodes twice the minimum number of subparts of a variable-length vector
type. It seems to me that it would be sufficient to encode twice the
number of elements in the vector of CONSTRUCTOR_ELT, because
build_vector_from_ctor sets the second element of each pattern to zero
and those zeros repeat an infinite number of times.
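As an illustration of the possible saving, the sketch below compares the two encoding sizes for a constructor with six elements. It is a toy model with hypothetical helper names (encode_from_elems, decode_element), not what build_vector_from_ctor does today, and it assumes a minimum subpart count of 8 as for VNx8HI. It also ignores any restriction GCC may place on the number of patterns.

```python
def encode_from_elems(elems, npatterns):
    """Hypothetical encoder: npatterns patterns of { value, 0 }, interleaved.

    Patterns beyond the supplied elements get a first element of zero.
    """
    firsts = list(elems) + [0] * (npatterns - len(elems))
    return firsts + [0] * npatterns


def decode_element(encoded, npatterns, i):
    """Decode element i, assuming two elements per pattern."""
    pattern = i % npatterns
    count = min(i // npatterns, 1)  # the second element repeats indefinitely
    return encoded[count * npatterns + pattern]


ctor_elems = [13, 2, 156, 212, 1, 21]  # six explicit constructor elements
min_subparts = 8                       # assumed minimum for VNx8HI

current = encode_from_elems(ctor_elems, min_subparts)     # 16 encoded values
compact = encode_from_elems(ctor_elems, len(ctor_elems))  # 12 encoded values

same = all(
    decode_element(current, min_subparts, i)
    == decode_element(compact, len(ctor_elems), i)
    for i in range(64)
)
print(len(current), len(compact), same)
```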
I don't think optimising the encoding of VECTOR_CST is necessary to
enable BB SLP with predicated tails though.
--
Christopher Bazley
Staff Software Engineer, GNU Tools Team.
Arm Ltd, 110 Fulbourn Road, Cambridge, CB1 9NJ, UK.
http://www.arm.com/