Hi Richard,

On 22/06/2026 18:36, Christopher Bazley wrote:
Hi Richard,

On 22/06/2026 13:00, Christopher Bazley wrote:
Hi Richard,

Thanks for reviewing my patch again.

On 09/06/2026 10:33, Richard Biener wrote:
Which raises the question - we agreed on how to handle VLA vector
CONSTRUCTORs, but the VLA VECTOR_CST representation does not
have sth equivalent here?

In this patch, the equivalent text is simply "Omitted elements are implicitly zero" in the description of gimple_build_vector_from_elems.

I previously updated the description of the pattern vec_init in md.texi:
https://forge.sourceware.org/gcc/gcc-TEST/pulls/177/files

"If @var{m} specifies a scalable vector mode, then operand 1 only specifies the minimum number of elements implied by @var{m} and elements beyond are zero initialized."

I also previously added this to the description of the store_constructor function:

"If the constructor EXP has a vector type then elements of TARGET for which there is no corresponding element in EXP are zero'd.  For a variable-length vector type, only elements up to the minimum number of subparts of the type are explicitly zero'd; any elements beyond that are implicitly zero."


Maybe something similar should be added to the description of make_vector(), since that seems to be the sole origin of VECTOR_CST.

The description of CONSTRUCTOR in gcc/doc/generic.texi already says "You should not assume that all fields will be represented. Unrepresented fields will be cleared (zeroed), unless the CONSTRUCTOR_NO_CLEARING flag is set, in which case their value becomes undefined."

What I propose is to extend the description of @item VECTOR_CST in
gcc/doc/generic.texi. Something like this:

"Only the minimum number of elements required for a scalable vector constant need be represented.

Unrepresented elements of a scalable vector constant will be cleared (zeroed)."

The more I think about it whilst working on possible wording, the more I think it would be a mistake to overconstrain the specification of VECTOR_CST.

As far as I can tell, the existing description of VECTOR_CST in gcc/doc/generic.texi already fully specifies the VECTOR_CST representation. My understanding is that "an arbitrary-length sequence" is meant to convey that there is no length information implicit in a VECTOR_CST.

It's tempting to add constraints such as "at least the minimum number of elements required for a scalable vector must be represented" or "no more than the minimum number of elements... must be represented", but both statements seem meaningless to me because (if I understand correctly) any encoding selected by choosing the values of VECTOR_CST_NPATTERNS and VECTOR_CST_NELTS_PER_PATTERN inevitably represents the values of all possible elements of a vector of VLA type.

Let's suppose we want to encode a constant { 13, 2, 156, 212, 1, 21, 0, 0 } so we use the following interleaved sequences to represent the minimum number of subparts of a vector type such as VNx8HI:

Correction: the following interleaved sequences actually explicitly represent *twice* the minimum number of subparts of VNx8HI, i.e. 16.

It would be possible to use fewer encoded elements only if implicit values could be decoded from the patterns (for example, because the implicit values form a zero tail when VECTOR_CST_NELTS_PER_PATTERN == 2, or because they repeat an earlier sequence when VECTOR_CST_NELTS_PER_PATTERN == 1).

{ 13, 0 }
{ 2, 0 }
{ 156, 0 }
{ 212, 0 }
{ 1, 0 }
{ 21, 0 }
{ 0, 0 }
{ 0, 0 }

where the sequences are represented by the following patterns:

base0 == 13, base1 == 0, step == 0
base0 == 2, base1 == 0, step == 0
base0 == 156, base1 == 0, step == 0
base0 == 212, base1 == 0, step == 0
base0 == 1, base1 == 0, step == 0
base0 == 21, base1 == 0, step == 0
base0 == 0, base1 == 0, step == 0
base0 == 0, base1 == 0, step == 0

VECTOR_CST_NPATTERNS == 8
VECTOR_CST_NELTS_PER_PATTERN == 2

The vector is encoded using the first 32 elements with any remaining elements the vector might have being implicit extensions of them.

Sorry, I meant the first 16 elements, not the first 32 elements.
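
To make the decoding rule concrete, here is a rough Python model of how an element index maps onto the interleaved encoding described above. It is only a sketch based on my reading of the VECTOR_CST description: decode_elt is a hypothetical name, not a GCC function, plain Python integers stand in for tree constants, and the model does not try to enforce whatever validity constraints GCC itself places on an encoding.

```python
def decode_elt(encoded, npatterns, nelts_per_pattern, i):
    """Return element i of the vector described by an interleaved encoding.

    Hypothetical helper for illustration only; it is not GCC code.
    """
    assert len(encoded) == npatterns * nelts_per_pattern
    if i < len(encoded):
        return encoded[i]            # explicitly encoded element
    pattern = i % npatterns          # which interleaved pattern owns element i
    count = i // npatterns           # position of element i within that pattern
    final = len(encoded) - npatterns + pattern  # last encoded element of the pattern
    if nelts_per_pattern < 3:
        return encoded[final]        # no step: the last encoded value repeats
    # Stepped pattern: extrapolate from the last two encoded elements.
    step = encoded[final] - encoded[final - npatterns]
    return encoded[final] + (count - 2) * step


# The 8-pattern, 2-elements-per-pattern encoding from the example above:
# one row of base0 values followed by one row of base1 values.
encoded = [13, 2, 156, 212, 1, 21, 0, 0] + [0] * 8
first_32 = [decode_elt(encoded, 8, 2, i) for i in range(32)]
print(first_32)
```

In this model, only the first 16 elements are stored; every index beyond them resolves to the base1 value of its pattern, which is zero here.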

But now, let's imagine the programmer makes a mistake and accidentally omits the last pattern, so that VECTOR_CST_NPATTERNS == 7:

{ 13, 0 }
{ 2, 0 }
{ 156, 0 }
{ 212, 0 }
{ 1, 0 }
{ 21, 0 }
{ 0, 0 }

Now, the minimum number of elements required for a scalable vector is still represented. The elements don't even have different values, because the second element of even a single pattern, such as { 13, 0 }, is repeated indefinitely.
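
The equivalence of the 8-pattern and 7-pattern encodings can be checked with a small Python sketch that expands the patterns round-robin. The expand generator is a hypothetical helper of my own, it assumes every pattern is a (base0, base1) pair with a step of zero, and it deliberately ignores any restrictions GCC places on the pattern count.

```python
from itertools import islice


def expand(patterns):
    """Yield vector elements by interleaving the patterns round-robin.

    Each pattern is a (base0, base1) pair with an implicit step of zero,
    so base1 is produced for every element of the pattern after the first.
    """
    count = 0
    while True:
        for base0, base1 in patterns:
            yield base0 if count == 0 else base1
        count += 1


eight = [(13, 0), (2, 0), (156, 0), (212, 0), (1, 0), (21, 0), (0, 0), (0, 0)]
seven = eight[:-1]  # the last pattern accidentally omitted

n = 64  # arbitrary number of elements to compare
same = list(islice(expand(eight), n)) == list(islice(expand(seven), n))
print(same)
```

With seven patterns, the eighth element is simply the base1 value of the first pattern, which is the same zero that the omitted pattern would have supplied.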

It follows that, in cases like this, there can be more efficient encodings of a VECTOR_CST than those currently created by build_vector_from_ctor. Currently, build_vector_from_ctor always encodes twice the minimum number of subparts of a variable-length vector type. It seems to me that it would be sufficient to encode twice the number of elements in the vector of CONSTRUCTOR_ELT, because build_vector_from_ctor sets the second element of each pattern to zero and those zeros repeat an infinite number of times.
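
As a rough illustration of the size difference, the following Python sketch builds both encodings for the same constant and compares them. All names are hypothetical, the six-element list stands in for a CONSTRUCTOR that omits the trailing zeros, the minimum of 8 subparts is taken from the VNx8HI example, and the model does not check whether GCC would accept the smaller pattern count.

```python
def build_encoding(ctor_elts, npatterns):
    """Encode with 2 elements per pattern: base0 values, then a row of zeros."""
    base0_row = list(ctor_elts) + [0] * (npatterns - len(ctor_elts))
    return base0_row + [0] * npatterns


def decode(encoded, npatterns, i):
    """Decode element i, assuming exactly 2 encoded elements per pattern."""
    if i < len(encoded):
        return encoded[i]
    return encoded[npatterns + i % npatterns]  # base1 of the owning pattern


ctor_elts = [13, 2, 156, 212, 1, 21]  # trailing zeros omitted (assumption)
min_subparts = 8                      # minimum subparts of VNx8HI

current = build_encoding(ctor_elts, min_subparts)     # 2 * minimum subparts
proposed = build_encoding(ctor_elts, len(ctor_elts))  # 2 * number of ctor elements

# Both encodings should describe the same vector at every index checked.
equal = all(decode(current, min_subparts, i) == decode(proposed, len(ctor_elts), i)
            for i in range(64))
print(len(current), len(proposed), equal)
```

Under these assumptions the smaller encoding stores 12 elements instead of 16 while decoding to the same values.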

I don't think optimising the encoding of VECTOR_CST is necessary to enable BB SLP with predicated tails though.

--
Christopher Bazley
Staff Software Engineer, GNU Tools Team.
Arm Ltd, 110 Fulbourn Road, Cambridge, CB1 9NJ, UK.
http://www.arm.com/
