On 25/06/2026 14:00, Richard Biener wrote:
On Mon, Jun 22, 2026 at 4:00 PM Christopher Bazley <[email protected]> wrote:

On 22/06/2026 09:41, Richard Biener wrote:
On Fri, Jun 19, 2026 at 4:54 PM Christopher Bazley <[email protected]> wrote:

On 17/06/2026 08:47, Richard Biener wrote:
On Tue, Jun 16, 2026 at 5:52 PM Christopher Bazley <[email protected]> wrote:

Hi Richard,

On 09/06/2026 10:40, Richard Biener wrote:
On Wed, Jun 3, 2026 at 5:21 PM Christopher Bazley <[email protected]> wrote:
@@ -8075,7 +8078,12 @@ store_constructor (tree exp, rtx target, int cleared, poly_int64 size,
                       similarly non-const type vectors. */
                    icode = convert_optab_handler (vec_init_optab, mode, eltmode);
                  }
-
+           else
+             {
+               /* Handle variable-length vector types.  */
+               icode = convert_optab_handler (vec_init_optab, mode, eltmode);
+               const_n_elts = constant_lower_bound (n_elts);
+             }

And here I'd like to see

                gcc_assert (icode != CODE_FOR_nothing);


Unfortunately, I cannot make this requested change because it causes the
compiler to crash in an existing test case, even if I restrict the scope
of the assertion to the new 'else' block.

The 'else' block was where I wanted the assert to be, of course.


Reproducer:

make check-gcc RUNTESTFLAGS="aarch64-sve-acle.exp=cops_bool.c"

Variable values:

mode = E_VNx16BImode
eltmode = E_QImode
icode = CODE_FOR_nothing

Backtrace:

#0  fancy_abort (file=file@entry=0x255f858
"/work/results_sc/src-patched/gcc/expr.cc", line=line@entry=8085,
function=function@entry=0x25602a0 "store_constructor")
        at /work/results_sc/src-patched/gcc/diagnostics/context.h:558
#1  0x0000000000be9c98 in store_constructor
(exp=exp@entry=0xfffff6577e58, target=target@entry=0xfffff605d408,
cleared=cleared@entry=0, size=..., reverse=reverse@entry=false)
        at /work/results_sc/src-patched/gcc/expr.cc:8085
#2  0x0000000000bec00c in expand_constructor
(exp=exp@entry=0xfffff6577e58, target=0xfffff605d408,
target@entry=0xfffff605d3f0, modifier=modifier@entry=EXPAND_NORMAL,
avoid_temp_mem=avoid_temp_mem@entry=false)
        at /work/results_sc/src-patched/gcc/poly-int.h:470
#3  0x0000000000bd21b8 in expand_expr_real_1 (exp=0xfffff6577e58,
target=<optimized out>, tmode=E_VNx16BImode, modifier=EXPAND_NORMAL,
alt_rtl=0x0, inner_reference_p=<optimized out>)
        at /work/results_sc/src-patched/gcc/expr.cc:11943
#4  0x0000000000bd845c in expand_expr_real_gassign
(g=g@entry=0xfffff65a88f0, target=target@entry=0xfffff605d3f0,
tmode=tmode@entry=E_VNx16BImode, modifier=modifier@entry=EXPAND_NORMAL,
        alt_rtl=alt_rtl@entry=0x0,
inner_reference_p=inner_reference_p@entry=false) at
/work/results_sc/src-patched/gcc/gimple.h:2728
#5  0x0000000000bd5604 in expand_expr_real_1
(exp=exp@entry=0xfffff65bf318, target=<optimized out>,
tmode=E_VNx16BImode, modifier=EXPAND_NORMAL, alt_rtl=0x0,
inner_reference_p=<optimized out>)
        at /work/results_sc/src-patched/gcc/expr.cc:11552
#6  0x0000000000bd8298 in expand_expr_real
(exp=exp@entry=0xfffff65bf318, target=<optimized out>, tmode=<optimized
out>, modifier=modifier@entry=EXPAND_NORMAL, alt_rtl=alt_rtl@entry=0x0,
        inner_reference_p=inner_reference_p@entry=false) at
/work/results_sc/src-patched/gcc/expr.cc:9628
#7  0x0000000000a86328 in expand_expr (exp=0xfffff65bf318,
target=<optimized out>, mode=<optimized out>, modifier=EXPAND_NORMAL) at
/work/results_sc/src-patched/gcc/expr.h:323
#8  expand_return (retval=0xfffff74c9a28) at
/work/results_sc/src-patched/gcc/cfgexpand.cc:4172
#9  expand_gimple_stmt_1 (stmt=0xfffff65a89a0) at
/work/results_sc/src-patched/gcc/cfgexpand.cc:4281
#10 expand_gimple_stmt (stmt=stmt@entry=0xfffff65a89a0) at
/work/results_sc/src-patched/gcc/cfgexpand.cc:4390
#11 0x0000000000a88938 in expand_gimple_basic_block (bb=<optimized out>,
asan_epilog_seq=asan_epilog_seq@entry=0x0) at
/work/results_sc/src-patched/gcc/cfgexpand.cc:6507
#12 0x0000000000a8a568 in (anonymous namespace)::pass_expand::execute
(this=<optimized out>, fun=0xfffff6591000) at
/work/results_sc/src-patched/gcc/cfgexpand.cc:7254
#13 0x0000000000f7ce98 in execute_one_pass (pass=pass@entry=0x3297c70)
at /work/results_sc/src-patched/gcc/passes.cc:2646
#14 0x0000000000f7d8d4 in execute_pass_list_1 (pass=0x3297c70) at
/work/results_sc/src-patched/gcc/passes.cc:2757
#15 0x0000000000f7d948 in execute_pass_list (fn=<optimized out>,
pass=<optimized out>) at /work/results_sc/src-patched/gcc/passes.cc:2768
#16 0x0000000000aced24 in cgraph_node::expand (this=0xfffff6588aa0) at
/work/results_sc/src-patched/gcc/context.h:49
#17 cgraph_node::expand (this=0xfffff6588aa0) at
/work/results_sc/src-patched/gcc/cgraphunit.cc:1827
#18 0x0000000000ad0dc0 in expand_all_functions () at
/work/results_sc/src-patched/gcc/cgraphunit.cc:2057
#19 symbol_table::compile (this=this@entry=0xfffff7406000) at
/work/results_sc/src-patched/gcc/cgraphunit.cc:2435
#20 0x0000000000ad3a48 in symbol_table::compile (this=0xfffff7406000) at
/work/results_sc/src-patched/gcc/cgraphunit.cc:2348
#21 symbol_table::finalize_compilation_unit (this=0xfffff7406000) at
/work/results_sc/src-patched/gcc/cgraphunit.cc:2626
#22 0x00000000010bf96c in compile_file () at
/work/results_sc/src-patched/gcc/toplev.cc:482
#23 0x000000000086baac in do_compile () at
/work/results_sc/src-patched/gcc/toplev.cc:2228
#24 toplev::main (this=this@entry=0xfffffffff078, argc=<optimized out>,
argc@entry=35, argv=<optimized out>, argv@entry=0xfffffffff1f8) at
/work/results_sc/src-patched/gcc/toplev.cc:2392
#25 0x000000000086d094 in main (argc=35, argv=0xfffffffff1f8) at
/work/results_sc/src-patched/gcc/main.cc:39
(gdb) up
#1  0x0000000000be9c98 in store_constructor
(exp=exp@entry=0xfffff6577e58, target=target@entry=0xfffff605d408,
cleared=cleared@entry=0, size=..., reverse=reverse@entry=false)
        at /work/results_sc/src-patched/gcc/expr.cc:8085
8085                    gcc_assert (icode != CODE_FOR_nothing);

Context:

8080                  }
8081                else
8082                  {
8083                    /* Handle variable-length vector types.  */
8084                    icode = convert_optab_handler (vec_init_optab, mode, eltmode);
8085                    gcc_assert (icode != CODE_FOR_nothing);
8086                    const_n_elts = constant_lower_bound (n_elts);
8087                  }
8088
8089                if (const_n_elts && icode != CODE_FOR_nothing)

As far as I can tell, store_constructor is designed to handle the case
where vector is null. For example, it calls the store_constructor_field
function per element instead of assigning values to RTVEC_ELT (vector,
eltpos).

But right here it does

                 vector = rtvec_alloc (const_n_elts);


That statement is in a secondary block that is only entered if
const_n_elts && icode != CODE_FOR_nothing.

In the specific failing test case, the only effect of my proposed change
is to set const_n_elts to a non-zero value; 'vector' keeps its initial
value because icode is still equal to CODE_FOR_nothing:

rtvec vector = NULL;

const_n_elts is not used outside the secondary block that is not
entered, therefore the non-zero value of const_n_elts is irrelevant to
the failing test case.

so not sure what you mean with 'vector is null'.  Or is
constant_lower_bound == 0?

constant_lower_bound (n_elts) is 16, as expected for mode = E_VNx16BImode.

The only case that might be relevant is a {} CTOR, one without elements.  But

CONSTRUCTOR_NELTS (exp) is also 16, so I think that the {} case does not
apply here.

we should have special-paths for that - are they possibly not taken for VLA
vector modes?

I'll note that in general vector CTORs for BImode vectors that are not constant
are a red herring - totally inefficient - and we should avoid those at all cost.

As far as I can tell, the vector is constant, and the generated code
looks acceptable to me, e.g.

          ptrue   p0.b, vl3
          ptrue   p1.b, all

from this macro body (when n == 3):

       svbool_t pg = svptrue_b8 (); \
       svbool_t data1 = svptrue_pat_b8 (SV_VL ## n); \

But yes, the requirement outlined is that the target needs to support
vec_init with E_VNx16BImode, E_QImode.  If it does not the vectorizer
should not have created such CTOR.

I don't think the vectoriser did create it, at least not in this case.
The cops_bool.c.200t.slp2 file is full of "***** Analysis failed
with..." messages. The test does not seem to contain any loops, either.

My understanding is that the vector constructor is created by a call to
a function such as svptrue_b8 (), e.g.

       svbool_t all_true = svptrue_b8 ();

becomes { -1, ... } in

     cmp_35 = sveor_b_z ({ -1, ... }, init1_30, res_init1_34);

which is the GIMPLE version of

     svbool_t cmp = sveor_b_z (all_true, init1, res_init1);

But {-1, ...} is a VECTOR_CST, not a CONSTRUCTOR.


You are right. I was looking at the wrong part of the dump file. I think
this is the right part:

__attribute__((noipa, noinline, noclone, no_icf))
svbool_t func_init4 ()
{
    svbool_t temp;
    int _1;
    <signed-boolean:1> _2;
    int _3;
    <signed-boolean:1> _4;

    <bb 2> [local count: 1073741824]:
    _1 = t ();
    _2 = _1 != 0;
    _3 = f ();
    _4 = _3 != 0;
    temp_8 = {-1, _2, 0, -1, 0, _4, 0, 0, 0, -1, 0, -1, 0, -1, 0, -1};

So temp_8 is VNx16BI, this is exactly a CTOR that would previously be
"invalid" and we've now defined semantics to zero-extend, requiring
vec_initvnx16bibi to be implemented.

The fallback we run when this isn't implemented does not consistently
clear. I think that, for VLA modes, we would need to make sure

  a) target is a REG_P
  b) we clear the reg

so that store_constructor_field works here.  I know I fixed several
correctness issues for AVX512 masks constructed this way (code
generation there is equally ugly).

OK, I think I'm beginning to understand the issues.

store_constructor_field is not sufficient unless all fields are stored, which is impossible for a variable-length vector type.

Normally, clearing is handled by one of two execution paths that are predicated on !vector.

The first path depends on a 'need_to_clear' value calculated immediately beforehand, which is based on the incidence of zero elements and presence of tail padding.

'need_to_clear' is true in the case of "aarch64-sve-acle.exp=cops_bool.c" because the total number of elements to be stored is known to be less than the number of subparts in the TARGET type (maybe_lt (count, n_elts) == maybe_lt ({16, 0}, {16, 16})). I have a feeling that 'need_to_clear' is always true for variable-length vector types: for the total number of elements in a CONSTRUCTOR to have a non-zero second coefficient, the number of elements per value ('n_elts_here') would also need to be polynomial.

My ad hoc test cases trying to reproduce that condition result in errors such as:
error: fields cannot have SVE type ‘svint8_t’
error: array elements cannot have SVE type ‘svint8_t’

However, there may be other mechanisms for creating VLA-type elements that I am unaware of.

The other threshold of maybe_gt (4 * zero_count, 3 * count) == maybe_gt (4 * {8, 0}, 3 * {16, 0}) == maybe_gt ({32, 0}, {48, 0}) is not met for "cops_bool.c" but could be met for other variable-length vector types.
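
To make the two comparisons above concrete, here is a minimal sketch of how I read them. It assumes a simplified two-coefficient stand-in for poly_int64 (the struct Poly2 and its fields are my own invention for illustration, not the real poly-int.h templates), where a value {c0, c1} means c0 + c1 * x for some runtime x >= 0:

```cpp
#include <cassert>

// Hypothetical, simplified model of a two-coefficient polynomial integer.
// The represented value is c0 + c1 * x, where x >= 0 is only known at run
// time (e.g. the number of extra vector granules on SVE).
struct Poly2 {
  long c0;
  long c1;
};

// Scale both coefficients by a compile-time constant.
constexpr Poly2 operator* (long k, Poly2 p) { return {k * p.c0, k * p.c1}; }

// True if a < b for at least one x >= 0: either it already holds at x == 0,
// or b grows faster than a and so overtakes it for large enough x.
constexpr bool maybe_lt (Poly2 a, Poly2 b)
{
  return a.c0 < b.c0 || a.c1 < b.c1;
}

constexpr bool maybe_gt (Poly2 a, Poly2 b) { return maybe_lt (b, a); }

// Smallest value the polynomial can take, valid only when both
// coefficients are non-negative (checked here with a plain assert).
inline long constant_lower_bound (Poly2 a)
{
  assert (a.c0 >= 0 && a.c1 >= 0);
  return a.c0;
}

// count = {16, 0} versus n_elts = {16, 16}: the tail may be unfilled.
static_assert (maybe_lt (Poly2{16, 0}, Poly2{16, 16}),
	       "fewer elements stored than the vector may hold");

// 4 * zero_count = {32, 0} versus 3 * count = {48, 0}: threshold not met.
static_assert (!maybe_gt (4 * Poly2{8, 0}, 3 * Poly2{16, 0}),
	       "zero elements are not the majority here");
```

Under this model the first comparison is true purely because of the second coefficient of n_elts, which is why I suspect it holds for every variable-length vector type whose CONSTRUCTOR has a constant element count.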

The first path also requires the passed-in SIZE ("the number of bytes of TARGET we are allowed to modify") to be known, which is not the case for "aarch64-sve-acle.exp=cops_bool.c" (even though the size is not actually required when emitting MOVE of 0 into a register TARGET).

The second path relies on the number of subparts in the vector type maybe being greater than one (i.e. don't clear one-element vectors because it's redundant) and the TARGET being a register. This is the path taken in "aarch64-sve-acle.exp=cops_bool.c". Its success has nothing to do with 'need_to_clear' or 'size' and everything to do with TARGET being a register.

So, there seem to be two alternative preconditions that guarantee zero-initialisation of variable-length vector CONSTRUCTORs: either SIZE is known (without which clear_storage cannot be used for a non-register TARGET), or TARGET is a register.
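
As a rough summary of the two clearing paths described above, the decision could be modelled like this. It is only a sketch of my understanding, not the code in expr.cc: the struct ClearInputs, the enum ClearPath and both functions are hypothetical names, and each boolean stands in for a condition that store_constructor computes itself.

```cpp
#include <cassert>

// Hypothetical summary of the conditions discussed above.
struct ClearInputs {
  bool uses_vec_init;    // 'vector' is non-null, so neither path applies
  bool need_to_clear;    // many zero elements, or fewer elements than subparts
  bool size_known;       // the SIZE argument is known
  bool target_is_reg;    // TARGET is a register
  bool maybe_multi_elt;  // the vector may have more than one subpart
};

enum class ClearPath { none, first, second };

// Which of the two clearing paths, if any, would zero TARGET before the
// per-element stores.  Both paths are predicated on !vector.
constexpr ClearPath choose_clear_path (ClearInputs in)
{
  if (in.uses_vec_init)
    return ClearPath::none;
  if (in.need_to_clear && in.size_known)
    return ClearPath::first;
  if (in.target_is_reg && in.maybe_multi_elt)
    return ClearPath::second;
  return ClearPath::none;
}

// One hypothetical way to flag the problematic case: the per-element
// fallback is used, but nothing has cleared TARGET beforehand.
inline void assert_can_zero_init (ClearInputs in)
{
  assert (in.uses_vec_init || choose_clear_path (in) != ClearPath::none);
}

// cops_bool.c: need_to_clear is true and SIZE is unknown, but TARGET is a
// register, so the second path does the clearing.
static_assert (choose_clear_path (ClearInputs{false, true, false, true, true})
	       == ClearPath::second, "register TARGET is cleared");

// Non-register TARGET with unknown SIZE: neither path clears anything.
static_assert (choose_clear_path (ClearInputs{false, true, false, false, true})
	       == ClearPath::none, "no guarantee of zero-initialisation");
```

The last case is the one that matters for variable-length types, because the elements beyond the constant lower bound are never stored individually.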

In many cases, SIZE cannot be known for a variable-length type (despite having type poly_int64), because tree_fits_shwi_p returns false in the int_expr_size function that is usually called to get the value of the SIZE argument. The only exceptions where a variable SIZE might be passed to store_constructor appear to be calls from store_constructor_field (for CONSTRUCTOR) and recursion with ARRAY_TYPE.

So, I agree that store_constructor ought to detect such cases, where zero-initialisation is required but impossible.

--
Christopher Bazley
Staff Software Engineer, GNU Tools Team.
Arm Ltd, 110 Fulbourn Road, Cambridge, CB1 9NJ, UK.
http://www.arm.com/
