Hi Monk,

could you detail the issue/patch a bit?  Are we generally violating
LMUL >= SEW/ELEN with zve32f (zve32x as well then)?  And what's
"implicit zve32f"?  In the test case it's specified explicitly.

> According to Section 3.4.2, Vector Register Grouping, in the RISC-V
> Vector Specification, the rule for LMUL is LMUL >= SEW/ELEN

> +  /* Follow rule LMUL >= SEW / ELEN.  */
> +  int elen = TARGET_VECTOR_ELEN_64 ? 1 : 2;
>    int factor = TARGET_MIN_VLEN / size;
>    if (inner_size == 8)
> -    factor = MIN (factor, 8);
> +    factor = MIN (factor, 8 / elen);
>    else if (inner_size == 16)
> -    factor = MIN (factor, 4);
> +    factor = MIN (factor, 4 / elen);
>    else if (inner_size == 32)
> -    factor = MIN (factor, 2);
> +    factor = MIN (factor, 2 / elen);
>    else if (inner_size == 64)
>      factor = MIN (factor, 1);
>    else

As far as I understand it the minimum LMUL rule applies to the minimum
SEW = 8 and the ELEN of the implementation.  An LMUL = 1/8 is invalid
for a VLEN = 32 because that would mean we'd only have 4 bits per
element.

The spec says:

  For standard vector extensions with ELEN=32, fractional LMULs of 1/2
  and 1/4 must be supported.  For standard vector extensions with
  ELEN=64, fractional LMULs of 1/2, 1/4, and 1/8 must be supported.

So the problem is we assume a "sane" implementation that would implement
LMUL = 1/8 whenever VLEN > 32 but that's too optimistic?  Then the
problem would be that we're using TARGET_MIN_VLEN rather than ELEN here
and there are implementations that could technically support LMUL = 1/8
but don't?

This sounds a bit like vector unaligned access all over again...

So we'd want a "sane" uarch flag that keeps the current MIN_VLEN
behavior but would need to make LMUL = 1/4 the minimum by default.
This only applies to LMUL = 1/8, though, and not all the other cases.

> +/* { dg-options "-march=rv32imafc_zve32f_zvl128b -mabi=ilp32 -O2" } */

> +/* { dg-final { scan-assembler {vsetivli\s+zero,\s*2,\s*e32,\s*m1,\s*t[au],\s*m[au]} } } */
> +/* { dg-final { scan-assembler {vsetivli\s+zero,\s*4,\s*e32,\s*m1,\s*t[au],\s*m[au]} } } */

From what I can tell the test case uses a V2SImode, so 64 bit.  When
VLEN=128 (zvl128b) isn't the correct LMUL mf2 rather than m1?
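
To make the arithmetic behind these questions concrete, here is a rough
standalone sketch.  It is not GCC code: the names lmul_for_mode,
min_required_lmul and lmul_satisfies_rule are hypothetical helpers, and
the last one assumes the quoted rule LMUL >= SEW/ELEN is read literally
for each SEW, which is exactly the interpretation under discussion.

```c
#include <assert.h>

/* LMUL as a fraction num/den, e.g. {1, 2} is mf2 and {2, 1} is m2.  */
struct lmul
{
  unsigned num;
  unsigned den;
};

static unsigned
gcd_u (unsigned a, unsigned b)
{
  while (b != 0)
    {
      unsigned t = a % b;
      a = b;
      b = t;
    }
  return a;
}

/* LMUL that exactly covers MODE_BITS bits of data when one vector
   register holds VLEN bits, i.e. mode_bits / vlen in lowest terms.
   Assumes both are powers of two, as for the modes discussed here.  */
struct lmul
lmul_for_mode (unsigned mode_bits, unsigned vlen)
{
  assert (mode_bits > 0 && vlen > 0);
  unsigned g = gcd_u (mode_bits, vlen);
  struct lmul r = { mode_bits / g, vlen / g };
  return r;
}

/* Smallest fractional LMUL the quoted spec text requires for a given
   ELEN: SEW_min / ELEN with SEW_min = 8.  */
struct lmul
min_required_lmul (unsigned elen)
{
  assert (elen == 32 || elen == 64);
  struct lmul r = { 1, elen / 8 };
  return r;
}

/* Nonzero if L >= SEW / ELEN, assuming the rule is applied per SEW.
   Cross-multiplied to stay in integer arithmetic.  */
int
lmul_satisfies_rule (struct lmul l, unsigned sew, unsigned elen)
{
  return l.num * elen >= sew * l.den;
}
```

With these helpers, lmul_for_mode (64, 128) gives 1/2 (mf2) for a 64-bit
V2SImode and lmul_for_mode (128, 128) gives 1 (m1) for a 128-bit mode.
Under the per-SEW reading, mf2 with SEW = 32 passes the check for
ELEN = 64 but not for ELEN = 32.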
In particular, how would the same LMUL for AVL=2 and AVL=4 and the same
data type be correct?  Maybe it would help to add a run test?

A PR might be useful as well to track things as we're late in the
release cycle.

-- 
Regards
 Robin