On Mon, May 11, 2026 at 9:12 PM Ionuț Nicula <[email protected]> wrote:
>
> On Monday, May 11th, 2026 at 12:55, Richard Biener
> <[email protected]> wrote:
>
> > The reason is that i might overflow and thus
> > the evolution of 'i' for the haystack[i] access isn't affine.
> > Use unsigned long and it works again.
> >
> > This is a general issue with unsigned IVs smaller than
> > pointer size.
>
> Sorry, I somehow messed up my tests and misunderstood what was going on.
>
> I was more interested in this example. This doesn't get vectorized (`-O3
> -march=znver5 -ffreestanding`):
>
>     unsigned long foo(uint8_t *data) {
>         unsigned long i = 0;
>         while (1) {
>             if (data[i] == 0)
>                 return i;
>             i++;
>         }
>     }

This gets turned into

    if (*data != 0) return strlen(data + 1) + 1;

With -fno-tree-loop-distribute-patterns it would get vectorized, but
vectorization is deemed not profitable.  We vectorize it with
-fno-vect-cost-model.
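
For reference, a minimal sketch of what that rewrite amounts to at the
source level. The names foo_loop and foo_strlen are made up for
illustration, the casts and the explicit "return 0" for the empty case
are filled in by hand, and this is not the code GCC actually emits:

```c
#include <stdint.h>
#include <string.h>

/* The original loop from the thread: scan for the first zero byte. */
unsigned long foo_loop(const uint8_t *data) {
    unsigned long i = 0;
    while (1) {
        if (data[i] == 0)
            return i;
        i++;
    }
}

/* Hand-written equivalent of the pattern replacement described above
   (hypothetical helper, for illustration only). */
unsigned long foo_strlen(const uint8_t *data) {
    if (*data != 0)
        /* First byte is nonzero: count it, then measure the rest. */
        return strlen((const char *)(data + 1)) + 1;
    /* First byte is zero: the loop would have returned i == 0. */
    return 0;
}
```

Both functions should return the same index for any zero-terminated
buffer, which is why the loop itself is gone before the vectorizer
gets to see it.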

>
> But this does:
>
>     long foo(uint8_t *data) {
>         long i = 0;
>         while (1) {
>             if (data[i] == 0)
>                 return i;
>             i++;
>         }
>     }
>
> `unsigned long` index coupled with `uint16_t *` data does get vectorized,
> though.
>
> Is it because technically `uint8_t[size_t]` can span the whole address
> space and therefore the `size_t` index can validly wrap around, whereas
> `uint16_t[size_t]` is bounded by the address space so you can never
> reach `uint16_t[2 + (SIZE_MAX/2)]`?

It's a cost model issue.  With 'long' we get an estimate of the number
of iterations; with 'unsigned long' we don't.
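
A minimal sketch for comparing the two cases side by side. The names
foo_unsigned and foo_signed are made up for illustration. The only
difference is the index type: in C, overflow of a signed 'long' is
undefined, while 'unsigned long' arithmetic wraps. Whether that is
exactly what produces the iteration estimate is a guess and is not
stated above.

```c
#include <stdint.h>

/* Index is 'unsigned long': arithmetic on i is defined to wrap. */
unsigned long foo_unsigned(const uint8_t *data) {
    unsigned long i = 0;
    while (1) {
        if (data[i] == 0)
            return i;
        i++;
    }
}

/* Index is 'long': overflow of i would be undefined behaviour. */
long foo_signed(const uint8_t *data) {
    long i = 0;
    while (1) {
        if (data[i] == 0)
            return i;
        i++;
    }
}
```

Compiling this with the flags from the thread plus
-fno-tree-loop-distribute-patterns and -fopt-info-vec-missed should
show what the vectorizer reports for each function. The exact messages
depend on the GCC version.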

Richard.

>
