On 10/3/25 12:01, Jeff Law wrote:
> On 10/3/25 12:13 PM, Robin Dapp wrote:
>>> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118945-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118945-2.c
>>> new file mode 100644
>>> index 000000000..956574067
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118945-2.c
>>> @@ -0,0 +1,28 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-march=rva23u64 -mtune=generic-ooo -Ofast -S" } */
>>> +
>>> +void vmult(
>>> +    double* dst,
>>> +    const double* src,
>>> +    const unsigned int* rowstart,
>>> +    const unsigned int* colnums,
>>> +    const double* val,
>>> +    const unsigned int n_rows
>>> +) {
>>> +    const double* val_ptr = &val[rowstart[0]];
>>> +    const unsigned int* colnum_ptr = &colnums[rowstart[0]];
>>> +    double* dst_ptr = dst;
>>> +
>>> +    for (unsigned int row = 0; row < n_rows; ++row) {
>>> +        double s = 0.;
>>> +        const double* const val_end_of_row = &val[rowstart[row + 1]];
>>> +        while (val_ptr != val_end_of_row) {
>>> +            s += *val_ptr++ * src[*colnum_ptr++];
>>> +        }
>>> +        *dst_ptr++ = s;
>>> +    }
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-z0-9]+,\s*[a-z0-9]+,\s*e[0-9]+,\s*m[f0-9]+,\s*ta,\s*ma} 4 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-z0-9]+,\s*[a-z0-9]+,\s*e[0-9]+,\s*m[f0-9]+,\s*tu,\s*ma} 1 } } */
>>> +
>> I thought we could scan the vsetvl dump directly for a demand merge _not_
>> happening.  Is that not possible?
>>
>> In the end not a big deal, the checks are better than before.  IMHO this can
>> move forward unless Jeff or somebody else thinks differently.
> I'm inclined to let it move forward.  CI doesn't seem to be running on 
> this patch for reasons unknown.  I'll throw it into my tester for a 
> final verification in a bit and hopefully push to the trunk later today.

I pulled this into our internal CI and have some statistical data
(build-time vsetvls) for now.
I also plan to run this on our silicon to see if it makes any difference, since
this request originally came from our perf team.
AFAIR the author said that parest improved by 30% on their OoO uarch?

Cheers,
-Vineet
