On 7/11/2023 4:50 AM, Petr Pavlu wrote:
> On  6. Jul 23 20:39, Wu, Fei wrote:
>> On 5/29/2023 11:29 AM, Wu, Fei wrote:
>>> On 5/28/2023 1:06 AM, Petr Pavlu wrote:
>>>> On 21. Apr 23 17:25, Jojo R wrote:
>>>>> We are considering adding the RVV/Vector [1] feature to Valgrind, and
>>>>> there are some challenges.
>>>>> RVV, like ARM's SVE [2], has a scalable/VLA programming model, which
>>>>> means code is agnostic to the vector length.
>>>>> ARM's SVE is not supported in Valgrind :(
>>>>>
>>>>> There are three major issues in implementing the RVV instruction set
>>>>> in Valgrind:
>>>>>
>>>>> 1. Scalable vector register width VLENB
>>>>> 2. Runtime changing property of LMUL and SEW
>>>>> 3. Lack of proper VEX IR to represent all vector operations
>>>>>
>>>>> We propose workable methods to solve 1 and 2. As for 3, we explore
>>>>> several possible, though perhaps imperfect, approaches to handle
>>>>> different cases.
>>>>>
>> I did a very basic prototype of a vlen Vector-IR, targeting the RISC-V
>> Vector extension (RVV) in particular:
>>
>> * Define new iops such as Iop_VAdd8/16/32/64. Unlike the existing SIMD
>> iops such as Iop_Add8x32, they do not specify the number of elements.
>>
>> * Define a new IR type Ity_VLen alongside existing types such as Ity_I64
>> and Ity_V256.
>>
>> * Define a new class HRcVecVLen in HRegClass for vlen vector registers.
>>
>> The real length is embedded in both the IROp and the IRType for vlen
>> ops/types. It is decided at runtime and is already known when handling
>> an insn such as vadd. This gives more flexibility, e.g. the backend can
>> issue an extra vsetvl if necessary.
>>
>> With the above, an RVV instruction in the guest can be passed from the
>> frontend through memcheck to the backend, and the final RVV insn is
>> generated during host isel. A very basic testcase has been tested.
>>
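
To illustrate why the element count can be left out of an iop name and fixed only at translation time, here is a minimal sketch of the VLMAX computation from the RVV specification (VLMAX = LMUL * VLEN / SEW). The function name vlmax and the signed log2 representation of LMUL are assumptions made for this example; they are not taken from the prototype, and the extra validity rules for fractional LMUL are ignored.

```c
#include <assert.h>

/* Hypothetical helper, not part of Valgrind: number of elements covered by
   one vector operand for a given VLEN (bits), SEW (bits) and LMUL.
   LMUL is passed as a signed log2 so that fractional values fit:
   -3 => 1/8, -2 => 1/4, -1 => 1/2, 0 => 1, 1 => 2, 2 => 4, 3 => 8. */
static unsigned vlmax(unsigned vlen_bits, unsigned sew_bits, int lmul_log2)
{
    assert(sew_bits == 8 || sew_bits == 16 || sew_bits == 32 || sew_bits == 64);
    assert(lmul_log2 >= -3 && lmul_log2 <= 3);
    assert(vlen_bits >= sew_bits && vlen_bits % sew_bits == 0);

    unsigned per_reg = vlen_bits / sew_bits;   /* elements in one register */
    return lmul_log2 >= 0 ? per_reg << lmul_log2
                          : per_reg >> -lmul_log2;
}
```

With VLEN=128, an add on 32-bit elements covers 4 elements at LMUL=1 and 8 at LMUL=2, so the same width-only op would describe both cases.
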
>> Now for the complications:
>>
>> 1. RVV has the concept of LMUL, which groups multiple vector registers
>> (or uses only part of one), e.g. when LMUL==2, v2 actually means v2+v3.
>> This complicates register allocation.
>>
>> 2. RVV uses an "implicit" v0 for the mask. If host isel wants to use
>> RVV insns, the mask content must be loaded into exactly "v0" and no
>> other register. This implicitness in the ISA requires more explicitness
>> in the Valgrind implementation.
>>
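
The register grouping in item 1 can be sketched as follows. This is only an illustration for integer LMUL values; the name lmul_group is made up for the example. It relies on the RVV rule that the base register number of a group must be a multiple of LMUL.

```c
#include <assert.h>

/* Hypothetical helper: list the architectural vector registers covered by
   the group that starts at `base` for an integer LMUL (1, 2, 4 or 8).
   Returns the number of registers written to regs[], or 0 if `base` is not
   aligned to LMUL (a reserved encoding in RVV). */
static unsigned lmul_group(unsigned base, unsigned lmul, unsigned regs[8])
{
    assert(lmul == 1 || lmul == 2 || lmul == 4 || lmul == 8);
    assert(base < 32);

    if (base % lmul != 0)
        return 0;                      /* misaligned group base */
    for (unsigned i = 0; i < lmul; i++)
        regs[i] = base + i;            /* e.g. v2 with LMUL=2 => v2, v3 */
    return lmul;
}
```

A register allocator that supports LMUL directly would have to hand out such aligned runs of registers as a unit, which is the complication described above.
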
>> For #1 (LMUL), a new register allocation algorithm could be added, and
>> it would be great if someone is willing to try it; I'm not sure how
>> much effort it would take. The other way is to split the insn into
>> multiple ops that each take only one vector register. Taking vadd as an
>> example, one vadd with LMUL=2 becomes two vadds with LMUL=1. This still
>> works for the widening insns, so most of the arithmetic insns can be
>> covered this way. The exception could be the register gather insn
>> vrgather, for which we can resort to other means, e.g. scalar code or a
>> helper.
>>
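
The splitting approach can be modelled in plain C as below. This is a toy model and not Valgrind code: the register file layout, the fixed 128-bit register width, and the names VRegFile, vadd8_m1 and vadd8_split are all assumptions for illustration, and the model assumes vl equals VLMAX so no tail handling is needed.

```c
#include <assert.h>
#include <stdint.h>

enum { VLENB = 16, NREGS = 32 };   /* toy model: 128-bit registers */

typedef struct { uint8_t v[NREGS][VLENB]; } VRegFile;

/* One LMUL=1 add on 8-bit elements: vd = vs2 + vs1 (wrapping). */
static void vadd8_m1(VRegFile *rf, unsigned vd, unsigned vs2, unsigned vs1)
{
    for (unsigned i = 0; i < VLENB; i++)
        rf->v[vd][i] = (uint8_t)(rf->v[vs2][i] + rf->v[vs1][i]);
}

/* A grouped add expressed as `lmul` independent single-register adds,
   so each op only ever names one vector register per operand. */
static void vadd8_split(VRegFile *rf, unsigned vd, unsigned vs2,
                        unsigned vs1, unsigned lmul)
{
    assert(lmul == 1 || lmul == 2 || lmul == 4 || lmul == 8);
    assert(vd % lmul == 0 && vs2 % lmul == 0 && vs1 % lmul == 0);
    assert(vd + lmul <= NREGS && vs2 + lmul <= NREGS && vs1 + lmul <= NREGS);

    for (unsigned k = 0; k < lmul; k++)
        vadd8_m1(rf, vd + k, vs2 + k, vs1 + k);
}
```

Element-wise ops split cleanly like this because element i of the result depends only on element i of the sources. vrgather does not have that property, since any result element may read from any register in the source group.
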
>> For #2 (v0 mask), one way is to handle the mask at the very beginning,
>> in guest_riscv64_toIR.c, similar to what the AVX port does:
>>
>> a) Read the whole dest register without the mask
>> b) Generate the unmasked result by running the op without the mask
>> c) Apply the mask to a) and b) to generate the final dest
>>
>> By doing this, a masked insn is converted to unmasked ones. More insns
>> are generated, but the performance should be acceptable. There are
>> still exceptions, e.g. vadc (Add-with-Carry), where v0 is used not as a
>> mask but as the carry. As mentioned above, it's okay to handle a few
>> insns in other ways. Eventually, we can pass the v0 mask down to the
>> backend if that proves to be a better solution.
>>
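
Steps a) to c) can be sketched on a small fixed-size model as follows. The function name masked_add8 and the 8-element vector are assumptions for the example, and it assumes the mask-undisturbed policy, i.e. masked-off elements keep the old dest value.

```c
#include <assert.h>
#include <stdint.h>

enum { NELEM = 8 };   /* toy model: 8 elements, so the mask fits in a byte */

/* Masked 8-bit add built from unmasked pieces. Bit i of `mask` set means
   element i is active and takes the new result. */
static void masked_add8(uint8_t dst[NELEM], const uint8_t a[NELEM],
                        const uint8_t b[NELEM], uint8_t mask)
{
    uint8_t old[NELEM], full[NELEM];

    for (unsigned i = 0; i < NELEM; i++)      /* a) read the whole dest */
        old[i] = dst[i];
    for (unsigned i = 0; i < NELEM; i++)      /* b) unmasked result */
        full[i] = (uint8_t)(a[i] + b[i]);
    for (unsigned i = 0; i < NELEM; i++)      /* c) select per element */
        dst[i] = ((mask >> i) & 1) ? full[i] : old[i];
}
```

Only step c) looks at the mask, so the op in step b) never needs v0 as an implicit operand.
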
>> This approach will introduce a bunch of new vlen Vector IRs, especially
>> arithmetic IRs such as vadd. My goal is a good solution that reaches
>> usable status in reasonable time, yet can still evolve and is generic
>> enough for other vector ISAs. Any comments?
> 
> Could you please share a repository with your changes or send them to me
> as patches? I have a few questions but I think it might be easier for me
> first to see the actual code.
> 
Please see the attachment. It's a very raw version, just to verify the
idea. The mask is not added yet but is expected to be done as described
above. It's based on commit 71272b2529 on your branch; patch 0013 is the
key one.

btw, I will set up a repository, but it will take a few days to get
through the internal process.

Thanks,
Fei.

> Thanks,
> Petr

Attachment: rvv.tar.bz2
Description: Binary data

