Re: [I] [Bug] TVM/LLVM sets RISC-V VLEN to 128 bits instead of 256 on Banana Pi K1 [tvm]

via GitHub Mon, 10 Feb 2025 10:00:41 -0800


cbalint13 commented on issue #17625:
URL: https://github.com/apache/tvm/issues/17625#issuecomment-2648825520


   Hi @JieGH,
   
   > Hi @cbalint13, Thanks for the advice. I now have a method for choosing VLEN, which is using an additional flag: `tvm.target.Target("llvm -jit=orcjit -mtriple=riscv64 -mcpu=spacemit-x60 -mattr=+64bit,+m,+a,+f,+d,+c,+zfh,+v,+zvl256b")`
   > 
   > Specifying the zvl256b flag enables 'Zvl' (Minimum Vector Length) 256. This does have an impact on execution performance. However,
   
    Yes, another way to tell LLVM the VLEN is via the canonical flags, but we 
also need TVM itself to be aware of this.
   
   > 1. TVM still warns that 128 bits is set as the default vector length, even though the zvl flag is enabled and has an impact on performance. I have not yet run the latest update you posted in [Handle vector width (VLEN) for RISCV arches #17631](https://github.com/apache/tvm/pull/17631)
   
   You have an older LLVM that does not know about ```-mcpu=spacemit-x60```, so it falls back to ```generic```.
   * You can check the LLVM version from the TVM side:
      ```
      $ python3 -c "import tvm; print(tvm.target.codegen.llvm_version_major())"
      20
      ```
   * You can also list which ```-mcpu``` options your installed LLVM offers for riscv64:
      ```
      $ python -c "import tvm; print(tvm.target.codegen.llvm_get_cpu_archlist(tvm.target.Target('llvm -mtriple=riscv64--')))"
      ["generic", "generic-rv32", "generic-rv64", "mips-p8700", "rocket",
       "rocket-rv32", "rocket-rv64", "rp2350-hazard3", "sifive-7-series", 
       "sifive-e20", "sifive-e21", "sifive-e24", "sifive-e31", "sifive-e34", 
       "sifive-e76", "sifive-p450", "sifive-p470", "sifive-p670", "sifive-s21", 
       "sifive-s51", "sifive-s54", "sifive-s76", "sifive-u54", "sifive-u74", 
       "sifive-x280", "spacemit-x60", "syntacore-scr1-base", "syntacore-scr1-max", 
       "syntacore-scr3-rv32", "syntacore-scr3-rv64", "syntacore-scr4-rv32", 
       "syntacore-scr4-rv64", "syntacore-scr5-rv32", "syntacore-scr5-rv64",
       "syntacore-scr7", "tt-ascalon-d8", "veyron-v1", "xiangshan-nanhu"]
      ```
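
The two checks above can be combined into a small guard. This is only a sketch: `pick_mcpu` is a hypothetical helper (not part of TVM), and the short stand-in list is an assumption used when TVM or its LLVM backend is unavailable. The TVM call is the same one shown in the command above.

```python
def pick_mcpu(requested, available, fallback="generic-rv64"):
    """Return `requested` if the LLVM build lists it, otherwise `fallback`."""
    return requested if requested in available else fallback


try:
    import tvm

    # Same query as the shell one-liner above.
    available = [
        str(cpu)
        for cpu in tvm.target.codegen.llvm_get_cpu_archlist(
            tvm.target.Target("llvm -mtriple=riscv64--")
        )
    ]
except Exception:
    # Stand-in list (assumption) so the sketch still runs without TVM/LLVM.
    available = ["generic", "generic-rv32", "generic-rv64"]

print(pick_mcpu("spacemit-x60", available))
```

If `spacemit-x60` is missing from the list, the helper returns `generic-rv64`, which is the situation described for an older LLVM.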
   
   The flags (for an older LLVM) would be:  *llvm -device=riscv_cpu -vector-width=256 -mtriple=riscv64-linux-gnu -mcpu=generic-rv64 -mattr=+64bit,+a,+c,+d,+f,+m,+v* (orcjit is already the default; vector-width informs both TVM and LLVM).
   
   
   > 2. I searched zvl flags for a given matrix mul problem; I changed the zvl and measured the performance. The best performance appeared at a vector length that the chip should not support. For example, if I set zvl256b, the execution takes 491 ms to complete; if I set zvl8192b, the execution takes 384 ms to finish, which is over 20% speed up. Something is wrong here.
   > Any comments on this? Thanks.
   
   Performance also depends on how LLVM optimizes the code; TVM has no highly specialized optimizations for RISC-V.
   
   TVM emits candidates/iterations as intermediate proposals (in the auto-tuning flow) and forwards them to LLVM, electing only the best-performing ones. I am not sure whether you are also trying to tune your model/function, but without a tuning process TVM likely emits an underperforming variant, even for a simple matmul operation. There should be a warning about this:
   ```
   WARNING:autotvm:One or more operators have not been tuned. 
   Please tune your model for better performance. Use DEBUG logging level to see more details.
   ```
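
The "emit candidates, measure, elect the best" idea can be illustrated without TVM. The sketch below is a plain-Python toy, not TVM's tuner or its API: the candidate knob (a block size for a blocked matmul), the matrix size, and the function names are all assumptions chosen for illustration.

```python
import time


def blocked_matmul(a, b, n, block):
    """Multiply two n x n matrices (lists of lists) using square blocking."""
    c = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, block):
        for k0 in range(0, n, block):
            for j0 in range(0, n, block):
                for i in range(i0, min(i0 + block, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(k0, min(k0 + block, n)):
                        a_ik, row_b = row_a[k], b[k]
                        for j in range(j0, min(j0 + block, n)):
                            row_c[j] += a_ik * row_b[j]
    return c


def elect_best(candidates, n=32):
    """Time every candidate block size and return the fastest one."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    timings = {}
    for block in candidates:
        start = time.perf_counter()
        blocked_matmul(a, b, n, block)
        timings[block] = time.perf_counter() - start
    best = min(timings, key=timings.get)
    return best, timings


best, timings = elect_best([4, 8, 16, 32])
print("best block size:", best)
```

Every candidate computes the same result; only the measured time differs, which is why an untuned default can be noticeably slower than the elected variant.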
   
   The work done in https://github.com/apache/tvm/pull/17631 only informs TVM about the VLEN intended on the LLVM side.
   


