cbalint13 commented on PR #104:
URL: https://github.com/apache/tvm-rfcs/pull/104#issuecomment-1772429695

   
   Thanks @ekalda for the nice work on the proposal. Permit me a few personal points of view in support of the initiative:
   
   ### Pros
   > 1. When eyeballing the TIR, the meaning of the `vscale` intrinsic is intuitive since it matches LLVM
   > 2. It makes translating the expressions involving `vscale` that exist 
outside of the vectors in codegen very easy since we just have to map 
`tir.vscale` -> `llvm.vscale`
   > 3. Since we can pull the information about the vector element data type 
from the ramp node, we can deduce the minimum vector length from the multiplier
   > 4. Makes it simpler to support arbitrarily long vectors*
   > 
   > ### Cons
   > 1. Representing `lanes` in runtime data type is very awkward (see the 
comments above)
   
   * I don't see the `lanes` information as awkward; it is already in use for classical x86, see: [x86 unrolled tensorizers](https://github.com/apache/tvm/blob/0d338828eebaa3ff705e8521f2a1b3530f73dc7d/python/tvm/topi/x86/tensor_intrin.py#L94-L117)
   * Also, given the `lanes` information, even the schedulers are now starting to be aware of it; see this recent fragment: [x86 proposal](https://github.com/apache/tvm/blob/0d338828eebaa3ff705e8521f2a1b3530f73dc7d/python/tvm/topi/x86/dense.py#L326-L355)
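   To illustrate pro no. 3 above, here is a toy sketch of how the minimum vector length could be deduced from the ramp's element dtype and the `vscale` multiplier. This is plain Python for illustration only, not actual TVM code, and the names (`min_vector_bits`, `DTYPE_BITS`) are my own invention:

   ```python
   # Illustrative sketch only, not TVM code: deducing the minimum vector
   # length of a scalable ramp from its element dtype and vscale multiplier.

   DTYPE_BITS = {"int8": 8, "float16": 16, "float32": 32}

   def min_vector_bits(elem_dtype: str, vscale_multiplier: int) -> int:
       """Minimum vector width in bits, reached when vscale == 1.

       A scalable ramp has lanes = vscale_multiplier * vscale, where vscale
       is only known at runtime (>= 1), so the smallest possible vector
       holds exactly vscale_multiplier elements of elem_dtype.
       """
       return DTYPE_BITS[elem_dtype] * vscale_multiplier

   # e.g. a ramp of 4 * vscale float32 lanes needs at least 128-bit vectors
   assert min_vector_bits("float32", 4) == 128
   ```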
   
   
   > 2. It's harder to place restrictions on what `ramp->lanes` can be, so it can get accidentally set to something nonsensical. This could be alleviated by using `vscale(4)` though, as recommended by @kparzysz-quic
   
   > ```
   > ramp(base, stride, vfactor)
   > ```
    
   > ### Cons
   > 1. We don't know the implicit data type of `vfactor` that is outside of 
the vector (this is a big problem)
   
   * Why not have both the `vfactor` (abstract) concept along with `vscale` (real), where `vfactor` would be a "virtual" indicator of how a single, true-typed `vscale` ramps? This makes the "implicit data type" known on the one hand, and would also be expressive enough for "vectors with multiple vector widths".
   
   ---
   
   Personal note:
   
     I would keep going (a +1 ✌️) with aligning to LLVM concepts regarding the `vscale` type, even at the price of having a native data type implemented from the very bottom of the dlpack stack up to the LLVM emitters at the top of TVM.
   
     From an ASIC point of view, in CPU design itself, there is a clear trend that these single-shot atomic "reductors" are becoming increasingly parametrizable w.r.t. data (the veclen/lanes concept), easily trading off between bandwidth needs and specific data access in their hottest possible pipeline path.
   
     There is also the ["V" RISC-V extension](https://eupilot.eu/wp-content/uploads/2022/11/RISC-V-VectorExtension-1-1.pdf), which I think is well aligned with these recent concepts (if it was not even the first to introduce them), so this looks like it is becoming a de facto thing in SIMD design trends.
   
   
   

