tqchen edited a comment on pull request #18:
URL: https://github.com/apache/tvm-rfcs/pull/18#issuecomment-916553296


   Thanks @MeeraN7 @giuseros, to make the discussion more concrete, right now 
the IR after legalization looks like:
   
   ```c++
     for (i: int32, 0, 17;i+=VL) {
       C_2[ramp(i, 1, VL)] = ((int8xVL*)A_2[ramp(i, 1, VL)] + (int8xVL*)B_2[ramp(i, 1, VL)])
     }
   ```
   
   This would require changes to the ramp data structure and its data type to 
support VL-sized vector types, which can be a bit ad hoc, because additional 
information also needs to be encoded (e.g. that this VL and that VL are the 
same) but is nevertheless not clearly encoded here. 
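   To make the ramp-based form above concrete, here is a minimal scalar sketch (an illustration, not TVM code) of what `ramp(base, stride, lanes)` denotes: the index vector `[base, base+stride, ..., base+(lanes-1)*stride]`. The function name `vec_add_ramp` is hypothetical; the point is that the lane count is a plain integer baked into the node, which is what makes a runtime-sized VL awkward to encode.
   
   ```c
   #include <stdio.h>
   
   /* Scalar model of the ramp-based store
      C_2[ramp(i, 1, VL)] = A_2[ramp(i, 1, VL)] + B_2[ramp(i, 1, VL)]
      with a *fixed* lane count. ramp(base, stride, lanes) denotes the
      index vector [base, base+stride, ..., base+(lanes-1)*stride]. */
   static void vec_add_ramp(const signed char *A, const signed char *B,
                            signed char *C, int base, int stride, int lanes) {
       for (int l = 0; l < lanes; ++l) {
           int idx = base + l * stride;   /* l-th element of the ramp */
           C[idx] = (signed char)(A[idx] + B[idx]);
       }
   }
   
   int main(void) {
       signed char A[8] = {1, 2, 3, 4, 5, 6, 7, 8};
       signed char B[8] = {1, 1, 1, 1, 1, 1, 1, 1};
       signed char C[8] = {0};
       vec_add_ramp(A, B, C, 0, 1, 8);  /* one ramp of 8 lanes starting at 0 */
       for (int i = 0; i < 8; ++i) printf("%d ", C[i]);
       printf("\n");
       return 0;
   }
   ```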
   
   Given that we are mostly matching a for-loop pattern, I also wonder whether 
these changes are really necessary, since we could represent a VLA loop as some 
form of restricted for loop with special annotations. Here is a possible 
alternative way to do so:
   
   ```c++
     for (i: int32, 0, 17;i, annotation={"VLA"}) {
       C_2[i] = A_2[i] + B_2[i];
     }
   ```
   We would then defer vectorized instruction generation to the codegen phase 
by specially handling the patterns inside a for loop annotated as VLA. Of 
course, we can only support a limited set of patterns (such as reads/writes to 
the same vector index, or limited reduction support), which is why legalization 
is needed to make sure the body of the VLA for loop satisfies the supported pattern.
   
   In this way we can likely get a similar set of capabilities without hacking 
ramp to support a VL-sized lane count.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

