sjoerdmeijer commented on pull request #18:
URL: https://github.com/apache/tvm-rfcs/pull/18#issuecomment-917576240
Thanks for commenting, @tqchen.
Could you clarify a few things further for me, please? See my remarks inline.
> Thanks @MeeraN7 . Yes I get what you mean. Right now we are adding a
"is_scalable" field to indicate that the broadcast and ramp are "context
dependent" on VL. Additionally, we might need to update DataType to indicate a
scalable data type.
>
> This context dependency is the missing information I mentioned here.
I don't think I understand what you mean by context dependency. In my view of
the world, the Ramp node means that we can process elements in a data-parallel
way; how exactly that is done is up to the backends, depending on the
architecture and the code. What we are doing here is annotating this Ramp node
with a bit of state: a hint that we want a special kind of vectorisation. From
this point of view it is syntactic sugar: if the hint is dropped or ignored,
the code can still be vectorised, just not in a vector-length-agnostic way.
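To illustrate that view, here is a minimal sketch in Python (the class and function names are hypothetical, not the actual TVM API): a Ramp-like node carrying one extra boolean, where a backend that ignores the hint simply treats the vector length multiplier as 1 and falls back to fixed-width vectorisation.

```python
from dataclasses import dataclass

# Hypothetical sketch of the proposed extension; the real TVM Ramp node
# lives in tvm.tir and does not have this exact interface.
@dataclass
class Ramp:
    base: int
    stride: int
    lanes: int                  # elements per "VL chunk" when scalable
    is_scalable: bool = False   # hint: request vector-length-agnostic codegen

def elements(ramp, vscale=1):
    """Concrete indices covered by the ramp.

    For a scalable ramp, the hardware's runtime 'vscale' multiplies the
    lane count; a backend that drops the hint treats vscale as 1 and
    still produces a correct (fixed-width) vectorisation.
    """
    n = ramp.lanes * (vscale if ramp.is_scalable else 1)
    return [ramp.base + i * ramp.stride for i in range(n)]
```

Dropping `is_scalable` never changes correctness, only whether the lane count is a compile-time constant, which is the sense in which the flag is syntactic sugar.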
I don't think we need to encode much more information than one boolean that
specifies the vectorisation style. You could indeed argue that this is ad-hoc,
but if we are going to do it in a different way, we would still need to keep a
bit of state around, like the `annotation={"VLA"}` example that you gave
earlier. From that point of view, I don't see any difference.
> The set of code is really undefined and should be parameterized by VL.
What do you mean by undefined here?
> Additionally, considering the case of two loops with different VL1 and VL2
and want to do some transformations, we might fall into the trap of thinking
them as same type(because only "is_scalable" is marked) but in reality they are
not, as a implicit dependency on VL can be ignored.
>
> I can understand that the additional flag can be helpful as we could reuse
some of the vectorization logic. However, the "is_scalable" field might
introduce additional confusion as above, and the additional ramp node may not
carry too much additional information(apart from the fact that we use a scalar
vs a vector type). So my main question is that whether or not we could use a
separate normal form to hint the code generator without changing the current
DataType, ramp and broadcast.
Correct me if I am wrong, but your assumption seems to be that explicitly
encoding scalable state is bad. Looking at your example:
> **N1: A possible loop normal form via annotation**
>
> ```
> for (i: int32, 0, 17;i, annotation={"VLA"}) {
> C_2[i] = A_2[i] + B_2[i];
> }
> ```
This annotation looks equivalent to a loop pragma. In Clang, for example, you
can write:
```
#pragma clang loop vectorize_width(4, scalable)
for (i = 0; i < 17; ++i) {
  C_2[i] = A_2[i] + B_2[i];
}
```
and thus request scalable vectorisation. If this annotation results in
scalable vectorisation, the LLVM vectoriser will lower this to operations on
`<vscale x 4 x i32>` scalable vector IR types.
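To make the vector-length-agnostic lowering concrete, here is a sketch in Python of the loop structure the vectoriser produces for such a pragma (the `vscale` parameter and predicate handling are illustrative, not an actual API): the chunk width `4 * vscale` is only known at run time, so the same code runs on any hardware vector length.

```python
def vla_add(a, b, vscale):
    """Vector-length-agnostic C[i] = A[i] + B[i] over len(a) elements.

    Mirrors the shape of a scalable-vectorised loop: the vector length
    4 * vscale (cf. LLVM's <vscale x 4 x i32>) is a runtime quantity,
    and a predicate masks off the lanes past the trip count in the
    final iteration, so no scalar tail loop is needed.
    """
    n = len(a)           # trip count, e.g. 17 in the example above
    vl = 4 * vscale      # runtime vector length
    c = [0] * n
    i = 0
    while i < n:
        active = min(vl, n - i)          # predicate: in-bounds lanes only
        for lane in range(active):       # one predicated vector operation
            c[i + lane] = a[i + lane] + b[i + lane]
        i += vl
    return c
```

The point is that nothing in the loop body hardcodes the vector width, which is exactly what distinguishes the scalable style from fixed-width vectorisation.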
What I would like to say with these examples is that there are two things
going on at different levels. I think:
- The `annotation={"VLA"}` corresponds to a loop pragma in Clang,
- The TIR Ramp node extension corresponds to LLVM IR scalable types, e.g.
`<vscale x 4 x i32>`.
And I think these two concepts are different things that both have their value
and place.
If we can only annotate loops as being scalable, we lose the finer-grained
control to request this at the statement level. I don't know if mixing fixed
and scalable vectorisation will be an important use case, but I think it is
possible.
Summarising, I don't think explicitly encoding scalability in TIR nodes is a
bad thing; on the contrary, I think we need it, and the annotation on the loop
could be a complementary technique to it.
What do you think?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]