ekalda commented on PR #16782:
URL: https://github.com/apache/tvm/pull/16782#issuecomment-2020983646

   Thank you for your feedback @Lunderberg, much appreciated!
   
   > The implementation looks reasonable, though I have one main question for 
it: What is the behavior of the updated pass for a target that doesn't support 
SVE? Prior SVE-commits enabled the functionality, but didn't produce SVE in any 
of the default lowering passes.
   > 
   > From [this 
line](https://github.com/apache/tvm/pull/16696/files#diff-f61b04b100f5145f2681340c81d3f2af221239594ed01e2e24896522329ce92cR598-R600),
 versions of LLVM before 11.0 do not support SVE, nor from my brief reading of 
the CUDA codegen 
[here](https://github.com/apache/tvm/blob/main/src/target/source/codegen_cuda.cc#L253)
 does CUDA.
   
   When it comes to targets that don't support SVE, I'd expect these targets to 
not trigger the creation of scalable vectors. In the current plan the creation 
of scalable vectors has to be intentional, i.e. it comes from splitting an axis 
by a `vscale` dependent expression in the (target dependent) schedules and 
vectorizing the resulting axis. If the `LoopVectorizer` is trying to create 
scalable vectors for target that doesn't support it, something has gone wrong 
and the compilation will fall over at some point:
   * If by some mistake a schedule that doesn't support VLA programming 
contains `vscale`, it will fall over latest in a target dependent codegen
   * If there is an attempt to vectorize loops with non-int extent that doesn't 
contain `vscale`, the "scalable ramp" creation will error since it expects the 
`PrimExpr lanes` in a form `vscale * int`. I realize though that this is a 
weird deviation from a current behaviour of
   ```
         if (!extent_as_int || extent_as_int->value < 1) {
           LOG(FATAL) << "Failed to vectorize loop with extent " << op->extent;
         }
   ```
   so I'll modify the patch such that it checks for a target and fails as 
before if the extent is not an int.
   
   > Since `VectorizeLoop` occurs after the `BindTarget` pass, we can check the 
function attribute to know which target will be executing each function. I 
think we should have the loop vectorization apply only to fixed-extent loops by 
default, but enable the scalable vectorization for targets that support it.
   
   In principle I'm not against making vectorizing for scalable vectors 
functionality more explicitly target specific, but it is not obvious to me what 
that would mean in terms of code? `ICHECK`s for the appropriate targets at the 
places where scalable vectors are created? 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to