ekalda commented on PR #104:
URL: https://github.com/apache/tvm-rfcs/pull/104#issuecomment-1753096151

   > What I'm aiming at is to be able to lower the TIR to a generic CPU, that 
is to an architecture that does not support SVE. The TIR will need to have some 
default lowering in CodeGenLLVM/CodeGenCPU, so being able to do that is 
important. For that, we should be able to assume that vscale is 1. The vscale 
would simply be an indicator to the codegen (in TVM) that the code may be 
lowered to SVE.
   
   Right, I see... Would we get any benefit from mapping the scalable TIR 
vectors to fixed-length LLVM vectors for targets that don't support scalable 
vectors? At least for Arm's SVE implementations, all use of scalable vectors 
should be intentional, and in this RFC proposal it is directed by 
target-dependent schedules (SVE is not preferable to fixed-length vectors in 
all cases). I think if I'm compiling code with scalable vectors for a target 
that doesn't support them, I'd rather it errored out, since something has gone 
wrong somewhere.
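   To make the quoted suggestion concrete: assuming `vscale == 1` lets a 
generic-CPU codegen resolve a scalable lane count to an ordinary fixed one. 
A minimal toy sketch of that substitution (the tuple encoding and 
`lower_lanes` name are illustrative, not TVM's actual TIR API):

```python
# Toy stand-in for a TIR Ramp's lane count -- not TVM's real classes.
# Scalable lanes are written as (multiplier, "vscale"); fixed lanes as an int.

def lower_lanes(lanes, vscale=1):
    """Resolve a possibly-scalable lane count to a fixed integer.

    A generic CPU without scalable vectors can assume vscale == 1,
    turning e.g. 4 * vscale into a plain 4-lane vector.
    """
    if isinstance(lanes, tuple):          # scalable: (multiplier, "vscale")
        mult, tag = lanes
        assert tag == "vscale"
        return mult * vscale
    return lanes                          # already a fixed length

# 4 * vscale on a generic CPU (vscale == 1) -> a fixed 4-lane vector
print(lower_lanes((4, "vscale")))              # 4
# The same TIR on SVE hardware where the runtime vscale happens to be 4
print(lower_lanes((4, "vscale"), vscale=4))    # 16
```

   Erroring out instead of silently lowering would amount to rejecting the 
tuple case here when the target has no scalable-vector support.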
   
   I was wondering if there is a case for schedules that would apply to all 
scalable architectures? My intuition says no, since the implementations are 
sufficiently different, but it would be interesting to hear what others think. 
   
   >  If you're sticking with vscale, then it may seem like we don't need it, 
but the issue is with using "x * vscale" as an idiom: if you have several 
occurrences of "4 * vscale" in an expression, it may end up being rearranged to 
something like "(4*vi + 4) * vscale", or "ceildiv(128, 4 * vscale)" may end up 
being "ceildiv(32, vscale)". So, instead of "x * vscale", I suggest "vscale(x)".
   
   Yes, that's a good point. I'll have to think about it a bit more, but I 
tend to agree. Besides the case you mentioned, I can think of some additional 
upsides - it will help with reliably handling scalable vectors in the TVM 
passes, since checking whether something is `vscale` is easier than checking 
whether it is an expression involving `vscale`. It also makes it easier to 
enforce that if `lanes` in the ramp is not an integer, it is `vscale` and not 
just anything. It shouldn't create significantly more complexity for the 
codegen either (we just need to emit an extra multiply when encountering the 
`vscale`). So I think it would give us a more robust implementation.
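   The difference between the two checks can be sketched with a hypothetical 
mini-IR (these node classes are illustrative, not TVM's AST): matching a 
dedicated `vscale(x)` call is a single structural test, while matching the 
`x * vscale` idiom requires walking arbitrary expressions, because a 
simplifier may already have rearranged the multiply.

```python
# Hypothetical mini-IR -- not TVM's actual node classes.

class Call:
    def __init__(self, op, *args):
        self.op, self.args = op, args

class Mul:
    def __init__(self, a, b):
        self.a, self.b = a, b

VSCALE = "vscale_var"   # a bare vscale variable in the x * vscale encoding

def is_scalable_lanes_call(e):
    # With a dedicated vscale(x) intrinsic, one structural check suffices.
    return isinstance(e, Call) and e.op == "vscale"

def involves_vscale(e):
    # With bare "x * vscale", a pass must recurse through the expression,
    # since simplification may have buried the original "4 * vscale" shape.
    if e == VSCALE:
        return True
    if isinstance(e, Mul):
        return involves_vscale(e.a) or involves_vscale(e.b)
    return False

# "4 * vscale" after a rewrite like "(4*vi + 4) * vscale":
rearranged = Mul(Mul(4, "vi+1"), VSCALE)
print(involves_vscale(rearranged))                 # True, but only via a full walk
print(is_scalable_lanes_call(Call("vscale", 4)))   # True with a single check
```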
   
   > Could it instead be in a target-dependent lowering pass? That is, since a 
lowering pass after BindTarget (here in driver_api.cc) would know whether the 
target CPU supports SVE or not, we could make a pass that either returns the 
IRModule unmodified for CPUs that support SVE, or converts it to non-SVE 
instructions otherwise.
   
   I suppose this is also related to whether we want to implicitly convert 
to/from scalable vectors. I think it is a cool idea, maybe an optional 
(command-line triggered) `IRModule` -> `IRModule` pass that turns fixed-length 
vectors into scalable vectors (or vice versa), which users could experiment 
with without having to write schedules. I think this would be a future 
improvement, though; the goal of this RFC is to add the tools to the toolbox, 
give TVM users access to scalable vectors and unblock SME (which will bring 
very significant performance improvements).
   
   Regarding predication... In my mind the changes to support predication are 
necessary for SVE, but in terms of the code changes they are tangential: 
change the `BufferLoad` and `BufferStore` nodes to accept a predicate, and 
change the `LoopVectorizer` such that instead of scalarising a loop it can't 
exactly vectorize, it creates predicated buffer operations. I haven't 
implemented it and I'm not much of a `BufferLoad` expert, so I might be 
missing something there, but to me it looks like predication could be used 
without any SVE infra.
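   As a plain-Python model of that idea (illustrative only, not TVM's 
vectorizer or buffer API): a loop whose trip count is not a multiple of the 
vector width gets a per-lane predicate instead of a scalar tail loop, and 
masked-off lanes simply do nothing.

```python
# Sketch of what a predicated BufferStore could mean, independent of SVE.
# The function name and structure are illustrative, not TVM's API.

def predicated_copy(src, dst, lanes):
    n = len(src)
    for base in range(0, n, lanes):
        # Per-lane predicate: a lane is active while base + lane < n.
        predicate = [base + lane < n for lane in range(lanes)]
        for lane in range(lanes):        # models one predicated vector op
            if predicate[lane]:          # masked-off lanes are no-ops
                dst[base + lane] = src[base + lane]

src = list(range(10))        # trip count 10, not a multiple of 4
dst = [0] * 10
predicated_copy(src, dst, lanes=4)
print(dst)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] -- no scalar tail loop needed
```

   The last iteration runs with predicate `[True, True, False, False]`, which 
is exactly the case the vectorizer currently scalarises.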

