manupa-arm commented on PR #18:
URL: https://github.com/apache/tvm-rfcs/pull/18#issuecomment-1162041631
Hi @tqchen @kparzysz-quic @masahi @tkonolige @smeijer1234,
We are looking to revive this work. I have gone through the thread.
The summary so far is as follows:
* We want to introduce/enhance a vectorization scheduling primitive that
can be controlled by the user, auto-tuner, or auto-scheduler to use scalable
vectors in the backend codegen.
* The conversation has converged on extending the existing vectorize
scheduling primitive, i.e. s[C].vectorize(..., scalable=True)
* Using this scheduling primitive should result in a for loop containing
Ramp nodes that carry either an additional "is_scalable" argument or a
special number for lanes.
* I think @tqchen was suggesting to use the special lane number (-1) as
opposed to introducing an additional argument to all TIR nodes such as Ramp and
Broadcast as well as DataType (and to DLDataType) to avoid ABI breakages.
* Moreover, VectorizeLoopScalable will be modified to create a While node.
* The name of the RFC seems confusing, @kparzysz-quic? I suppose that what
we are adding is vector-length agnostic (VLA) vectorization support for TIR,
while demonstrating codegen of VLA-vectorized TIR using Arm(R) SVE.
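To make "vector-length agnostic" concrete, here is a rough sketch in plain
Python of the loop shape I have in mind. It is not TVM code and does not use
any TVM API; `vla_add` and `hw_lanes` are hypothetical names, with `hw_lanes`
standing in for the hardware vector length that is only known at run time.

```python
def vla_add(a, b, hw_lanes):
    """Element-wise add written as a vector-length agnostic loop.

    hw_lanes is a hypothetical stand-in for the run-time vector length;
    the loop body never assumes a fixed number of lanes.
    """
    n = len(a)
    out = [0] * n
    i = 0
    # While-style loop: the trip count depends on the run-time vector length.
    while i < n:
        # Predicate: lanes that would run past the end of the data are inactive.
        active = min(hw_lanes, n - i)
        for lane in range(active):
            out[i + lane] = a[i + lane] + b[i + lane]
        i += hw_lanes
    return out
```

The same code gives the same result for any value of `hw_lanes`, which is the
property we would want the generated code to have.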
Please confirm whether this is an accurate summary of the current state.
As for next steps, I would like to propose/resolve each of the outstanding
discussion points and update the RFC.