vinx13 commented on pull request #39:
URL: https://github.com/apache/tvm-rfcs/pull/39#issuecomment-935305808


   Thanks @Lunderberg for the RFC. Logical-physical mapping is definitely an 
important feature. I also implemented something similar for warp memory to 
support tensor core instructions on GPU, and I'd be happy to collaborate 
further toward a unified design.
   Some preliminary comments:
   The current representation of the logical-physical layout mapping is an 
array of axis/factor pairs that defines how the logical axes are 
split/reordered/fused to form the physical axes. This works for packed 
layouts like `NCHW4c`, but we should consider whether it is a generic enough 
way to represent the mapping. For example, another option is a mapping 
function: `(n, c, h, w) -> (n, tir.floordiv(c, 4), h, w, tir.floormod(c, 4))`. 
This would allow arbitrary mappings (we could add restrictions, such as 
requiring the mapping to be affine, to make analysis easier). A possible use 
case for a more complex mapping is the [permuted 
layout](https://github.com/NVIDIA/cutlass/blob/master/media/docs/implicit_gemm_convolution.md#shared-memory-layouts)
 for shared memory on CUDA.
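   To make the mapping-function idea concrete, here is a minimal plain-Python 
sketch of the `NCHW` -> `NCHW4c` packing and its inverse. The function names 
are mine, not from the RFC or TVM; `//` and `%` stand in for `tir.floordiv` 
and `tir.floormod`.

```python
def nchw_to_nchw4c(n, c, h, w, vec=4):
    """Map a logical NCHW index to a physical NCHW4c index
    by splitting the channel axis into (c_outer, c_inner)."""
    return (n, c // vec, h, w, c % vec)

def nchw4c_to_nchw(n, c_outer, h, w, c_inner, vec=4):
    """Inverse mapping: fuse (c_outer, c_inner) back into the
    logical channel axis."""
    return (n, c_outer * vec + c_inner, h, w)
```

   Representing the layout as a function like this, rather than a fixed 
axis/factor array, is what opens the door to non-split/fuse mappings such as 
the permuted layout above.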
   Also, there is related [affine analysis 
infrastructure](https://github.com/apache/tvm/blob/main/include/tvm/arith/iter_affine_map.h)
 available; it would be great if we could reuse it for loop analysis and 
rewriting.
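   A key property that analysis of this kind needs to establish is that the 
layout mapping is a bijection over the index space, so loops over the 
physical axes can be rewritten without dropping or duplicating elements. The 
following is a brute-force plain-Python illustration of that property (not 
TVM's symbolic analysis, which proves it without enumeration); the helper 
name is hypothetical.

```python
from itertools import product

def is_bijective(mapping, logical_shape, physical_shape):
    """Check by enumeration that `mapping` sends the logical index
    space one-to-one onto the physical index space (small shapes only)."""
    logical = product(*[range(s) for s in logical_shape])
    images = {mapping(*idx) for idx in logical}
    physical = set(product(*[range(s) for s in physical_shape]))
    return images == physical

# The NCHW -> NCHW4c split mapping from the comment above.
split = lambda n, c, h, w: (n, c // 4, h, w, c % 4)
```

   Note that the check fails when the channel extent is not a multiple of the 
vector width, which is exactly the padding question a real layout-rewriting 
pass has to handle.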


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
