tqchen commented on PR #12564:
URL: https://github.com/apache/tvm/pull/12564#issuecomment-1225915787
There are two perspectives that come into the story:
- P0: The alignment of the allocator
- P1: The alignment assumption of the compiler
P1 affects how the compiler generates code: e.g., if we assume that a buffer is
aligned to 128 bytes, the compiler can safely generate aligned stores/loads. P0
affects the arrays we allocate: we always need to allocate memory whose
alignment is at least as strict as the P1 assumption made by the compiler.
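To make the P0/P1 relationship concrete, here is a minimal, illustrative sketch (not TVM code) of how an allocator typically enforces an alignment guarantee: over-allocate and round the base address up. The constant `COMPILER_ALIGN` and the helper name are hypothetical.

```python
import ctypes

# P1: the alignment the compiler assumes when emitting aligned
# vector loads/stores (illustrative HVX-style value).
COMPILER_ALIGN = 128

def alloc_aligned(nbytes, alignment):
    """Allocate nbytes at an address that is a multiple of `alignment` (P0).

    Over-allocates by alignment - 1 bytes and rounds the base address up,
    the standard way an allocator enforces an alignment guarantee.
    Returns (owner, aligned_address); keep `owner` alive while in use.
    """
    owner = ctypes.create_string_buffer(nbytes + alignment - 1)
    base = ctypes.addressof(owner)
    aligned = (base + alignment - 1) & ~(alignment - 1)
    return owner, aligned

owner, addr = alloc_aligned(1024, COMPILER_ALIGN)
# Code generated under the P1 assumption is only safe when P0 holds:
assert addr % COMPILER_ALIGN == 0
```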
Right now the two are roughly tied together, with the exception of
`kHexagonAllocAlignment`, which enforces P0.
The PyTorch issue was caused by P1 -- we were making an assumption of 128-byte
alignment, while PyTorch arrays are commonly aligned to only 64 bytes.
Currently, P1 is supplied by the buffer construct, which does not (yet) depend
on the target. I agree with @kparzysz-quic that the minimum alignment
requirement can be part of the target, and we need to find a good way to set it
on the buffer.
As a short-term workaround, we can detect the P1 settings (of buffers) in an
early stage of TIR compilation and amend them for devices like HVX.
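A sketch of what such an amendment pass might do, using a stand-in `Buffer` dataclass rather than the real TIR buffer, with hypothetical guaranteed-alignment values: clamp the compiler's assumption (P1) down to what the incoming arrays actually guarantee (P0).

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Buffer:
    name: str
    data_alignment: int  # P1: alignment in bytes the compiler may assume

# Hypothetical table of what arrays from each frontend actually
# guarantee (P0) -- illustrative values only.
GUARANTEED_ALIGN = {"pytorch": 64, "tvm-ndarray": 128}

def amend_alignment(buf: Buffer, guaranteed: int) -> Buffer:
    """Clamp the compiler's alignment assumption (P1) so it never
    exceeds what the allocator actually provides (P0)."""
    return replace(buf, data_alignment=min(buf.data_alignment, guaranteed))

buf = Buffer("A", data_alignment=128)
amended = amend_alignment(buf, GUARANTEED_ALIGN["pytorch"])
assert amended.data_alignment == 64
```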