tqchen commented on PR #12564:
URL: https://github.com/apache/tvm/pull/12564#issuecomment-1225915787

   There are two perspectives that come into the story:
   
   - P0: The alignment of the allocator
   - P1: The alignment assumption of the compiler
   
   P1 affects how the compiler generates code: e.g., if we can assume the data is 
aligned to 128 bytes, the compiler can safely generate aligned stores/loads. P0 
affects the arrays we allocate: we always need to allocate memory whose alignment 
is at least as strong as the P1 assumption made by the compiler.
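   To make the P0 >= P1 relationship concrete, here is a minimal sketch (not TVM code) of an allocator that over-allocates and offsets so the returned buffer satisfies whatever alignment the compiler assumes; the function name `aligned_empty` is made up for illustration:

   ```python
   import numpy as np

   def aligned_empty(n_bytes, alignment):
       # P0 side: over-allocate, then slice so the view's base address
       # is a multiple of `alignment`.
       buf = np.empty(n_bytes + alignment, dtype=np.uint8)
       addr = buf.ctypes.data
       offset = (-addr) % alignment
       return buf[offset:offset + n_bytes]

   P1 = 128  # alignment the compiler is allowed to assume
   arr = aligned_empty(1024, P1)  # the allocator (P0) must satisfy at least P1
   assert arr.ctypes.data % P1 == 0
   ```

   If the allocator ever hands out memory with weaker alignment than P1, the compiler's aligned load/store instructions become unsafe.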
   
   Right now the two are roughly tied together, with the exception of 
`kHexagonAllocAlignment`, which enforces P0. 
   
   The PyTorch issue was caused by P1 -- we assume 128-byte alignment, while 
PyTorch arrays are commonly aligned to only 64 bytes.
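   The mismatch is easy to see by probing the actual alignment of a pointer handed to us by an external framework. A small sketch (using numpy in place of PyTorch so it stays self-contained; `max_alignment` is a made-up helper):

   ```python
   import numpy as np

   def max_alignment(addr, limit=256):
       # Largest power-of-two alignment (up to `limit`) the address satisfies.
       a = 1
       while a < limit and addr % (a * 2) == 0:
           a *= 2
       return a

   x = np.empty(1024, dtype=np.float32)
   actual = max_alignment(x.ctypes.data)
   # If the compiler assumed 128 bytes (P1) but the framework only guarantees
   # 64, an aligned vector load on this pointer can fault or misbehave.
   compiler_assumes = 128
   safe = x.ctypes.data % compiler_assumes == 0
   ```

   Whether `safe` holds is a matter of luck when the external allocator guarantees less than P1.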
   
   Currently, P1 is supplied by the buffer construct, which does not (yet) depend 
on the target. I agree with @kparzysz-quic that the minimum alignment requirement 
can be part of the target; we then need to find a good way to set it on the 
buffer.
   
   As a short-term workaround, we can detect the P1 settings (of buffers) at an 
early stage of TIR compilation and amend them for devices like HVX.
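   The shape of that workaround pass could look roughly like the sketch below. The `Buffer` dataclass and `TARGET_MIN_ALIGN` table are hypothetical stand-ins, not TVM's actual TIR structures (real buffers carry a `data_alignment` field among many others), and the 128-byte value reflects HVX's vector width:

   ```python
   from dataclasses import dataclass, replace

   @dataclass(frozen=True)
   class Buffer:
       name: str
       data_alignment: int  # P1: alignment the compiler may assume

   # Hypothetical per-target minimum alignment (HVX vectors are 128 bytes wide).
   TARGET_MIN_ALIGN = {"hexagon": 128, "llvm": 64}

   def amend_alignment(buffers, target):
       # Early-stage pass: raise each buffer's assumed alignment to at least
       # the target's minimum, so later codegen can rely on it.
       min_align = TARGET_MIN_ALIGN.get(target, 1)
       return [replace(b, data_alignment=max(b.data_alignment, min_align))
               for b in buffers]
   ```

   The point is only that the amendment happens once, early, and per-target, rather than being hard-wired into the buffer construct itself.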
    

