kpuatamazon edited a comment on pull request #18885:
URL: https://github.com/apache/incubator-mxnet/pull/18885#issuecomment-671281484


   If we wanted to be really pedantic, alignment could be set based on CPUID.  
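
A rough sketch of what CPUID-based alignment could look like, assuming GCC or Clang on x86. `PreferredAlignment` is a hypothetical helper, not an MXNet function, and the 64/32/16 mapping is an assumption based on SIMD register width:

```cpp
#include <cstddef>

// Hypothetical helper for illustration; not part of MXNet.
// Returns a buffer alignment in bytes matched to the widest SIMD registers
// the running CPU reports.
std::size_t PreferredAlignment() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init();  // Populate the CPU feature data (wraps CPUID).
  if (__builtin_cpu_supports("avx512f")) return 64;  // 512-bit registers
  if (__builtin_cpu_supports("avx")) return 32;      // 256-bit registers
  return 16;                                         // SSE, 128-bit registers
#else
  return 64;  // Assumed conservative default on other targets.
#endif
}
```

A runtime value like this cannot feed a compile-time attribute, so it would only help allocation, not the compiler's code generation.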
   
   I suppose MXNet has a larger problem that the Intel people might want to 
pontificate on: the tensors might be aligned, but nobody has told the compiler 
so (e.g. with `__attribute__((aligned(64)))`), so the kernels still generate 
branches to handle unaligned data.  
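
One way to pass that hint at the pointer level, sketched under the assumption of GCC or Clang, is `__builtin_assume_aligned`. `ScaleAligned` is a hypothetical kernel for illustration and is not taken from MXNet:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical kernel for illustration; not taken from MXNet.
void ScaleAligned(float* data, std::size_t n, float factor) {
  // Debug-only check that the caller really passed 64-byte-aligned memory.
  assert(reinterpret_cast<std::uintptr_t>(data) % 64 == 0);
  // GCC/Clang builtin: promises 64-byte alignment so the vectorizer can
  // skip code for the unaligned case. Breaking the promise is undefined
  // behaviour.
  float* p = static_cast<float*>(__builtin_assume_aligned(data, 64));
  for (std::size_t i = 0; i < n; ++i) {
    p[i] *= factor;
  }
}
```

Whether this removes any branches depends on the compiler, flags, and target, so it would need checking in the generated assembly.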
   
   Also, I note your benchmark doesn't include a GEMM, which is where the big 
costs come from.  
   
   In any case, I'd like to see this merged, because my code was written to 
depend on it.  


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

