One can do aligned allocations using `_mm_malloc(size, align)` from the SSE 
intrinsics (`<immintrin.h>`); the memory has to be released with `_mm_free`. 
I just started testing and it seems to align to at least one cache line on my 
old Haswell machine. I'd assume something similar is available for NEON on ARM 
and for AVX2.
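
For reference, a minimal sketch of what I mean, assuming GCC or Clang on x86 
(where `<immintrin.h>` declares `_mm_malloc` and `_mm_free`) and a 64-byte 
cache line. The names `CACHE_LINE`, `is_aligned`, `alloc_floats_aligned` and 
`demo_aligned_alloc` are made up for illustration and are not part of any 
library.

```c
#include <assert.h>
#include <immintrin.h>  /* declares _mm_malloc / _mm_free on GCC and Clang (x86) */
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumption: 64-byte cache lines, as on Haswell */

/* Returns nonzero if p is a multiple of align (align must be a power of two). */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Hypothetical helper: allocate n floats starting on a cache-line boundary.
   Returns NULL on failure. Release the result with _mm_free, not free. */
static float *alloc_floats_aligned(size_t n)
{
    return (float *)_mm_malloc(n * sizeof(float), CACHE_LINE);
}

/* Small self-check: allocate, verify the alignment, touch both ends, free.
   Returns 0 on success and -1 if the allocation failed. */
static int demo_aligned_alloc(void)
{
    float *buf = alloc_floats_aligned(1024);
    if (buf == NULL)
        return -1;
    assert(is_aligned(buf, CACHE_LINE));
    buf[0] = 1.0f;
    buf[1023] = 2.0f;
    _mm_free(buf);
    return 0;
}
```

The alignment is whatever you pass as the second argument, so the same call 
with 32 instead of 64 would be enough for aligned AVX2 loads. On ARM this 
header is not available; as far as I know the portable alternatives are 
`posix_memalign` or C11 `aligned_alloc`.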
