On 7/22/20 4:04 PM, Bruno Haible wrote:
> Probably most posix_memalign / memalign implementations will round up the request to a 16-byte allocation. But if some implementation can give me just 8 bytes, properly aligned, without wasting the next 8 bytes, why should I not make use of it?
I suspect the scenario you're suggesting isn't worth the hassle of optimizing. And quite possibly it wouldn't be an optimization at all: "wasting" the next 8 bytes could improve CPU performance significantly for some apps on hardware platforms with split instruction and data caches, where writing to a line in the data cache invalidates the same line in the instruction cache.
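For concreteness, here is a minimal sketch of the kind of request under discussion: asking posix_memalign for exactly 8 bytes at 8-byte alignment. The helper names alloc_aligned and is_aligned are hypothetical, made up for illustration. The sketch assumes a POSIX system, and it says nothing about how much memory the implementation actually reserves behind the returned pointer.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: request `size` bytes aligned to `alignment`.
   posix_memalign requires `alignment` to be a power of two and a
   multiple of sizeof (void *).  Returns NULL on failure.  */
void *
alloc_aligned (size_t alignment, size_t size)
{
  void *p = NULL;
  if (posix_memalign (&p, alignment, size) != 0)
    return NULL;
  return p;
}

/* Hypothetical helper: nonzero if P is a multiple of ALIGNMENT.  */
int
is_aligned (const void *p, size_t alignment)
{
  return (uintptr_t) p % alignment == 0;
}
```

A caller would write alloc_aligned (8, 8), use the block, and release it with free. Whether the allocator rounds the request up to 16 bytes internally is not visible through this interface.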
Anyway, thanks for letting me know the application you had in mind.