On Mon, Feb 25, 2013 at 7:34 PM, "René J.V. Bertin" <[email protected]> wrote:
> On Feb 25, 2013, at 20:19, Claudio Freire wrote:
>
>> That's because __builtin_assume_aligned isn't being called (most
>> likely, didn't check). That results in **far** sub-optimal
>> vectorization. I don't know about the failing tests though.
>
> I doubt that call (or rather, token?) is required on OS X, where memory
> allocations (and stack alignment) are aligned. I know of a case where the
> absence of the token didn't prevent a very substantial performance gain, but
> haven't checked if that's always the case.

I wouldn't assume that. Even if the allocations are in fact aligned, if
the compiler doesn't know it (i.e., if malloc isn't marked as returning
aligned memory), the vectorizer will still generate code for possibly
unaligned access. The alignment the architecture mandates and the
alignment SSE/SSE2/SSE3/MMX/whatever requires tend to be different.

You can write a very simple test case to check it out.

_______________________________________________
Libav-user mailing list
[email protected]
http://ffmpeg.org/mailman/listinfo/libav-user
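
A minimal sketch of the kind of test case suggested above, assuming GCC or Clang (which provide __builtin_assume_aligned) and a 16-byte alignment as used by SSE. The function names scale_unknown and scale_aligned are made up for illustration and do not come from FFmpeg/Libav. Both functions run the same loop; the only difference is whether the compiler is told about the alignment.

```c
#include <stddef.h>

/* Baseline: the compiler is told nothing about the alignment of dst/src,
 * so the vectorizer has to allow for unaligned addresses. */
void scale_unknown(float *restrict dst, const float *restrict src,
                   size_t n, float k)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}

/* Same loop, but the caller promises 16-byte alignment (an assumption of
 * this sketch). Passing a misaligned pointer here is undefined behaviour. */
void scale_aligned(float *restrict dst, const float *restrict src,
                   size_t n, float k)
{
    float *d = __builtin_assume_aligned(dst, 16);
    const float *s = __builtin_assume_aligned(src, 16);

    for (size_t i = 0; i < n; i++)
        d[i] = s[i] * k;
}
```

Compiling this file with something like "gcc -O3 -S" and comparing the assembly of the two functions should show whether the hint changes the generated code on a given compiler and platform (for example aligned versus unaligned loads, or extra prologue code in the baseline). The results are the same either way; only the code generation may differ.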
