"Ronald S. Bultje" <[email protected]> writes: > Hi, > > On Fri, Dec 7, 2012 at 2:01 PM, Måns Rullgård <[email protected]> wrote: >> "Ronald S. Bultje" <[email protected]> writes: >>> On Fri, Dec 7, 2012 at 1:01 PM, Måns Rullgård <[email protected]> wrote: >>>> "Ronald S. Bultje" <[email protected]> writes: >>>> >>>>> + %if mmsize <= 16 && HAVE_ALIGNED_STACK >>>> >>>> How much overhead would it be to drop HAVE_ALIGNED_STACK entirely? >>> >>> Well, for now, we still have a ton of functions that don't use the >>> cglobal-method of allocating stack. I only ported h264/vp8 loopfilter, >>> nothing else. >>> >>> But anyway, more generally, it's 4-5 instructions per function. For >>> typical functions with an inner loop, that's negligible, but for a >>> select small set of functions, it may be significant. >> >> The remaining functions are ff_h264_idct8_add(4)_10_{sse2,avx}, >> ff_hadamard8_diff(16)_{sse2,ssse3}, and something in swscale. > > lavr also.

There are no references to HAVE_ALIGNED_STACK there.

>> Besides, does anyone still use 32-bit where performance is that
>> critical?
>
> This is used for YMM (e.g. avx float) stack alignment (to 32-byte)
> also, so it will affect 64-bit also. My personal point of view is that
> the code to take advantage of an actual feature of the compiler/system
> (alignment) is there. I don't see why we'd remove it.

Tracking how different compilers align the stack is a pain.

-- 
Måns Rullgård
[email protected]
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
