On Sat, Aug 02, 2014 at 04:29:39PM -0300, James Almer wrote:
> On 02/08/14 3:20 PM, Clément Bœsch wrote:
> > +    psrlq       m0, m6, 32
> > +    paddw       m6, m0
> > +    psrlq       m0, m6, 16
> > +    paddw       m6, m0
> > +    movd        eax, m6
> > +    movzx       eax, ax
>
> You could use the HADDW macro here.
>

error: undefined symbol `pw_1' (first use) sounds somehow constraining. I'll
keep my version until you benchmark to prove me HADDW is faster on an old MMX
cpu ;)

> > +;-------------------------------------------------------------------------------
> > +; int ff_pixelutils_sad_8x8_mmxext(const uint8_t *src1, ptrdiff_t stride1,
> > +;                                  const uint8_t *src2, ptrdiff_t stride2);
> > +;-------------------------------------------------------------------------------
> > +INIT_MMX mmxext
> > +cglobal pixelutils_sad_8x8, 4,4,0, src1, stride1, src2, stride2
> > +    pxor        m2, m2
> > +%rep 4
> > +    mova        m0, [src1q]
> > +    mova        m1, [src1q + stride1q]
> > +    psadbw      m0, [src2q]
> > +    psadbw      m1, [src2q + stride2q]
> > +    paddw       m2, m0
> > +    paddw       m2, m1
> > +    lea         src1q, [src1q + 2*stride1q]
> > +    lea         src2q, [src2q + 2*stride2q]
> > +%endrep
> > +    movd        eax, m2
> > +    RET
>
> Adding sad16x16 mmxext should be a matter of using add instead of lea,
> changing the %rep amount, and using 8 instead of stride[12]q for the mova
> and psadbw.
>

Yeah right, added. Thanks.

> > --- /dev/null
> > +++ b/libavutil/x86/pixelutils.h
> > @@ -0,0 +1,26 @@
> > +/*
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> > + */
> > +
> > +#ifndef AVUTIL_X86_PIXELUTILS_H
> > +#define AVUTIL_X86_PIXELUTILS_H
> > +
> > +#include "libavutil/pixelutils.h"
> > +
> > +void ff_pixelutils_init_x86(AVPixelUtils *s);
>
> This prototype should be in libavutil/pixelutils.h
> No need to make a whole new header just for it.
>

No, libavutil/pixelutils.h is public, I don't want to have private prototypes
in it.

> Maybe you could add a quick test for these functions? Look at
> lavc/motion-test.c and lavu/float-dsp.c

Added.

I'll resubmit a patchset in a moment.

-- 
Clément B.
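
For readers following the review, here is a rough scalar C sketch of what the quoted routines compute. It is not code from the patch: the names sad_ref, paddw64, hsum_words and self_check are made up for illustration, and the 16-bit lane arithmetic is only an emulation of the quoted psrlq/paddw sequence on a 64-bit integer. A reference like this is the kind of thing a quick test could compare the assembly against.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical scalar reference: sum of absolute differences over a
 * w x h block. With w = h = 8 it should match what the quoted
 * sad_8x8 routine returns. */
static int sad_ref(const uint8_t *src1, ptrdiff_t stride1,
                   const uint8_t *src2, ptrdiff_t stride2,
                   int w, int h)
{
    int sum = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            sum += abs(src1[x] - src2[x]);
        src1 += stride1;
        src2 += stride2;
    }
    return sum;
}

/* Emulates paddw on an MMX register: four independent 16-bit lanes,
 * each wrapping modulo 2^16. */
static uint64_t paddw64(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    for (int i = 0; i < 4; i++) {
        uint16_t lane = (uint16_t)((a >> (16 * i)) + (b >> (16 * i)));
        r |= (uint64_t)lane << (16 * i);
    }
    return r;
}

/* Emulates the quoted horizontal sum of the four words in m6. */
static unsigned hsum_words(uint64_t m6)
{
    m6 = paddw64(m6, m6 >> 32);     /* psrlq m0, m6, 32 ; paddw m6, m0 */
    m6 = paddw64(m6, m6 >> 16);     /* psrlq m0, m6, 16 ; paddw m6, m0 */
    return (unsigned)(m6 & 0xFFFF); /* movd eax, m6 ; movzx eax, ax   */
}

/* Small self-check on constant blocks (values chosen for illustration). */
static void self_check(void)
{
    uint8_t a[16 * 16], b[16 * 16];
    memset(a, 10, sizeof(a));
    memset(b, 7, sizeof(b));
    assert(sad_ref(a, 16, b, 16, 8, 8) == 8 * 8 * 3);
    assert(sad_ref(a, 16, b, 16, 16, 16) == 16 * 16 * 3);
    assert(hsum_words(0x0001000200030004ULL) == 10);
}
```

Note that the word-wise reduction wraps at 16 bits, so the emulation only agrees with a plain integer sum while the total stays below 65536.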
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel