On 10/11/14 6:42 PM, Kieran Kunhya wrote:

Can't test since it doesn't apply cleanly, but here are a few comments anyway.

> diff --git a/libavfilter/x86/vf_interlace.asm
> b/libavfilter/x86/vf_interlace.asm
> new file mode 100644
> index 0000000..40b10fc
> --- /dev/null
> +++ b/libavfilter/x86/vf_interlace.asm
> @@ -0,0 +1,80 @@
> +;*****************************************************************************
> +;* x86-optimized functions for interlace filter
> +;*
> +;* Copyright (C) 2014 Kieran Kunhya <kier...@obe.tv>
> +;*
> +;* This file is part of Libav.
> +;*
> +;* Libav is free software; you can redistribute it and/or modify
> +;* it under the terms of the GNU General Public License as published by
> +;* the Free Software Foundation; either version 2 of the License, or
> +;* (at your option) any later version.
> +;*
> +;* Libav is distributed in the hope that it will be useful,
> +;* but WITHOUT ANY WARRANTY; without even the implied warranty of
> +;* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +;* GNU General Public License for more details.
> +;*
> +;* You should have received a copy of the GNU General Public License along
> +;* with Libav; if not, write to the Free Software Foundation, Inc.,
> +;* 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
> +;******************************************************************************
> +
> +%include "libavutil/x86/x86util.asm"
> +
> +SECTION_RODATA
> +
> +pw_1: times 8 dw 1
> +
> +SECTION .text
> +
> +%macro LOWPASS_LINE 0
> +cglobal lowpass_line, 5, 5

You're using m6, you need to declare 7 xmm regs.
Also, naming the regs would be better than using r*.

> +    add r0, r1
> +    add r2, r1
> +    add r3, r1
> +    add r4, r1
> +    neg r1
> +
> +    pxor m6, m6
> +
> +.loop
> +    mova m0, [r2+r1]
> +    punpcklbw m1, m0, m6
> +    punpckhbw m0, m6
> +    psllw m0, 1
> +    psllw m1, 1
> +
> +    mova m2, [r3+r1]
> +    punpcklbw m3, m2, m6
> +    punpckhbw m2, m6
> +
> +    mova m4, [r4+r1]
> +    punpcklbw m5, m4, m6
> +    punpckhbw m4, m6
> +
> +    paddw m1, m3
> +    paddw m1, m5
> +
> +    paddw m0, m2
> +    paddw m0, m4
> +
> +    paddw m0, [pw_1]
> +    paddw m1, [pw_1]
> +
> +    psrlw m0, 2
> +    psrlw m1, 2

Can't pavgw be used here?

> +
> +    packuswb m1, m0
> +    mova [r0+r1], m1
> +
> +    add r1, mmsize
> +    jl .loop
> +    REP_RET
> +%endmacro
> +
> +INIT_XMM sse2
> +LOWPASS_LINE
> +
> +INIT_XMM avx
> +LOWPASS_LINE
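
To make the pavgw question concrete, here is a rough scalar C model, not a tested patch. It assumes one particular rewrite that the comment above does not spell out: add the two neighbouring lines first, pavgw that sum against the doubled current line, then shift right by 1 instead of 2. The function names (lowpass_ref, lowpass_pavgw, count_mismatches, check_equivalence) are made up for illustration and are not from the patch. If the assumption holds, the pw_1 constant and the two "paddw ..., [pw_1]" instructions would no longer be needed.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one 16-bit lane of pavgw: (a + b + 1) >> 1. */
uint16_t pavgw(uint16_t a, uint16_t b)
{
    return (uint16_t)(((uint32_t)a + b + 1) >> 1);
}

/* What the quoted loop computes per pixel:
 * (2*cur + above + below + 1) >> 2 */
uint8_t lowpass_ref(uint8_t cur, uint8_t above, uint8_t below)
{
    return (uint8_t)((2 * cur + above + below + 1) >> 2);
}

/* Assumed pavgw variant (hypothetical, not from the patch). */
uint8_t lowpass_pavgw(uint8_t cur, uint8_t above, uint8_t below)
{
    uint16_t twice = (uint16_t)(cur << 1);      /* psllw 1          */
    uint16_t sum   = (uint16_t)(above + below); /* paddw            */
    return (uint8_t)(pavgw(twice, sum) >> 1);   /* pavgw, psrlw 1   */
}

/* Count the 8-bit input triples on which the two models disagree. */
long count_mismatches(void)
{
    long bad = 0;
    for (int c = 0; c < 256; c++)
        for (int a = 0; a < 256; a++)
            for (int b = 0; b < 256; b++)
                if (lowpass_ref((uint8_t)c, (uint8_t)a, (uint8_t)b) !=
                    lowpass_pavgw((uint8_t)c, (uint8_t)a, (uint8_t)b))
                    bad++;
    return bad;
}

/* Aborts if the assumed rewrite ever changes the output. */
void check_equivalence(void)
{
    assert(count_mismatches() == 0);
}
```

The two versions agree because shifting right by 1 twice is the same as shifting right by 2 once, and the intermediate sum (at most 1021) fits comfortably in a 16-bit lane. This only models the arithmetic; whether it is actually faster would need to be measured on the real asm.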