On 10/2/15, Henrik Gramner <hen...@gramner.com> wrote:
> On Fri, Oct 2, 2015 at 6:57 PM, Paul B Mahol <one...@gmail.com> wrote:
>> +INIT_XMM sse2
>> +cglobal blend_xor, 9, 10, 2, 0, top, top_linesize, bottom,
>> bottom_linesize, dst, dst_linesize, width, start, end
> [...]
>> +cglobal blend_or, 9, 10, 2, 0, top, top_linesize, bottom,
>> bottom_linesize, dst, dst_linesize, width, start, end
> [...]
>> +cglobal blend_and, 9, 10, 2, 0, top, top_linesize, bottom,
>> bottom_linesize, dst, dst_linesize, width, start, end
>
> You could do those using floating point operations (xorps, orps,
> andps), then you only need SSE instead of SSE2 (and AVX instead of
> AVX2 if you want to make versions using ymm registers).
>
>> +cglobal blend_addition, 9, 10, 3, 0, top, top_linesize, bottom,
>> bottom_linesize, dst, dst_linesize, width, start, end
> [...]
>> +    punpcklbw m0, m2
>> +    punpcklbw m1, m2
>> +    paddw m0, m1
>> +    packuswb m0, m0
>> +    movh [dstq + x], m0
>> +    add r10q, mmsize / 2
>
> paddusb
>
fixed locally.

>> +cglobal blend_subtract, 9, 10, 3, 0, top, top_linesize, bottom,
>> bottom_linesize, dst, dst_linesize, width, start, end
> [...]
>> +    punpcklbw m0, m2
>> +    punpcklbw m1, m2
>> +    psubw m0, m1
>> +    packuswb m0, m0
>
> psubusb

fixed locally.

>
>> +cglobal blend_darken, 9, 10, 2, 0, top, top_linesize, bottom,
>> bottom_linesize, dst, dst_linesize, width, start, end
> [...]
>> +    movh m0, [topq + x]
>> +    movh m1, [bottomq + x]
>> +    pminub m0, m1
>> +    movh [dstq + x], m0
> [...]
>> +cglobal blend_lighten, 9, 10, 2, 0, top, top_linesize, bottom,
>> bottom_linesize, dst, dst_linesize, width, start, end
> [...]
>> +    movh m0, [topq + x]
>> +    movh m1, [bottomq + x]
>> +    pmaxub m0, m1
>> +    movh [dstq + x], m0
>
> You're only utilizing the lower half of the registers here.

fixed locally.

> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
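For reference, a minimal sketch of the paddusb suggestion and the full-register remark, written with C SSE2 intrinsics rather than the x86inc syntax of the patch. This is not the code that was fixed locally; the function names, the unaligned loads and the scalar tail are assumptions made for illustration only. _mm_adds_epu8 compiles to paddusb, which saturates at 255 exactly as the punpcklbw/paddw/packuswb sequence does, and each iteration handles a full 16 bytes instead of the 8 covered by movh.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Scalar reference (hypothetical helper): per-byte add, clamped to 255. */
static void blend_addition_ref(const uint8_t *top, const uint8_t *bottom,
                               uint8_t *dst, size_t width)
{
    for (size_t x = 0; x < width; x++) {
        unsigned sum = (unsigned)top[x] + bottom[x];
        dst[x] = sum > 255 ? 255 : (uint8_t)sum;
    }
}

/* SSE2 sketch (hypothetical name): one paddusb per 16 bytes. */
static void blend_addition_sse2(const uint8_t *top, const uint8_t *bottom,
                                uint8_t *dst, size_t width)
{
    size_t x = 0;

    assert(top && bottom && dst);

    /* Use the whole xmm register: 16 pixels per iteration. */
    for (; x + 16 <= width; x += 16) {
        __m128i a = _mm_loadu_si128((const __m128i *)(top + x));
        __m128i b = _mm_loadu_si128((const __m128i *)(bottom + x));
        /* Unsigned saturating byte add, i.e. paddusb. */
        _mm_storeu_si128((__m128i *)(dst + x), _mm_adds_epu8(a, b));
    }

    /* Assumption: leftover pixels are handled by the scalar reference. */
    blend_addition_ref(top + x, bottom + x, dst + x, width - x);
}
```

The subtract case would look the same with _mm_subs_epu8 (psubusb) in place of _mm_adds_epu8, since psubw followed by packuswb also clamps negative results to 0.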