> James Almer <jamr...@gmail.com> schrieb am Fr, 10.3.2017: >> On 3/8/2017 1:58 PM, Thomas Mundt wrote: >> Hi, >> >> attached patch adds a complex (-1 2 6 2 -1) vertcal lowpassfilter to >> vf_interlace. This will better retain detail and reduce blurring compared to >> the existing (1 2 1) filter. >> >> Please comment. >> diff --git a/libavfilter/interlace.h b/libavfilter/interlace.h >> index da073ae..7ad457e 100644 >> --- a/libavfilter/interlace.h >> +++ b/libavfilter/interlace.h >> @@ -51,6 +51,8 @@ typedef struct InterlaceContext { >> AVFrame *cur, *next; // the two frames from which the new one is >> obtained >> void (*lowpass_line)(uint8_t *dstp, ptrdiff_t linesize, const uint8_t >> *srcp, >> const uint8_t *srcp_above, const uint8_t >> *srcp_below); >> + void (*lowpass_line_complex)(uint8_t *dstp, ptrdiff_t linesize, >> + const uint8_t *srcp, int mref, int pref); > > Why not keep a single prototype, passing mref and pref for both linear > and complex? You can calculate srcp_above and srcp_below for linear like > you're doing it for complex in both the c and asm versions. >
This would make sense. I just didn´t wanted to change more than necessary in one patch and I´m not such an expert in simd programming. This is my second attempt ever. Also there is the same function in tinterlace filter that also uses this asm. So I would also need to change the function in tinterlace. I could do all this as a subsequent patch. Do you have a suggestion how to deal with the option name of complex filter in tinterlace? There it´s a flag with the names "low_pass_filter" and "vlpf", so I could add "complex_low_pass_filter" and "vclpf"? > In any case, mref and pref should be ptrdiff_t and not int. OK, I´ll change this in patch v2. > >> } InterlaceContext; >> >> void ff_interlace_init_x86(InterlaceContext *interlace); >> diff --git a/libavfilter/vf_interlace.c b/libavfilter/vf_interlace.c >> index efa3128..e8d5de4 100644 >> --- a/libavfilter/vf_interlace.c >> +++ b/libavfilter/vf_interlace.c >> @@ -47,7 +47,7 @@ static const AVOption interlace_options[] = { >> { "bff", "bottom field first", 0, >> AV_OPT_TYPE_CONST, {.i64 = MODE_BFF }, INT_MIN, INT_MAX, .flags = >> FLAGS, .unit = "scan" }, >> { "lowpass", "set vertical low-pass filter", OFFSET(lowpass), >> - AV_OPT_TYPE_BOOL, {.i64 = 1 }, 0, 1, .flags = FLAGS }, >> + AV_OPT_TYPE_INT, {.i64 = 1 }, 0, 2, .flags = FLAGS }, > > Maybe add AV_OPT_TYPE_CONST values "off", "linear" and "complex". OK >> { NULL } >> }; >> >> @@ -67,6 +67,24 @@ static void lowpass_line_c(uint8_t *dstp, ptrdiff_t >> linesize, >> } >> +%endmacro >> + >> +%macro LOWPASS_LINE_COMPLEX 0 >> +cglobal lowpass_line_complex, 3, 5, 7, dst, h, src, mref, pref > > Make this 5, 5, 7 and remove the two movs below. OK, I´ll change this in patch v2. >> + mov r3, mrefmp >> + mov r4, prefmp >> + >> + pxor m6, m6 >> +.loop: >> + mova m0, [srcq+r3] > > [srcq+mrefq] OK >> + mova m2, [srcq+r4] > > [srcq+prefq] OK >> + mova m1, m0 >> + mova m3, m2 >> + punpcklbw m0, m6 >> + punpcklbw m2, m6 >> + punpckhbw m1, m6 >> + punpckhbw m3, m6 >> + paddw m0, m2 >> + paddw m1, m3 >> + mova m2, [srcq+r3*2] > > [srcq+mrefq*2] OK >> + mova m4, [srcq+r4*2] > [srcq+prefq*2] OK >> + add dstq, mmsize >> + add srcq, mmsize >> + sub DWORD hm, mmsize > > sub hd, mmsize OK >> + >> +INIT_XMM sse2 >> +LOWPASS_LINE_COMPLEX >> + >> +INIT_XMM avx >> +LOWPASS_LINE_COMPLEX > > There's no reason to add an AVX version. You're not taking advantage of > the 3 operand instruction format, or any of the new instructions, or the > 32 byte regs. > > If you merge some of the movas with punpck* instructions and can notice > a difference when benching, then AVX would make sense. > OK. Speed improvement is marginal so I´ll drop AVX version in patch v2 >> diff --git a/libavfilter/x86/vf_interlace_init.c >> b/libavfilter/x86/vf_interlace_init.c >> index 52a22f8..947ff9a 100644 >> --- a/libavfilter/x86/vf_interlace_init.c >> +++ b/libavfilter/x86/vf_interlace_init.c >> @@ -35,12 +35,21 @@ void ff_lowpass_line_avx (uint8_t *dstp, ptrdiff_t >> linesize, >> const uint8_t *srcp_above, >> const uint8_t *srcp_below); >> _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel