Re: [FFmpeg-devel] avcodec/utvideoenc : add SIMD (SSSE3) for sub_left_pred
2018-01-14 14:31 GMT+01:00 Martin Vignali: > Hello, > > new patch in attach > no more segfault if i remove movsxdifnidn, after change in the > "declare_func_emms" > part > > I also modify the checkasm test, following your comments > > Martin > > Pushed Martin ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] avcodec/utvideoenc : add SIMD (SSSE3) for sub_left_pred
Hello, new patch in attach no more segfault if i remove movsxdifnidn, after change in the "declare_func_emms" part I also modify the checkasm test, following your comments Martin 0001-avcodec-utvideoenc-add-SIMD-avx-for-sub_left_predict.patch Description: Binary data 0002-checkasm-add-test-for-losslessvideoencdsp-for-diff.patch Description: Binary data ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] avcodec/utvideoenc : add SIMD (SSSE3) for sub_left_pred
On 1/14/2018 7:46 AM, Henrik Gramner wrote: > On Sat, Jan 13, 2018 at 5:22 PM, Martin Vignali >wrote: >> +#define randomize_buffers(buf, size) \ >> +do { \ >> +int j; \ >> +uint8_t *tmp_buf = (uint8_t *)buf;\ >> +for (j = 0; j < size; j++) \ >> +tmp_buf[j] = rnd() & 0xFF; \ >> +} while (0) > > Typecast is redundant since everything is already uint8_t (unless you > plan to add more things that aren't). > > & 0xFF is redundant for bytes. AV_WN32 should be used here to write all four bytes instead of wasting 24 bits from every random value generated. > >> +static void check_diff_bytes(LLVidEncDSPContext c) > [...] >> +static void check_sub_left_pred(LLVidEncDSPContext c) > > Struct arguments should generallly be passed by reference (pointer). > >> +if (!dst0 || !dst1) >> +fail(); > [...] >> +if (!dst0 || !dst1 || !src0) >> +fail(); > > fail() is used to signal that the currently tested function has failed > and shouldn't be used outside of the check_func() scope. In this case > it wouldn't even prevent segfaults by dereferencing the NULL pointers > anyway in case of malloc failure. > > Actually, you could get rid of malloc and just use fixed-size stack > buffers to simplify things. > ___ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel > ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] avcodec/utvideoenc : add SIMD (SSSE3) for sub_left_pred
On Sat, Jan 13, 2018 at 5:22 PM, Martin Vignaliwrote: > +#define randomize_buffers(buf, size) \ > +do { \ > +int j; \ > +uint8_t *tmp_buf = (uint8_t *)buf;\ > +for (j = 0; j < size; j++) \ > +tmp_buf[j] = rnd() & 0xFF; \ > +} while (0) Typecast is redundant since everything is already uint8_t (unless you plan to add more things that aren't). & 0xFF is redundant for bytes. > +static void check_diff_bytes(LLVidEncDSPContext c) [...] > +static void check_sub_left_pred(LLVidEncDSPContext c) Struct arguments should generallly be passed by reference (pointer). > +if (!dst0 || !dst1) > +fail(); [...] > +if (!dst0 || !dst1 || !src0) > +fail(); fail() is used to signal that the currently tested function has failed and shouldn't be used outside of the check_func() scope. In this case it wouldn't even prevent segfaults by dereferencing the NULL pointers anyway in case of malloc failure. Actually, you could get rid of malloc and just use fixed-size stack buffers to simplify things. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] avcodec/utvideoenc : add SIMD (SSSE3) for sub_left_pred
On Sat, Jan 13, 2018 at 5:22 PM, Martin Vignaliwrote: > i try to change int width -> ptrdiff_t width to remove movsxdifnidn > but i have a segfault if height > 1 I'm guessing due to > +declare_func_emms(AV_CPU_FLAG_MMX, void, uint8_t *dst, const uint8_t > *src, > + ptrdiff_t stride, int width, int height); where it's still specified as int. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] avcodec/utvideoenc : add SIMD (SSSE3) for sub_left_pred
Hello, Following Henrik Gramner's comments, new patch in attach i try to change int width -> ptrdiff_t width to remove movsxdifnidn but i have a segfault if height > 1 pass fate test for me. Martin 0001-avcodec-utvideoenc-add-SIMD-avx-for.patch Description: Binary data 0002-checkasm-add-test-for-losslessvideoencdsp-for-diff.patch Description: Binary data ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] avcodec/utvideoenc : add SIMD (SSSE3) for sub_left_pred
On Fri, Jan 12, 2018 at 09:57:03AM +0100, Martin Vignali wrote: > > > > this changes the output: > > make -j12 && ./ffmpeg -i ~/videos/matrixbench_mpeg2.mpg -an -vcodec > > utvideo -t 1 -pix_fmt yuv420p -pred left -t 1 test2.avi > > > > -rw-r- 1 michael michael 3744402 Jan 12 04:20 test2.avi > > -rw-r- 1 michael michael 3753358 Jan 12 04:19 test.avi > > > > > > Hello, > > Thanks for testing. > Is it possible to have the file ? the matrixbench file ? thats here: https://samples.ffmpeg.org/benchmark/testsuite1/matrixbench_mpeg2.mpg [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB When you are offended at any man's fault, turn to yourself and study your own failings. Then you will forget your anger. -- Epictetus signature.asc Description: PGP signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] avcodec/utvideoenc : add SIMD (SSSE3) for sub_left_pred
On Thu, Jan 11, 2018 at 9:45 PM, Martin Vignaliwrote: > +if (check_func(c.sub_left_predict, "sub_left_predict")) { > +call_ref(dst0, src0, stride, width, height); > +call_new(dst1, src0, stride, width, height); > +if (memcmp(dst0, dst1, width)) > +fail(); > +bench_new(dst1, src0, stride, width, height); > +} You're only verifying the results of the first row here. Changing it to test all rows results in test failures. > +int width = av_clip(rnd(), 16, 128); > +int height = av_clip(rnd(), 16, 128); This kind of clipping will result in the values being 128 almost every run. You should also use constant sizes instead of random ones because random ones will make benchmarking inconsistent since you'll measure different things for the C and asm versions. You could do something along the lines of static const struct { uint8_t w, h, s; } planes[] = { {16,16,16}, {21,23,25}, {32,17,48}, {15,128,16}, {128,127,128} }; and just test all of those every run. > +%if ARCH_X86_64 > +INIT_XMM ssse3 > +cglobal sub_left_predict, 4,5,5, dst, src, stride, width, height, x > +mova m0, [pb_15] ; shuffle for last byte > +mova m1, [pb_80] ; prev initial > +.nextrow: > +xor xq, xq > + > +.loop: > +movu m2, [srcq + xq] > +psubb m1, m2 ; - prev > +pslldq m3, m1, 1 > +psubb m3, m1 > +movu[dstq+xq], m3 > +pshufb m1, m2, m0 > +addxq, mmsize > +cmpxd, widthd > +jl .loop > + > +addsrcq, strideq > +adddstq, widthq > +sub heightq, 1 > +jg .nextrow > +REP_RET > +%endif There's no need to restrict this to x86-64 only. The register specification is wrong and will fail on Windows (and 32-bit). Using a constant 15 for pshufb will be be wrong for the first byte of every row except for the first with non-mod16 widths. Try something like this: INIT_XMM avx cglobal sub_left_predict, 5,6,5, dst, src, stride, width, height, x movsxdifnidn widthq, widthd ; Change width from int to ptrdiff_t to get rid of this mova m1, [pb_80] ; prev adddstq, widthq addsrcq, widthq lea xd, [widthq-1] neg widthq and xd, 15 pinsrb m4, m1, xd, 15 mov xq, widthq .loop: movu m0, [srcq+widthq] palignr m2, m0, m1, 15 movu m1, [srcq+widthq+16] palignr m3, m1, m0, 15 psubbm2, m0, m2 psubbm3, m1, m3 movu [dstq+widthq], m2 movu [dstq+widthq+16], m3 add widthq, 2*16 jl .loop addsrcq, strideq subdstq, xq test xd, 16 jz .mod32 mova m1, m0 .mod32: pshufb m1, m4 mov widthq, xq dec heightd jg .loop RET ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] avcodec/utvideoenc : add SIMD (SSSE3) for sub_left_pred
> > this changes the output: > make -j12 && ./ffmpeg -i ~/videos/matrixbench_mpeg2.mpg -an -vcodec > utvideo -t 1 -pix_fmt yuv420p -pred left -t 1 test2.avi > > -rw-r- 1 michael michael 3744402 Jan 12 04:20 test2.avi > -rw-r- 1 michael michael 3753358 Jan 12 04:19 test.avi > > > Hello, Thanks for testing. Is it possible to have the file ? Martin ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] avcodec/utvideoenc : add SIMD (SSSE3) for sub_left_pred
On Thu, Jan 11, 2018 at 09:45:07PM +0100, Martin Vignali wrote: > Hello, > > in attach patch to add SIMD for sub_left_pred in utvideoenc > > 001 : add SIMD for utvideoenc > 002 : add checkasm for llviddspenc (diff bytes and sub_left_pred) > > Encoding result : > ./ffmpeg -i utvideo_file.avi -c:v utvideo -pred left res.avi > > Without > > frame= 3316 fps=194 q=-0.0 Lsize= 3613675kB time=00:02:12.64 > bitrate=223184.8kbits/s speed=7.74x > > With > frame= 3316 fps=206 q=-0.0 Lsize= 3613675kB time=00:02:12.64 > bitrate=223184.8kbits/s speed=8.24x > > > Checkasm result : > sub_left_predict_c: 2497.2 > sub_left_predict_ssse3: 455.7 > > > pass fate test for me (os X, x86_64) this changes the output: make -j12 && ./ffmpeg -i ~/videos/matrixbench_mpeg2.mpg -an -vcodec utvideo -t 1 -pix_fmt yuv420p -pred left -t 1 test2.avi -rw-r- 1 michael michael 3744402 Jan 12 04:20 test2.avi -rw-r- 1 michael michael 3753358 Jan 12 04:19 test.avi [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB When the tyrant has disposed of foreign enemies by conquest or treaty, and there is nothing more to fear from them, then he is always stirring up some war or other, in order that the people may require a leader. -- Plato signature.asc Description: PGP signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel