Re: [FFmpeg-devel] libavcodec/utvideodsp : add avx2 version
Hello,

> > Checkasm result (Kaby Lake, os 10.12)
> > restore_rgb_planes_c: 8371.0
> > restore_rgb_planes_sse2: 6583.7
> > restore_rgb_planes_avx2: 3596.5
> >
> > restore_rgb_planes10_c: 16735.7
> > restore_rgb_planes10_sse2: 11478.5
> > restore_rgb_planes10_avx2: 7193.7
>
> Curious, on my Haswell (mingw-w64 Win10) i get
>
> restore_rgb_planes_c: 79500.7
> restore_rgb_planes_sse2: 6872.7
> restore_rgb_planes_avx2: 6715.7
>
> restore_rgb_planes10_c: 91394.7
> restore_rgb_planes10_sse2: 14494.0
> restore_rgb_planes10_avx2: 13468.7

I checked again and I get the same kind of results as before.
Strange that the speed improvement is so small on Haswell.

> > Pass fate test for me
> >
> > 0001-checkasm-add-utvideodsp-test :
> > I'm not entirely sure of mine, for this checkasm,
> >
> > 0002-libavcodec-x86-utvideodsp-make-macro-for-func
> > Code reorganization
> >
> > 0003-libavcodec-utvideodsp-add-avx2-version-for-the-dsp
> > AVX2 version
> >
> > 0004-libavcodec-x86-utvideodsp.asm-cosmetic
> > Cosmetic
> >
> > Martin
> > Jokyo Images
>
> Sorry i missed this set. The asm changes look simple and good. Only
> thing I'd have done was making sure the constants were wide enough to
> avoid having to use vpbroadcast instructions.
> I noticed for that matter that said constants already exist in
> constants.c, so i just made it use them instead.

Thanks for all the fixes.

Your comment about using vpbroadcast to load constants seems similar to a
previous comment by James Darnley (in the discussion
"libavcodec/bswapdsp : add AVX2 for bswap_buf").
Here I used the same approach as Henrik Gramner in the exr_dsp predictor
function (but I'm OK with modifying that part if needed).

Do you think we need to replace every

    %if cpuflag(avx2)
        vbroadcasti128 mm, [constants]
    %else
        mova mm, [constants]
    %endif

with your method?
(For exr_dsp the answer is probably yes, because it also uses pb_80; I will
send a patch for that.)

If yes, is it better to define the constant in the asm file (for example,
for bswapdsp)

    SECTION_RODATA 32
    pb_bswap32: times 2 db 3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12

or to add a constant (if it does not already exist) to constants.c/h?
This case seems likely to be common for AVX2 versions of dsp functions.

> The checkasm test is a bit ugly and could use some cosmetics, though.

Apart from one thing (the WIDTH_PADDED calculation is strange: I don't
remember why I wrote it that way, and it only works by "luck"; it needs to
be WIDTH + 16), do you think it needs more modification (considering your
recent patches)?

Martin
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
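The "times 2" constant in the message above stores the same 16-byte pattern once per 128-bit lane, so a 32-byte load needs no broadcast. A minimal scalar C sketch of that idea follows. It assumes the constant is consumed by a byte shuffle that works independently on each 128-bit lane (as pshufb does on ymm registers); shuffle_bytes_256 is a hypothetical helper written for illustration, not FFmpeg code.

```c
#include <assert.h>
#include <stdint.h>

/* The 16-byte shuffle pattern from the thread, repeated twice so that it
 * fills a 32-byte (ymm-sized) constant, as "times 2 db ..." would. */
static const uint8_t pb_bswap32[32] = {
    3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12,
    3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12,
};

/* Hypothetical helper: scalar model of a 256-bit byte shuffle that treats
 * each 128-bit lane independently. dst and src must not alias. */
static void shuffle_bytes_256(uint8_t dst[32], const uint8_t src[32],
                              const uint8_t mask[32])
{
    assert(dst != src);
    for (int lane = 0; lane < 2; lane++) {
        for (int i = 0; i < 16; i++) {
            uint8_t m = mask[16 * lane + i];
            /* A set top bit zeroes the output byte; otherwise the low four
             * bits select a source byte from the same lane. */
            dst[16 * lane + i] = (m & 0x80) ? 0 : src[16 * lane + (m & 0x0F)];
        }
    }
}
```

With the pattern duplicated, both lanes reverse the bytes of each 32-bit word; with only 16 bytes in memory, the upper lane would have to be filled by a broadcast first.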
Re: [FFmpeg-devel] libavcodec/utvideodsp : add avx2 version
On 10/22/2017 9:05 AM, Martin Vignali wrote:
> Hello,
>
> In attach patch to add AVX2 version for the utvideodsp
>
> Checkasm result (Kaby Lake, os 10.12)
> restore_rgb_planes_c: 8371.0
> restore_rgb_planes_sse2: 6583.7
> restore_rgb_planes_avx2: 3596.5
>
> restore_rgb_planes10_c: 16735.7
> restore_rgb_planes10_sse2: 11478.5
> restore_rgb_planes10_avx2: 7193.7

Curious, on my Haswell (mingw-w64 Win10) i get

restore_rgb_planes_c: 79500.7
restore_rgb_planes_sse2: 6872.7
restore_rgb_planes_avx2: 6715.7

restore_rgb_planes10_c: 91394.7
restore_rgb_planes10_sse2: 14494.0
restore_rgb_planes10_avx2: 13468.7

> Pass fate test for me
>
> 0001-checkasm-add-utvideodsp-test :
> I'm not entirely sure of mine, for this checkasm,
>
> 0002-libavcodec-x86-utvideodsp-make-macro-for-func
> Code reorganization
>
> 0003-libavcodec-utvideodsp-add-avx2-version-for-the-dsp
> AVX2 version
>
> 0004-libavcodec-x86-utvideodsp.asm-cosmetic
> Cosmetic
>
> Martin
> Jokyo Images

Sorry i missed this set. The asm changes look simple and good. Only
thing I'd have done was making sure the constants were wide enough to
avoid having to use vpbroadcast instructions.
I noticed for that matter that said constants already exist in
constants.c, so i just made it use them instead.

The checkasm test is a bit ugly and could use some cosmetics, though.
Re: [FFmpeg-devel] libavcodec/utvideodsp : add avx2 version
Hello,

> If noone reviews, and you tested&reviewed it then it should be ok to
> apply

OK, I will apply the AVX2 patches for utvideodsp, huffyuv(enc)dsp and hapqa
decoding (and fate).

> especially considering you waited a month (which is longer than
> needed generally)

What is the recommended time to wait for a review before applying
(2 weeks, one ping)?

Thanks

Martin
Re: [FFmpeg-devel] libavcodec/utvideodsp : add avx2 version
On Mon, Nov 20, 2017 at 10:07:54PM +0100, Martin Vignali wrote:
> 2017-11-04 19:33 GMT+01:00 Martin Vignali:
>
> > 2017-10-25 21:53 GMT+02:00 Martin Vignali:
> >
> >> 2017-10-22 14:05 GMT+02:00 Martin Vignali:
> >>
> >>> Hello,
> >>>
> >>> In attach patch to add AVX2 version for the utvideodsp
> >>>
> >>> Checkasm result (Kaby Lake, os 10.12)
> >>> restore_rgb_planes_c: 8371.0
> >>> restore_rgb_planes_sse2: 6583.7
> >>> restore_rgb_planes_avx2: 3596.5
> >>>
> >>> restore_rgb_planes10_c: 16735.7
> >>> restore_rgb_planes10_sse2: 11478.5
> >>> restore_rgb_planes10_avx2: 7193.7
> >>>
> >>> Pass fate test for me
> >>>
> >>> 0001-checkasm-add-utvideodsp-test :
> >>> I'm not entirely sure of mine, for this checkasm,
> >>>
> >>> 0002-libavcodec-x86-utvideodsp-make-macro-for-func
> >>> Code reorganization
> >>>
> >>> 0003-libavcodec-utvideodsp-add-avx2-version-for-the-dsp
> >>> AVX2 version
> >>>
> >>> 0004-libavcodec-x86-utvideodsp.asm-cosmetic
> >>> Cosmetic
> >>>
> >> ping
> >
> > ping

If noone reviews, and you tested&reviewed it then it should be ok to apply
especially considering you waited a month (which is longer than needed
generally)

Thanks

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Dictatorship: All citizens are under surveillance, all their steps and
actions recorded, for the politicians to enforce control.
Democracy: All politicians are under surveillance, all their steps and
actions recorded, for the citizens to enforce control.
Re: [FFmpeg-devel] libavcodec/utvideodsp : add avx2 version
2017-11-04 19:33 GMT+01:00 Martin Vignali:

> 2017-10-25 21:53 GMT+02:00 Martin Vignali:
>
>> 2017-10-22 14:05 GMT+02:00 Martin Vignali:
>>
>>> Hello,
>>>
>>> In attach patch to add AVX2 version for the utvideodsp
>>>
>>> Checkasm result (Kaby Lake, os 10.12)
>>> restore_rgb_planes_c: 8371.0
>>> restore_rgb_planes_sse2: 6583.7
>>> restore_rgb_planes_avx2: 3596.5
>>>
>>> restore_rgb_planes10_c: 16735.7
>>> restore_rgb_planes10_sse2: 11478.5
>>> restore_rgb_planes10_avx2: 7193.7
>>>
>>> Pass fate test for me
>>>
>>> 0001-checkasm-add-utvideodsp-test :
>>> I'm not entirely sure of mine, for this checkasm,
>>>
>>> 0002-libavcodec-x86-utvideodsp-make-macro-for-func
>>> Code reorganization
>>>
>>> 0003-libavcodec-utvideodsp-add-avx2-version-for-the-dsp
>>> AVX2 version
>>>
>>> 0004-libavcodec-x86-utvideodsp.asm-cosmetic
>>> Cosmetic
>>>
>> ping
>
> ping
Re: [FFmpeg-devel] libavcodec/utvideodsp : add avx2 version
2017-10-25 21:53 GMT+02:00 Martin Vignali:

> 2017-10-22 14:05 GMT+02:00 Martin Vignali:
>
>> Hello,
>>
>> In attach patch to add AVX2 version for the utvideodsp
>>
>> Checkasm result (Kaby Lake, os 10.12)
>> restore_rgb_planes_c: 8371.0
>> restore_rgb_planes_sse2: 6583.7
>> restore_rgb_planes_avx2: 3596.5
>>
>> restore_rgb_planes10_c: 16735.7
>> restore_rgb_planes10_sse2: 11478.5
>> restore_rgb_planes10_avx2: 7193.7
>>
>> Pass fate test for me
>>
>> 0001-checkasm-add-utvideodsp-test :
>> I'm not entirely sure of mine, for this checkasm,
>>
>> 0002-libavcodec-x86-utvideodsp-make-macro-for-func
>> Code reorganization
>>
>> 0003-libavcodec-utvideodsp-add-avx2-version-for-the-dsp
>> AVX2 version
>>
>> 0004-libavcodec-x86-utvideodsp.asm-cosmetic
>> Cosmetic
>>
> ping
Re: [FFmpeg-devel] libavcodec/utvideodsp : add avx2 version
2017-10-22 14:05 GMT+02:00 Martin Vignali:

> Hello,
>
> In attach patch to add AVX2 version for the utvideodsp
>
> Checkasm result (Kaby Lake, os 10.12)
> restore_rgb_planes_c: 8371.0
> restore_rgb_planes_sse2: 6583.7
> restore_rgb_planes_avx2: 3596.5
>
> restore_rgb_planes10_c: 16735.7
> restore_rgb_planes10_sse2: 11478.5
> restore_rgb_planes10_avx2: 7193.7
>
> Pass fate test for me
>
> 0001-checkasm-add-utvideodsp-test :
> I'm not entirely sure of mine, for this checkasm,
>
> 0002-libavcodec-x86-utvideodsp-make-macro-for-func
> Code reorganization
>
> 0003-libavcodec-utvideodsp-add-avx2-version-for-the-dsp
> AVX2 version
>
> 0004-libavcodec-x86-utvideodsp.asm-cosmetic
> Cosmetic
>
> Martin
> Jokyo Images

Following the comments of Paul B Mahol in another thread:

Here too, data alignment is provided by AVFrame->data / AVFrame->linesize.

These functions are called with this code:

    c->utdsp.restore_rgb_planes(frame.f->data[2], frame.f->data[0],
                                frame.f->data[1], frame.f->linesize[2],
                                frame.f->linesize[0], frame.f->linesize[1],
                                avctx->width, avctx->height);

and

    c->utdsp.restore_rgb_planes10((uint16_t *)frame.f->data[2],
                                  (uint16_t *)frame.f->data[0],
                                  (uint16_t *)frame.f->data[1],
                                  frame.f->linesize[2] / 2,
                                  frame.f->linesize[0] / 2,
                                  frame.f->linesize[1] / 2,
                                  avctx->width, avctx->height);

Martin
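As context for the 8-bit call above, here is a scalar sketch of what such a function might compute. It assumes the operation adds the green plane to the red and blue planes and subtracts 0x80 with 8-bit wraparound; that is an assumption suggested by the pb_80 constant mentioned elsewhere in this thread, not something stated in this message. The name restore_rgb_planes_ref is hypothetical, and the argument order (R, G, B) simply mirrors the data[2], data[0], data[1] order of the call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference: R += G - 0x80 and B += G - 0x80, modulo 256.
 * Each plane has its own line size, so rows are advanced by linesize rather
 * than by width; padding bytes past width are left untouched. */
static void restore_rgb_planes_ref(uint8_t *src_r, uint8_t *src_g, uint8_t *src_b,
                                   ptrdiff_t linesize_r, ptrdiff_t linesize_g,
                                   ptrdiff_t linesize_b, int width, int height)
{
    assert(width >= 0 && height >= 0);
    for (int j = 0; j < height; j++) {
        for (int i = 0; i < width; i++) {
            uint8_t g = src_g[i];
            /* The casts make the intended 8-bit wraparound explicit. */
            src_r[i] = (uint8_t)(src_r[i] + g - 0x80);
            src_b[i] = (uint8_t)(src_b[i] + g - 0x80);
        }
        src_r += linesize_r;
        src_g += linesize_g;
        src_b += linesize_b;
    }
}
```

A SIMD version would process 16 (SSE2) or 32 (AVX2) such bytes per iteration, which is why the row pointers derived from linesize need to be suitably aligned and padded.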
[FFmpeg-devel] libavcodec/utvideodsp : add avx2 version
Hello,

Attached are patches adding an AVX2 version for utvideodsp.

Checkasm result (Kaby Lake, os 10.12)
restore_rgb_planes_c: 8371.0
restore_rgb_planes_sse2: 6583.7
restore_rgb_planes_avx2: 3596.5

restore_rgb_planes10_c: 16735.7
restore_rgb_planes10_sse2: 11478.5
restore_rgb_planes10_avx2: 7193.7

The fate tests pass for me.

0001-checkasm-add-utvideodsp-test :
I'm not entirely sure about my implementation of this checkasm test.

0002-libavcodec-x86-utvideodsp-make-macro-for-func
Code reorganization

0003-libavcodec-utvideodsp-add-avx2-version-for-the-dsp
AVX2 version

0004-libavcodec-x86-utvideodsp.asm-cosmetic
Cosmetic

Martin
Jokyo Images