On 10/25/17, Martin Vignali <martin.vign...@gmail.com> wrote:
> 2017-10-25 22:08 GMT+02:00 Paul B Mahol <one...@gmail.com>:
>
>> On 10/25/17, Martin Vignali <martin.vign...@gmail.com> wrote:
>> > 2017-10-25 21:53 GMT+02:00 Paul B Mahol <one...@gmail.com>:
>> >
>> >> On 10/25/17, Martin Vignali <martin.vign...@gmail.com> wrote:
>> >> > 2017-10-25 9:43 GMT+02:00 Paul B Mahol <one...@gmail.com>:
>> >> >
>> >> >> On 10/21/17, Martin Vignali <martin.vign...@gmail.com> wrote:
>> >> >> > Hello,
>> >> >> >
>> >> >> > In attach patch to add AVX2 version for add_bytes
>> >> >> >
>> >> >> > 0001-libavcodec-lossless_videodsp-add-add_bytes-avx2-vers :
>> >> >> > add AVX2 version
>> >> >> >
>> >> >> > pass fate-test for me (os 10.12, x86_64)
>> >> >> >
>> >> >> > checkasm result : (Kaby Lake) (run 10 times, and i took the fastest version)
>> >> >> > checkasm: all 2 tests passed
>> >> >> > add_bytes_c: 108.7
>> >> >> > add_bytes_sse2: 26.5
>> >> >> > add_bytes_avx2: 15.5
>> >> >> >
>> >> >> >
>> >> >> > 0002-libavcodec-lossless_video_dsp-cosmetic-add-better-se:
>> >> >> > only cosmetic
>> >> >> > like the ref c function declaration in asm file is not consistent between each asm file
>> >> >> > i think a better separator for each function make the file easier to read
>> >> >> >
>> >> >> > also add the c declaration for add bytes in comment
>> >> >> >
>> >> >> >
>> >> >> > Martin
>> >> >> >
>> >> >>
>> >> >> Are you sure 32bit alignment is actually enforced?
>> >> >>
>> >> >
>> >> > Hello,
>> >> >
>> >> > I think, data used by add_bytes is always aligned
>> >> > because dst and src, are start of a line of an AvFrame
>> >>
>> >> Yes, but try width thats not multiple of 32.
>> >>
>> >
>> > Sorry, not sure i understand.
>> > following the doc, AVFrame->linesize, is multiple of max alignment
>> >
>> > and in the asm, loop will be repeat until, val < width
>> >
>> > Can you indicate me, the part, where you think, it's not ok ?
>>
>> I dunno. You should test it with widths not divisible by 32.
>>
>
> Tested with the fate sample : vsynth3-huffyuvbgra.avi (34x34)
> ./ffmpeg -i ./tests/data/fate/vsynth3-huffyuvbgra.avi -f framecrc -
>
> generate same crc than
> ./ffmpeg -i ./tests/data/fate/vsynth3-huffyuvbgra.avi -f framecrc - -cpuflags 0
>
>
>> also try encoding cropped video.
>>
>
> Are you sure, encoding cropped video, have a link to the decoding dsp func ?
>
> these patch only take care about the decoding func
>
>
> And the encoding func of huffyuvenc (in huffyuv add add/diff_bytes16 AVX2 discussion)
> and losslessencdsp (not made for now), have a test for alignment of dst and src
>
>
> Martin

ok then
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
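
The concern raised in this thread (a row width that is not a multiple of 32 bytes) can be illustrated with a small C sketch. This is not the patch's assembly and not FFmpeg's code: `add_bytes_ref`, `add_bytes_blocked`, `widths_match` and the `BLOCK` constant are hypothetical names chosen for illustration, and the 32-byte block only models the width of an AVX2 register. The sketch assumes the simplest safe strategy, whole blocks followed by a scalar tail, and checks that it matches a byte-by-byte reference without touching memory past `w`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK 32 /* bytes per AVX2 register; models the vector width only */

/* Byte-by-byte reference: dst[i] += src[i], wrapping modulo 256. */
static void add_bytes_ref(uint8_t *dst, const uint8_t *src, ptrdiff_t w)
{
    for (ptrdiff_t i = 0; i < w; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Hypothetical blocked variant: whole 32-byte blocks first, then a scalar
 * tail for the remaining w % 32 bytes. */
static void add_bytes_blocked(uint8_t *dst, const uint8_t *src, ptrdiff_t w)
{
    ptrdiff_t i = 0;

    for (; i + BLOCK <= w; i += BLOCK)
        for (int j = 0; j < BLOCK; j++)
            dst[i + j] = (uint8_t)(dst[i + j] + src[i + j]);

    for (; i < w; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Returns 1 if both variants agree for width w and the bytes past w
 * (including a guard area) are left unmodified, 0 otherwise. */
static int widths_match(ptrdiff_t w)
{
    enum { MAX_W = 256, GUARD = 32 };
    uint8_t src[MAX_W], a[MAX_W + GUARD], b[MAX_W + GUARD];

    if (w < 0 || w > MAX_W)
        return 0;

    for (int i = 0; i < MAX_W; i++)
        src[i] = (uint8_t)(i * 7 + 3);
    for (int i = 0; i < MAX_W + GUARD; i++)
        a[i] = b[i] = (uint8_t)(255 - i);

    add_bytes_ref(a, src, w);
    add_bytes_blocked(b, src, w);

    if (memcmp(a, b, sizeof a) != 0)
        return 0;
    for (int i = (int)w; i < MAX_W + GUARD; i++)
        if (b[i] != (uint8_t)(255 - i))
            return 0;
    return 1;
}
```

Calling `widths_match(34)` or `widths_match(136)` exercises widths that leave a tail after the last full block, which is the same kind of case the 34x34 sample above is meant to cover. A real SIMD routine may instead rely on row padding and process past `w`; this sketch does not model that.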