2017-10-25 21:53 GMT+02:00 Paul B Mahol <one...@gmail.com>:

> On 10/25/17, Martin Vignali <martin.vign...@gmail.com> wrote:
> > 2017-10-25 9:43 GMT+02:00 Paul B Mahol <one...@gmail.com>:
> >
> >> On 10/21/17, Martin Vignali <martin.vign...@gmail.com> wrote:
> >> > Hello,
> >> >
> >> > In attach patch to add AVX2 version for add_bytes
> >> >
> >> > 0001-libavcodec-lossless_videodsp-add-add_bytes-avx2-vers :
> >> > add AVX2 version
> >> >
> >> > pass fate-test for me (os 10.12, x86_64)
> >> >
> >> > checkasm result : (Kaby Lake) (run 10 times, and i took the fastest
> >> > version)
> >> > checkasm: all 2 tests passed
> >> > add_bytes_c: 108.7
> >> > add_bytes_sse2: 26.5
> >> > add_bytes_avx2: 15.5
> >> >
> >> >
> >> > 0002-libavcodec-lossless_video_dsp-cosmetic-add-better-se:
> >> > only cosmetic
> >> > like the ref c function declaration in asm file is not consistent
> >> > between
> >> > each asm file
> >> > i think a better separator for each function make the file easier to
> >> > read
> >> >
> >> > also add the c declaration for add bytes in comment
> >> >
> >> >
> >> > Martin
> >> >
> >>
> >> Are you sure 32bit alignment is actually enforced?
> >>
> >>
> > Hello,
> >
> > I think, data used by add_bytes is always aligned
> > because dst and src, are start of a line of an AvFrame
>
> Yes, but try width thats not multiple of 32.

Sorry, I am not sure I understand.

According to the documentation, AVFrame->linesize is a multiple of the
maximum alignment, and in the asm the loop repeats while val < width.

Can you point me to the part that you think is not OK?

Martin
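
To make the width question concrete, here is a minimal C sketch. It is not the code from the patch. It models an assumed loop that handles whole 32-byte blocks while i < w and has no scalar tail. BLOCK, align_up, add_bytes_ref and add_bytes_blocks are names made up for this illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK 32 /* bytes per AVX2 register; hypothetical name */

/* Round x up to the next multiple of a (a must be a power of two). */
static size_t align_up(size_t x, size_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (x + a - 1) & ~(a - 1);
}

/* Scalar reference: dst[i] += src[i] for every i in [0, w). */
static void add_bytes_ref(uint8_t *dst, const uint8_t *src, ptrdiff_t w)
{
    for (ptrdiff_t i = 0; i < w; i++)
        dst[i] += src[i];
}

/*
 * Assumed model of a SIMD loop without a scalar tail: one whole
 * 32-byte block per iteration while i < w. Returns the number of
 * bytes actually touched, which is w rounded up to BLOCK.
 */
static ptrdiff_t add_bytes_blocks(uint8_t *dst, const uint8_t *src, ptrdiff_t w)
{
    ptrdiff_t i = 0;

    while (i < w) {
        for (int j = 0; j < BLOCK; j++) /* stands in for one vector add */
            dst[i + j] += src[i + j];
        i += BLOCK;
    }
    return i;
}
```

With w = 40, add_bytes_blocks touches align_up(40, 32) = 64 bytes. The first 40 match add_bytes_ref; bytes 40..63 are modified as well. Under this model that is harmless only when dst and src point to the start of a row whose linesize is padded to a multiple of 32, so the extra bytes are padding of the same row.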