Re: [FFmpeg-devel] avcodec/huffyuvenc : try to call dsp with aligned data, and remove code duplication

2017-12-09 Thread Martin Vignali
2017-12-02 18:59 GMT+01:00 Martin Vignali:
>> requiring FFMIN() to be evaluated per iteration could be slower
>> if the compiler fails to factor it out
>
> New patches attached:
> 001: unchanged
> 002: add "int min_width = FFMIN(w, 32)" at the start of the func
> 003: add "int m

Re: [FFmpeg-devel] avcodec/huffyuvenc : try to call dsp with aligned data, and remove code duplication

2017-12-02 Thread Martin Vignali
> requiring FFMIN() to be evaluated per iteration could be slower
> if the compiler fails to factor it out

New patches attached:
001: unchanged
002: add "int min_width = FFMIN(w, 32)" at the start of the func
003: add "int min_width = FFMIN(w, 8)" at the start of the func
Pass fate
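The review point above (compute the loop bound once instead of in the loop condition) can be sketched roughly as follows. This is an illustration only, not the code from the attached patches: the function name `sub_left_head` is hypothetical, and `FFMIN` is redefined locally as a stand-in for the libavutil macro so that the snippet compiles on its own.

```c
#include <stdint.h>

/* Local stand-in for the FFMIN macro from libavutil (assumption: the
 * real code gets it from the FFmpeg headers). */
#define FFMIN(a, b) ((a) > (b) ? (b) : (a))

/* Hypothetical helper: left-predict at most the first 32 bytes of a row
 * with a scalar loop and return the last source value that was read. */
static int sub_left_head(uint8_t *dst, const uint8_t *src, int w, int left)
{
    /* Evaluated once at the start of the function, so the loop condition
     * is a plain comparison even if the compiler does not hoist FFMIN(). */
    const int min_width = FFMIN(w, 32);
    int i;

    for (i = 0; i < min_width; i++) {
        const int temp = src[i];
        dst[i] = temp - left;   /* residual against the left neighbour */
        left   = temp;
    }
    return left;
}
```

Writing the condition as `i < FFMIN(w, 32)` would give the same result; the local variable only removes the dependence on the optimizer.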

Re: [FFmpeg-devel] avcodec/huffyuvenc : try to call dsp with aligned data, and remove code duplication

2017-12-01 Thread Michael Niedermayer
On Sun, Nov 26, 2017 at 07:07:41PM +0100, Martin Vignali wrote:
> Hello,
>
> attached are patches
>
> 0001-avcodec-huffyuvenc-increase-scalar-loop-count
> and
> 0003-avcodec-huffyuvenc-sub_left_prediction_bgr32-call-ds
>
> Since diff_bytes and diff_bytes16 have AVX2 versions, increase the scalar
> lo

Re: [FFmpeg-devel] avcodec/huffyuvenc : try to call dsp with aligned data, and remove code duplication

2017-12-01 Thread Martin Vignali
2017-11-26 19:07 GMT+01:00 Martin Vignali:
> Hello,
>
> attached are patches
>
> 0001-avcodec-huffyuvenc-increase-scalar-loop-count
> and
> 0003-avcodec-huffyuvenc-sub_left_prediction_bgr32-call-ds
>
> Since diff_bytes and diff_bytes16 have AVX2 versions, increase the scalar
> loop
> to call the align

[FFmpeg-devel] avcodec/huffyuvenc : try to call dsp with aligned data, and remove code duplication

2017-11-26 Thread Martin Vignali
Hello,

attached are patches

0001-avcodec-huffyuvenc-increase-scalar-loop-count
and
0003-avcodec-huffyuvenc-sub_left_prediction_bgr32-call-ds

Since diff_bytes and diff_bytes16 have AVX2 versions, increase the scalar loop so that the aligned version is called in most cases.

0002-avcodec-huffyuvenc-remove-code-d
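The idea described here (handle a short prefix of the row with a scalar loop, then hand the remainder to a diff_bytes-style routine at a 32-byte offset) might look roughly like the sketch below. It is not the patch itself: `HEAD`, `diff_bytes_c` and `sub_left_sketch` are hypothetical names, the prefix length of 32 is taken from the `FFMIN(w, 32)` mentioned later in the thread, and the plain C loop stands in for the SIMD routine. Whether the offset pointer is actually aligned depends on how the caller allocates its buffers, which the sketch does not check.

```c
#include <stdint.h>

/* Assumed scalar prefix length (from the FFMIN(w, 32) in the thread). */
#define HEAD 32

/* Plain C stand-in for a diff_bytes-style routine:
 * dst[i] = src1[i] - src2[i] for i in [0, w). */
static void diff_bytes_c(uint8_t *dst, const uint8_t *src1,
                         const uint8_t *src2, intptr_t w)
{
    intptr_t i;
    for (i = 0; i < w; i++)
        dst[i] = src1[i] - src2[i];
}

/* Hypothetical left prediction over one row of w bytes. */
static int sub_left_sketch(uint8_t *dst, const uint8_t *src, int w, int left)
{
    const int head = w < HEAD ? w : HEAD;
    int i;

    /* Scalar prefix: these bytes depend on the incoming left value. */
    for (i = 0; i < head; i++) {
        const int temp = src[i];
        dst[i] = temp - left;
        left   = temp;
    }
    if (w <= HEAD)
        return left;

    /* Remainder: each byte minus its left neighbour, starting at offset
     * HEAD so the bulk call sees pointers that are a multiple of 32 bytes
     * from the start of the row. */
    diff_bytes_c(dst + HEAD, src + HEAD, src + HEAD - 1, w - HEAD);
    return src[w - 1];
}
```

With a longer scalar prefix, more rows reach the bulk routine at an offset where an aligned SIMD version can be used, at the cost of a few more scalar iterations per row.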