Hi Martin,

On Sat, Oct 7, 2017 at 11:49 AM, Martin Vignali <martin.vign...@gmail.com> wrote:
> 2017-10-07 17:30 GMT+02:00 Ronald S. Bultje <rsbul...@gmail.com>:
>
> > On Sat, Oct 7, 2017 at 10:22 AM, Martin Vignali <martin.vign...@gmail.com>
> > wrote:
> > > Patch in attach add a new dsp
> > > for manipulation of qmat
> > >
> > > for now, i move this code inside
> > >
> > > for (i = 0; i < 64; i++) {
> > >     qmat_luma_scaled  [i] = ctx->qmat_luma  [i] * qscale;
> > >     qmat_chroma_scaled[i] = ctx->qmat_chroma[i] * qscale;
> > > }
> > >
> > > i add a special case for qscale == 1
> > > and SSE2, AVX2 optimization
> >
> > This loop only executes once per slice. We typically do not SIMD-optimize
> > at that level, because it won't give significant speed gains...
>
> Ok didn't know that.
> I mostly follow, what there are already done, like in blockdsp.clear_block
>

Right, so consider that blockdsp is done per block (16x16 pixels), not per
slice.

You could remove this entirely from the slice processing code by simply
pre-calculating the values in the init function, once for the whole stream.
There are only 224 qscale values, so it's 224*64*2 multiplications, which is
(in the context of prores) virtually negligible.

Ronald
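
To make the pre-calculation idea concrete, here is a rough sketch of what it could look like. This is only an illustration, not a patch: QmatTables, qmat_tables_init(), qmat_luma_scaled() and qmat_chroma_scaled() are hypothetical names that do not exist in libavcodec. It assumes the two quantisation matrices are already known when the init function runs, and that qscale is used directly as the multiplier in the range 1..224. If the decoder maps the slice header value to a different multiplier, that mapping would have to be applied when filling the table. The entries are int32_t so the products cannot overflow in this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define QSCALE_COUNT 224 /* number of qscale values mentioned above */

/* Hypothetical container for the pre-scaled matrices (not an FFmpeg struct). */
typedef struct QmatTables {
    int32_t luma  [QSCALE_COUNT][64];
    int32_t chroma[QSCALE_COUNT][64];
} QmatTables;

/* Fill every scaled matrix once: 224 * 64 * 2 multiplications in total. */
static void qmat_tables_init(QmatTables *t,
                             const uint8_t qmat_luma[64],
                             const uint8_t qmat_chroma[64])
{
    for (int q = 1; q <= QSCALE_COUNT; q++) {
        for (int i = 0; i < 64; i++) {
            t->luma  [q - 1][i] = qmat_luma  [i] * q;
            t->chroma[q - 1][i] = qmat_chroma[i] * q;
        }
    }
}

/* Per-slice lookup: returns the 64 luma coefficients scaled by qscale. */
static const int32_t *qmat_luma_scaled(const QmatTables *t, int qscale)
{
    assert(qscale >= 1 && qscale <= QSCALE_COUNT);
    return t->luma[qscale - 1];
}

/* Per-slice lookup: returns the 64 chroma coefficients scaled by qscale. */
static const int32_t *qmat_chroma_scaled(const QmatTables *t, int qscale)
{
    assert(qscale >= 1 && qscale <= QSCALE_COUNT);
    return t->chroma[qscale - 1];
}
```

With something like this, the slice code would only fetch two pointers per slice instead of running the multiplication loop, so there would be nothing left there to SIMD-optimize.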