Re: [FFmpeg-devel] libavcodec/proresdec : add qmat dsp with SSE2, AVX2 simd

Ronald S. Bultje Sat, 07 Oct 2017 09:17:20 -0700

Hi Martin,

On Sat, Oct 7, 2017 at 11:49 AM, Martin Vignali <[email protected]>
wrote:


> 2017-10-07 17:30 GMT+02:00 Ronald S. Bultje <[email protected]>:
> > On Sat, Oct 7, 2017 at 10:22 AM, Martin Vignali <[email protected]>
> > wrote:
> > > The attached patch adds a new DSP
> > > for manipulating qmat.
> > >
> > > For now, I move this code into it:
> > >
> > > for (i = 0; i < 64; i++) {
> > >         qmat_luma_scaled  [i] = ctx->qmat_luma  [i] * qscale;
> > >         qmat_chroma_scaled[i] = ctx->qmat_chroma[i] * qscale;
> > > }
> > >
> > > I add a special case for qscale == 1
> > > and SSE2 and AVX2 optimizations.
> >
> > This loop only executes once per slice. We typically do not SIMD-optimize
> > at that level, because it won't give significant speed gains...
>
> OK, I didn't know that.
> I mostly followed what is already done, e.g. in blockdsp.clear_block.
>

Right, so consider that blockdsp is done per block (16x16 pixels), not per
slice.

You could remove this entirely from the slice processing code by simply
pre-calculating the values once for the whole stream in the init function.
There are only 224 qscale values, so that is 224*64*2 (28,672)
multiplications, which is (in the context of prores) virtually negligible.
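
As a rough sketch of that idea (not the actual proresdec code): the
QmatTables struct and the function names below are hypothetical, and the
sketch assumes the multiplier is simply the qscale value in the range
1..224. The tables are filled once, and the per-slice code only picks a row.

```c
#include <assert.h>
#include <stdint.h>

#define QSCALE_COUNT 224  /* number of qscale values mentioned above */
#define QMAT_SIZE     64  /* coefficients per quantization matrix */

/* Hypothetical container; names are illustrative, not FFmpeg's. */
typedef struct QmatTables {
    uint16_t luma  [QSCALE_COUNT][QMAT_SIZE];
    uint16_t chroma[QSCALE_COUNT][QMAT_SIZE];
} QmatTables;

/* Run once at init time: 224 * 64 * 2 multiplications in total. */
static void qmat_tables_init(QmatTables *t,
                             const uint8_t qmat_luma[QMAT_SIZE],
                             const uint8_t qmat_chroma[QMAT_SIZE])
{
    for (int q = 1; q <= QSCALE_COUNT; q++) {
        for (int i = 0; i < QMAT_SIZE; i++) {
            /* Assumption: the multiplier equals the qscale value itself.
             * 255 * 224 = 57120, which still fits in uint16_t. */
            t->luma  [q - 1][i] = (uint16_t)(qmat_luma  [i] * q);
            t->chroma[q - 1][i] = (uint16_t)(qmat_chroma[i] * q);
        }
    }
}

/* Per-slice code looks up a precomputed row instead of multiplying. */
static const uint16_t *qmat_luma_scaled(const QmatTables *t, int qscale)
{
    assert(qscale >= 1 && qscale <= QSCALE_COUNT);
    return t->luma[qscale - 1];
}

static const uint16_t *qmat_chroma_scaled(const QmatTables *t, int qscale)
{
    assert(qscale >= 1 && qscale <= QSCALE_COUNT);
    return t->chroma[qscale - 1];
}
```

With something like this, the slice decoder would call
qmat_luma_scaled(&tables, qscale) and qmat_chroma_scaled(&tables, qscale)
and the 64-iteration loop disappears from the per-slice path, at the cost
of roughly 56 KB of table memory in this sketch.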

Ronald
_______________________________________________
ffmpeg-devel mailing list
[email protected]
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
