Hi,

On Thu, Jul 7, 2016 at 9:52 AM, Alexandra Hájková <
alexandra.khirn...@gmail.com> wrote:

> On Thu, Jul 7, 2016 at 1:53 PM, Ronald S. Bultje <rsbul...@gmail.com>
> wrote:
> > On Thu, Jul 7, 2016 at 5:25 AM, Alexandra Hájková <
> > alexandra.khirn...@gmail.com> wrote:
> > > +    s->hevcdsp.add_residual[log2_trafo_size - 2](dst, coeffs, stride);
> >
> > Won't this be slower since there's a memory store intermediate?
> >
> > (I know it's faster now because you don't have inverse transform simd, but
> > you should fix that by writing inverse transform simd, not by splitting the
> > transform and the add.)
>
> Separating the residual add from the transform does seem to cause some
> slowdown, but it is needed to separate the DC case from the IDCT, which is
> faster overall; I consider that a good reason to do this.


I'm not sure I understand why, could you elaborate on this?
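
To make the two shapes being compared concrete, here is a rough sketch in C.
It is not the libav code: every name below (toy_inverse_transform,
toy_add_residual, toy_transform_add, the 4x4 block size) is made up for
illustration, and the "transform" is a trivial stand-in that only halves each
coefficient. The point is just the extra store and reload of the residual in
the split variant.

```c
#include <stddef.h>
#include <stdint.h>

#define N 4 /* illustrative block size, not tied to any real HEVC size */

static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Hypothetical stand-in for an inverse transform. This is NOT the HEVC
 * IDCT; it halves each coefficient in place so there is something to
 * compute. The result is stored back to memory. */
static void toy_inverse_transform(int16_t *coeffs)
{
    for (int i = 0; i < N * N; i++)
        coeffs[i] = (int16_t)(coeffs[i] / 2);
}

/* Split variant, second pass: reload the stored residual and add it to
 * the prediction in dst, clipping to 8 bits. */
static void toy_add_residual(uint8_t *dst, const int16_t *res,
                             ptrdiff_t stride)
{
    for (int y = 0; y < N; y++) {
        for (int x = 0; x < N; x++)
            dst[x] = clip_u8(dst[x] + res[y * N + x]);
        dst += stride;
    }
}

/* Fused variant: transform and add in one pass, so the residual never
 * goes through an intermediate buffer. */
static void toy_transform_add(uint8_t *dst, const int16_t *coeffs,
                              ptrdiff_t stride)
{
    for (int y = 0; y < N; y++) {
        for (int x = 0; x < N; x++)
            dst[x] = clip_u8(dst[x] + coeffs[y * N + x] / 2);
        dst += stride;
    }
}
```

Both variants produce the same pixels; the split one calls
toy_inverse_transform() and then toy_add_residual(), the fused one calls only
toy_transform_add(). The split form would let a separate DC-only path (not
shown) reuse the same add step, which seems to be the motivation given above.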

> Sure, simd IDCT is needed and I'm working on it.


Great!

Ronald
_______________________________________________
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel
