On Thu, Mar 24, 2016 at 09:35:01AM -0400, Ronald S. Bultje wrote:
> Hi,
>
> On Mar 24, 2016 8:28 AM, "Clément Bœsch" <u...@pkh.me> wrote:
> >
> > From: Clément Bœsch <clem...@stupeflix.com>
> >
> > ./ffmpeg -nostats -f lavfi -i testsrc2=4k:d=2 -vf bench=start,scale=1024x1024,bench=stop -f null -
> >
> > before: t:0.489726 avg:0.489883 max:0.491852 min:0.489482
> > after:  t:0.256515 avg:0.256458 max:0.256999 min:0.253755
> > ---
> > Changes:
> > - FIX: not using the v8-v15 registers
> > - writing directly from the SIMD register (thx Martin)
> > - misc reordering
> >
> > I'm looking at the vscale part now.
> > ---
> >  libswscale/aarch64/Makefile   |  6 +++--
> >  libswscale/aarch64/hscale.S   | 59 +++++++++++++++++++++++++++++++++++++++++++
> >  libswscale/aarch64/swscale.c  | 37 +++++++++++++++++++++++++++
> >  libswscale/swscale.c          |  2 ++
> >  libswscale/swscale_internal.h |  1 +
> >  libswscale/utils.c            |  4 ++-
> >  6 files changed, 106 insertions(+), 3 deletions(-)
> >  create mode 100644 libswscale/aarch64/hscale.S
> >  create mode 100644 libswscale/aarch64/swscale.c
>
> Do you intend to create special versions for specific filter widths (e.g.
> x86 has special versions for filter_width=4 and 8). That helped speed up
> the default filters (bicubic) a little more.
>
> This version looks OK already for the default case.
>
I don't need those cases right now (my use case is filter sizes of 11 and
26), so I have no plans for them so far. I'm actually looking at
yuv2planeX_8 next, to get more impact on that specific use case.

-- 
Clément B.
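For readers following along, here is a rough scalar sketch of what a generic horizontal scaler computes, next to a version specialized for a 4-tap filter, which is the kind of special case asked about above. This is not the code from the patch: the names hscale_generic and hscale_size4 are hypothetical, and the 14-bit coefficient scale, the >> 7 shift and the clamp to 15 bits are assumptions modelled on the 8-bit-to-15-bit C path in libswscale.

```c
#include <assert.h>
#include <stdint.h>

/* Generic path: works for any filter size (e.g. 11 or 26).
 * Assumption: coefficients are 14-bit fixed point, so >> 7 gives 15-bit output. */
static void hscale_generic(int16_t *dst, int dst_w, const uint8_t *src,
                           const int16_t *filter, const int32_t *filter_pos,
                           int filter_size)
{
    assert(filter_size > 0);
    for (int i = 0; i < dst_w; i++) {
        const uint8_t *s = src + filter_pos[i];      /* first source pixel   */
        const int16_t *f = filter + i * filter_size; /* taps for this output */
        int val = 0;
        for (int j = 0; j < filter_size; j++)
            val += s[j] * f[j];
        val >>= 7;
        dst[i] = val > 32767 ? 32767 : val;          /* clamp the upper bound */
    }
}

/* Hypothetical specialization for filter_size == 4: the inner loop is
 * fully unrolled, which is what makes a dedicated SIMD version attractive. */
static void hscale_size4(int16_t *dst, int dst_w, const uint8_t *src,
                         const int16_t *filter, const int32_t *filter_pos)
{
    for (int i = 0; i < dst_w; i++) {
        const uint8_t *s = src + filter_pos[i];
        const int16_t *f = filter + i * 4;
        int val = (s[0] * f[0] + s[1] * f[1] + s[2] * f[2] + s[3] * f[3]) >> 7;
        dst[i] = val > 32767 ? 32767 : val;
    }
}
```

Both functions should produce identical output for a 4-tap filter. The specialized one only removes the loop over the taps, so any gain would apply to small fixed sizes such as 4 and 8 rather than to sizes like 11 or 26.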
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel