Re: [FFmpeg-devel] [PATCH v3] libswscale/ppc: VSX-optimize 9-16 bit yuv2planeX

2019-01-10 Thread Carl Eugen Hoyos
2019-01-10 10:48 GMT+01:00, Lauri Kasanen:
> On Wed, 9 Jan 2019 22:26:25 +0100
> Carl Eugen Hoyos wrote:
>
>> > +#ifdef __GNUC__
>> > +// GCC does not support vmuluwm yet. Bug open.
>> > +__asm__("vmuluwm %0, %1, %2" : "=v"(vtmp) : "v"(vin32l),
>> > "v"(vfilter[j]));
>> >

Re: [FFmpeg-devel] [PATCH v3] libswscale/ppc: VSX-optimize 9-16 bit yuv2planeX

2019-01-10 Thread Lauri Kasanen
On Wed, 9 Jan 2019 22:26:25 +0100
Carl Eugen Hoyos wrote:

> > +#ifdef __GNUC__
> > +// GCC does not support vmuluwm yet. Bug open.
> > +__asm__("vmuluwm %0, %1, %2" : "=v"(vtmp) : "v"(vin32l),
> > "v"(vfilter[j]));
> > +vleft = vec_add(vleft, vtmp);
> > +

Re: [FFmpeg-devel] [PATCH v3] libswscale/ppc: VSX-optimize 9-16 bit yuv2planeX

2019-01-09 Thread Michael Niedermayer
On Tue, Jan 08, 2019 at 11:11:56AM +0200, Lauri Kasanen wrote:
> ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt
> yuv420p16be \
> -s 1920x1728 -f null -vframes 100 -v error -nostats -
>
> 9-14 bit funcs get about 6x speedup, 16-bit gets about 15x.
> Fate passes, each

Re: [FFmpeg-devel] [PATCH v3] libswscale/ppc: VSX-optimize 9-16 bit yuv2planeX

2019-01-09 Thread Carl Eugen Hoyos
2019-01-09 22:26 GMT+01:00, Carl Eugen Hoyos:
> 2019-01-08 10:11 GMT+01:00, Lauri Kasanen:
>> ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt
>> yuv420p16be \
>> -s 1920x1728 -f null -vframes 100 -v error -nostats -
>>
>> 9-14 bit funcs get about 6x speedup, 16-bit gets

Re: [FFmpeg-devel] [PATCH v3] libswscale/ppc: VSX-optimize 9-16 bit yuv2planeX

2019-01-09 Thread Carl Eugen Hoyos
2019-01-08 10:11 GMT+01:00, Lauri Kasanen:
> ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt
> yuv420p16be \
> -s 1920x1728 -f null -vframes 100 -v error -nostats -
>
> 9-14 bit funcs get about 6x speedup, 16-bit gets about 15x.
> Fate passes, each format tested with an

[FFmpeg-devel] [PATCH v3] libswscale/ppc: VSX-optimize 9-16 bit yuv2planeX

2019-01-08 Thread Lauri Kasanen
./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p16be \
-s 1920x1728 -f null -vframes 100 -v error -nostats -

9-14 bit funcs get about 6x speedup, 16-bit gets about 15x.
Fate passes, each format tested with an image to video conversion.

Only POWER8 includes 32-bit
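
For readers without the POWER ISA at hand: vmuluwm (Vector Multiply Unsigned Word Modulo) multiplies each pair of 32-bit lanes and keeps only the low 32 bits of each product. Below is a rough scalar sketch in portable C of that multiply and the vec_add accumulate seen in the quoted hunk. It is not code from the patch; the type vec_u32x4 and the helpers mul_u32_modulo and mul_add_u32 are hypothetical names made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit vector of four unsigned words. */
typedef struct {
    uint32_t lane[4];
} vec_u32x4;

/* Scalar model of a per-lane "multiply unsigned word modulo":
 * each 32x32 product is truncated to its low 32 bits. */
static inline vec_u32x4 mul_u32_modulo(vec_u32x4 a, vec_u32x4 b)
{
    vec_u32x4 r;
    for (size_t i = 0; i < 4; i++) {
        uint64_t wide = (uint64_t)a.lane[i] * (uint64_t)b.lane[i];
        r.lane[i] = (uint32_t)wide; /* keep the low word only */
    }
    return r;
}

/* Scalar model of the multiply followed by the accumulate,
 * i.e. acc = acc + (in * filter) per lane, wrapping at 32 bits. */
static inline vec_u32x4 mul_add_u32(vec_u32x4 acc, vec_u32x4 in, vec_u32x4 filter)
{
    vec_u32x4 prod = mul_u32_modulo(in, filter);
    for (size_t i = 0; i < 4; i++)
        acc.lane[i] += prod.lane[i]; /* unsigned add wraps modulo 2^32 */
    return acc;
}
```

In terms of the names in the hunk, vin32l and vfilter[j] would play the roles of in and filter, and vleft the role of acc. The real code works on AltiVec/VSX vector types rather than a struct, so treat this only as a description of the arithmetic.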