Re: [FFmpeg-devel] [PATCH 2/2] libswscale: Adds ff_hscale8to15_4_avx2 and ff_hscale8to15_X4_avx2 for all filter sizes.

2021-12-15 Thread James Almer
On 12/15/2021 6:35 AM, Alan Kelly wrote: Fixes so that fate under 64 bit Windows passes. These functions replace all ff_hscale8to15_*_ssse3 when avx2 is available. --- libswscale/swscale_internal.h | 2 + libswscale/utils.c | 37 +++ libswscale/x86/Makefile | 1

Re: [FFmpeg-devel] [PATCH 2/2] libswscale: Adds ff_hscale8to15_4_avx2 and ff_hscale8to15_X4_avx2 for all filter sizes.

2021-12-15 Thread James Almer
On 12/15/2021 6:05 AM, Alan Kelly wrote: On Tue, Dec 14, 2021 at 6:07 PM James Almer wrote: On 12/14/2021 12:23 PM, Alan Kelly wrote: Patch has been rebased from latest commits. These functions replace all ff_hscale8to15_*_ssse3 when avx2 is available. ---

[FFmpeg-devel] [PATCH 2/2] libswscale: Adds ff_hscale8to15_4_avx2 and ff_hscale8to15_X4_avx2 for all filter sizes.

2021-12-15 Thread Alan Kelly
Fixes so that fate under 64 bit Windows passes. These functions replace all ff_hscale8to15_*_ssse3 when avx2 is available. --- libswscale/swscale_internal.h | 2 + libswscale/utils.c | 37 +++ libswscale/x86/Makefile | 1 + libswscale/x86/scale_avx2.asm | 112

Re: [FFmpeg-devel] [PATCH 2/2] libswscale: Adds ff_hscale8to15_4_avx2 and ff_hscale8to15_X4_avx2 for all filter sizes.

2021-12-15 Thread Alan Kelly
On Tue, Dec 14, 2021 at 6:07 PM James Almer wrote: > On 12/14/2021 12:23 PM, Alan Kelly wrote: > > Patch has been rebased from latest commits. > > These functions replace all ff_hscale8to15_*_ssse3 when avx2 is > available. > > --- > > libswscale/swscale_internal.h | 2 + > >

Re: [FFmpeg-devel] [PATCH 2/2] libswscale: Adds ff_hscale8to15_4_avx2 and ff_hscale8to15_X4_avx2 for all filter sizes.

2021-12-14 Thread James Almer
On 12/14/2021 12:23 PM, Alan Kelly wrote: Patch has been rebased from latest commits. These functions replace all ff_hscale8to15_*_ssse3 when avx2 is available. --- libswscale/swscale_internal.h | 2 + libswscale/utils.c | 37 +++ libswscale/x86/Makefile | 1 +

[FFmpeg-devel] [PATCH 2/2] libswscale: Adds ff_hscale8to15_4_avx2 and ff_hscale8to15_X4_avx2 for all filter sizes.

2021-12-14 Thread Alan Kelly
Patch has been rebased from latest commits. These functions replace all ff_hscale8to15_*_ssse3 when avx2 is available. --- libswscale/swscale_internal.h | 2 + libswscale/utils.c | 37 +++ libswscale/x86/Makefile | 1 + libswscale/x86/scale_avx2.asm | 112

Re: [FFmpeg-devel] [PATCH 2/2] libswscale: Adds ff_hscale8to15_4_avx2 and ff_hscale8to15_X4_avx2 for all filter sizes.

2021-07-26 Thread Alan Kelly
On Wed, Jul 21, 2021 at 11:11 AM Alan Kelly wrote: > > > On Fri, Jul 16, 2021 at 3:48 PM Alan Kelly wrote: > >> These functions replace all ff_hscale8to15_*_ssse3 when avx2 is available. >> --- >> EXTERNAL_AVX2_FAST is now used instead of EXTERNAL_AVX2_FAST_GATHER as >> discussed in the email

Re: [FFmpeg-devel] [PATCH 2/2] libswscale: Adds ff_hscale8to15_4_avx2 and ff_hscale8to15_X4_avx2 for all filter sizes.

2021-07-21 Thread Alan Kelly
On Fri, Jul 16, 2021 at 3:48 PM Alan Kelly wrote: > These functions replace all ff_hscale8to15_*_ssse3 when avx2 is available. > --- > EXTERNAL_AVX2_FAST is now used instead of EXTERNAL_AVX2_FAST_GATHER as > discussed in the email thread for part 1 of this patch. > > Benchmark results on Skylake

[FFmpeg-devel] [PATCH 2/2] libswscale: Adds ff_hscale8to15_4_avx2 and ff_hscale8to15_X4_avx2 for all filter sizes.

2021-07-16 Thread Alan Kelly
These functions replace all ff_hscale8to15_*_ssse3 when avx2 is available. --- EXTERNAL_AVX2_FAST is now used instead of EXTERNAL_AVX2_FAST_GATHER as discussed in the email thread for part 1 of this patch. Benchmark results on Skylake and Haswell: Skylake Haswell

Re: [FFmpeg-devel] [PATCH 2/2] libswscale: Adds ff_hscale8to15_4_avx2 and ff_hscale8to15_X4_avx2 for all filter sizes.

2021-06-25 Thread Ronald S. Bultje
Hi Alan, On Fri, Jun 25, 2021 at 7:53 AM Alan Kelly wrote: > > > On Fri, Jun 25, 2021 at 1:26 PM Ronald S. Bultje > wrote: > >> Hi Alan, >> >> On Fri, Jun 25, 2021 at 3:59 AM Alan Kelly < >> alankelly-at-google@ffmpeg.org> wrote: >> >>> These functions replace all ff_hscale8to15_*_ssse3

Re: [FFmpeg-devel] [PATCH 2/2] libswscale: Adds ff_hscale8to15_4_avx2 and ff_hscale8to15_X4_avx2 for all filter sizes.

2021-06-25 Thread Alan Kelly
On Fri, Jun 25, 2021 at 1:26 PM Ronald S. Bultje wrote: > Hi Alan, > > On Fri, Jun 25, 2021 at 3:59 AM Alan Kelly < > alankelly-at-google@ffmpeg.org> wrote: > >> These functions replace all ff_hscale8to15_*_ssse3 when avx2 is available. >> > > Re-asking a question I asked before in the other

Re: [FFmpeg-devel] [PATCH 2/2] libswscale: Adds ff_hscale8to15_4_avx2 and ff_hscale8to15_X4_avx2 for all filter sizes.

2021-06-25 Thread Ronald S. Bultje
Hi Alan, On Fri, Jun 25, 2021 at 3:59 AM Alan Kelly < alankelly-at-google@ffmpeg.org> wrote: > These functions replace all ff_hscale8to15_*_ssse3 when avx2 is available. > Re-asking a question I asked before in the other thread: Also, what is the cycle count of ssse3/avx2 implementation

[FFmpeg-devel] [PATCH 2/2] libswscale: Adds ff_hscale8to15_4_avx2 and ff_hscale8to15_X4_avx2 for all filter sizes.

2021-06-25 Thread Alan Kelly
These functions replace all ff_hscale8to15_*_ssse3 when avx2 is available. --- libswscale/swscale_internal.h | 2 + libswscale/utils.c | 37 +++ libswscale/x86/Makefile | 1 + libswscale/x86/scale_avx2.asm | 112 ++

[FFmpeg-devel] [PATCH 2/2] libswscale: Adds ff_hscale8to15_4_avx2 and ff_hscale8to15_X4_avx2 for all filter sizes.

2021-06-14 Thread Alan Kelly
These functions replace all ff_hscale8to15_*_ssse3 when avx2 is available. --- libswscale/swscale_internal.h | 2 + libswscale/utils.c | 37 +++ libswscale/x86/Makefile | 1 + libswscale/x86/scale_avx2.asm | 112 ++