Re: [FFmpeg-devel] [PATCH] avcodec/mips: [loongson] optimize theora decoding with mmi.
On Fri, Feb 15, 2019 at 12:13:43AM +0100, Michael Niedermayer wrote:
> On Wed, Feb 13, 2019 at 05:56:50PM +0800, Shiyou Yin wrote:
> > >-----Original Message-----
> > >From: ffmpeg-devel-boun...@ffmpeg.org
> > >[mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of gxw
> > >Sent: Tuesday, February 12, 2019 6:56 PM
> > >To: ffmpeg-devel@ffmpeg.org
> > >Cc: gxw
> > >Subject: [FFmpeg-devel] [PATCH] avcodec/mips: [loongson] optimize theora
> > >decoding with mmi.
> > >
> > >Optimize theora decoding with mmi in functions:
> > >1. ff_vp3_idct_add_mmi
> > >2. ff_vp3_idct_put_mmi
> > >3. ff_vp3_idct_dc_add_mmi
> > >4. ff_put_no_rnd_pixels_l2_mmi
> > >
> > >Theora decoding speed improved about 32%(from 88fps to 116fps, Tested on
> > >loongson 3A3000).
> > >---
> > > libavcodec/mips/Makefile | 1 +
> > > libavcodec/mips/vp3dsp_idct_mmi.c | 769 +
> > > libavcodec/mips/vp3dsp_init_mips.c | 14 +
> > > libavcodec/mips/vp3dsp_mips.h | 6 +
> > > 4 files changed, 790 insertions(+)
> > > create mode 100644 libavcodec/mips/vp3dsp_idct_mmi.c
> > >
> >
> > Verified + 1, LGTM.
>
> will apply

One last-minute issue I noticed:
The author looks like a nickname or user name. Is that intended: "gxw"?
I mean, do you want "gxw" instead of your full name?
(I am asking as it cannot be changed after pushing ...)

Thanks

[...]
--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Those who would give up essential Liberty, to purchase a little
temporary Safety, deserve neither Liberty nor Safety -- Benjamin Franklin
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] avcodec/mips: [loongson] optimize theora decoding with mmi.
On Wed, Feb 13, 2019 at 05:56:50PM +0800, Shiyou Yin wrote:
> >-----Original Message-----
> >From: ffmpeg-devel-boun...@ffmpeg.org
> >[mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of gxw
> >Sent: Tuesday, February 12, 2019 6:56 PM
> >To: ffmpeg-devel@ffmpeg.org
> >Cc: gxw
> >Subject: [FFmpeg-devel] [PATCH] avcodec/mips: [loongson] optimize theora
> >decoding with mmi.
> >
> >Optimize theora decoding with mmi in functions:
> >1. ff_vp3_idct_add_mmi
> >2. ff_vp3_idct_put_mmi
> >3. ff_vp3_idct_dc_add_mmi
> >4. ff_put_no_rnd_pixels_l2_mmi
> >
> >Theora decoding speed improved about 32%(from 88fps to 116fps, Tested on
> >loongson 3A3000).
> >---
> > libavcodec/mips/Makefile | 1 +
> > libavcodec/mips/vp3dsp_idct_mmi.c | 769 +
> > libavcodec/mips/vp3dsp_init_mips.c | 14 +
> > libavcodec/mips/vp3dsp_mips.h | 6 +
> > 4 files changed, 790 insertions(+)
> > create mode 100644 libavcodec/mips/vp3dsp_idct_mmi.c
> >
>
> Verified + 1, LGTM.

will apply

thx

[...]
--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If you fake or manipulate statistics in a paper in physics you will never
get a job again.
If you fake or manipulate statistics in a paper in medicin you will get
a job for life at the pharma industry.
Re: [FFmpeg-devel] [PATCH] avcodec/mips: [loongson] optimize theora decoding with mmi.
>-----Original Message-----
>From: ffmpeg-devel-boun...@ffmpeg.org [mailto:ffmpeg-devel-boun...@ffmpeg.org]
>On Behalf Of gxw
>Sent: Tuesday, February 12, 2019 6:56 PM
>To: ffmpeg-devel@ffmpeg.org
>Cc: gxw
>Subject: [FFmpeg-devel] [PATCH] avcodec/mips: [loongson] optimize theora
>decoding with mmi.
>
>Optimize theora decoding with mmi in functions:
>1. ff_vp3_idct_add_mmi
>2. ff_vp3_idct_put_mmi
>3. ff_vp3_idct_dc_add_mmi
>4. ff_put_no_rnd_pixels_l2_mmi
>
>Theora decoding speed improved about 32%(from 88fps to 116fps, Tested on
>loongson 3A3000).
>---
> libavcodec/mips/Makefile | 1 +
> libavcodec/mips/vp3dsp_idct_mmi.c | 769 +
> libavcodec/mips/vp3dsp_init_mips.c | 14 +
> libavcodec/mips/vp3dsp_mips.h | 6 +
> 4 files changed, 790 insertions(+)
> create mode 100644 libavcodec/mips/vp3dsp_idct_mmi.c
>

Verified + 1, LGTM.
[FFmpeg-devel] [PATCH] avcodec/mips: [loongson] optimize theora decoding with mmi.
Optimize theora decoding with mmi in functions:
1. ff_vp3_idct_add_mmi
2. ff_vp3_idct_put_mmi
3. ff_vp3_idct_dc_add_mmi
4. ff_put_no_rnd_pixels_l2_mmi

Theora decoding speed improved about 32%(from 88fps to 116fps, Tested on loongson 3A3000).
---
 libavcodec/mips/Makefile | 1 +
 libavcodec/mips/vp3dsp_idct_mmi.c | 769 +
 libavcodec/mips/vp3dsp_init_mips.c | 14 +
 libavcodec/mips/vp3dsp_mips.h | 6 +
 4 files changed, 790 insertions(+)
 create mode 100644 libavcodec/mips/vp3dsp_idct_mmi.c

diff --git a/libavcodec/mips/Makefile b/libavcodec/mips/Makefile
index 3029872..c827649 100644
--- a/libavcodec/mips/Makefile
+++ b/libavcodec/mips/Makefile
@@ -87,3 +87,4 @@ MMI-OBJS-$(CONFIG_HPELDSP) += mips/hpeldsp_mmi.o
 MMI-OBJS-$(CONFIG_VC1_DECODER) += mips/vc1dsp_mmi.o
 MMI-OBJS-$(CONFIG_WMV2DSP) += mips/wmv2dsp_mmi.o
 MMI-OBJS-$(CONFIG_HEVC_DECODER) += mips/hevcdsp_mmi.o
+MMI-OBJS-$(CONFIG_VP3DSP) += mips/vp3dsp_idct_mmi.o
diff --git a/libavcodec/mips/vp3dsp_idct_mmi.c b/libavcodec/mips/vp3dsp_idct_mmi.c
new file mode 100644
index 000..c5c4cf3
--- /dev/null
+++ b/libavcodec/mips/vp3dsp_idct_mmi.c
@@ -0,0 +1,769 @@
+/*
+ * Copyright (c) 2018 gxw
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "vp3dsp_mips.h"
+#include "libavutil/intreadwrite.h"
+#include "libavutil/mips/mmiutils.h"
+#include "libavutil/common.h"
+#include "libavcodec/rnd_avg.h"
+
+#define LOAD_CONST(dst, value) \
+    "li %[tmp1], "#value" \n\t" \
+    "dmtc1 %[tmp1], "#dst" \n\t" \
+    "pshufh "#dst", "#dst", %[ftmp10] \n\t"
+
+static void idct_row_mmi(int16_t *input)
+{
+    double ftmp[23];
+    uint64_t tmp[2];
+    __asm__ volatile (
+        "xor %[ftmp10], %[ftmp10], %[ftmp10] \n\t"
+        LOAD_CONST(%[csth_1], 1)
+        "li %[tmp0], 0x02 \n\t"
+        "1: \n\t"
+        /* Load input */
+        "ldc1 %[ftmp0], 0x00(%[input]) \n\t"
+        "ldc1 %[ftmp1], 0x10(%[input]) \n\t"
+        "ldc1 %[ftmp2], 0x20(%[input]) \n\t"
+        "ldc1 %[ftmp3], 0x30(%[input]) \n\t"
+        "ldc1 %[ftmp4], 0x40(%[input]) \n\t"
+        "ldc1 %[ftmp5], 0x50(%[input]) \n\t"
+        "ldc1 %[ftmp6], 0x60(%[input]) \n\t"
+        "ldc1 %[ftmp7], 0x70(%[input]) \n\t"
+        LOAD_CONST(%[ftmp8], 64277)
+        LOAD_CONST(%[ftmp9], 12785)
+        "pmulhh %[A], %[ftmp9], %[ftmp7] \n\t"
+        "pcmpgth %[C], %[ftmp10], %[ftmp1] \n\t"
+        "or %[mask], %[C], %[csth_1] \n\t"
+        "pmullh %[B], %[ftmp1], %[mask] \n\t"
+        "pmulhuh %[B], %[ftmp8], %[B] \n\t"
+        "pmullh %[B], %[B], %[mask] \n\t"
+        "paddh %[A], %[A], %[B] \n\t"
+        "paddh %[A], %[A], %[C] \n\t"
+        "pcmpgth %[D], %[ftmp10], %[ftmp7] \n\t"
+        "or %[mask], %[D], %[csth_1] \n\t"
+        "pmullh %[ftmp7], %[ftmp7], %[mask] \n\t"
+        "pmulhuh %[B], %[ftmp8], %[ftmp7] \n\t"
+        "pmullh %[B], %[B], %[mask] \n\t"
+        "pmulhh %[C], %[ftmp9], %[ftmp1] \n\t"
+        "psubh %[B], %[C], %[B] \n\t"
+        "psubh %[B], %[B], %[D] \n\t"
+
+        LOAD_CONST(%[ftmp8], 54491)
+        LOAD_CONST(%[ftmp9], 36410)
+        "pcmpgth %[Ad], %[ftmp10], %[ftmp5] \n\t"
+        "or %[mask], %[Ad], %[csth_1] \n\t"
+        "pmullh %[ftmp1], %[ftmp5], %[mask] \n\t"
+        "pmulhuh %[C], %[ftmp9], %[ftmp1] \n\t"
+        "pm