On 6/26/17, Henrik Gramner wrote:
> On Sat, Jun 24, 2017 at 10:39 PM, Ivan Kalvachev wrote:
>> +%define HADDPS_IS_FAST 0
>> +%define PHADDD_IS_FAST 0
> [...]
>> +haddps %1, %1
>> +haddps %1, %1
> [...]
>> + phaddd
On Sat, Jun 24, 2017 at 10:39 PM, Ivan Kalvachev wrote:
> +%define HADDPS_IS_FAST 0
> +%define PHADDD_IS_FAST 0
[...]
> +haddps %1, %1
> +haddps %1, %1
[...]
> + phaddd xmm%1,xmm%1
> + phaddd xmm%1,xmm%1
You can safely
On Sat, Jun 24, 2017 at 11:39:03PM +0300, Ivan Kalvachev wrote:
[...]
> diff --git a/libavcodec/x86/opus_pvq_search.asm b/libavcodec/x86/opus_pvq_search.asm
> new file mode 100644
> index 00..36b679b75e
> --- /dev/null
> +++ b/libavcodec/x86/opus_pvq_search.asm
> @@ -0,0 +1,628 @@
> +;
This is the second version of my work.
Nobody posted any benchmarks, so
the old code remains for this round too.
Proper PIC handling code is now included.
There are small cosmetic changes, e.g. using tmpY
to keep it semantically separate from the output outY.
Now the tmpX buffer is fixed at 256*sizeof(float) size