On Mon, Dec 16, 2019 at 10:53:26PM +0100, Jean-Baptiste Kempf wrote:
> On Mon, Dec 9, 2019, at 18:42, Sebastian Pop wrote:
> > On Mon, Dec 9, 2019 at 5:01 AM Clément Bœsch wrote:
> > >
> > > On Sun, Dec 08, 2019 at 11:08:31PM +0200, Martin Storsjö wrote:
> > > > On Sun, 8 Dec 2019, Clément Bœsch
On Mon, Dec 9, 2019, at 18:42, Sebastian Pop wrote:
> On Mon, Dec 9, 2019 at 5:01 AM Clément Bœsch wrote:
> >
> > On Sun, Dec 08, 2019 at 11:08:31PM +0200, Martin Storsjö wrote:
> > > On Sun, 8 Dec 2019, Clément Bœsch wrote:
> > >
> > > > On Wed, Dec 04, 2019 at 05:24:46PM -0600, Sebastian Pop
On Mon, Dec 9, 2019 at 5:01 AM Clément Bœsch wrote:
>
> On Sun, Dec 08, 2019 at 11:08:31PM +0200, Martin Storsjö wrote:
> > On Sun, 8 Dec 2019, Clément Bœsch wrote:
> >
> > > On Wed, Dec 04, 2019 at 05:24:46PM -0600, Sebastian Pop wrote:
> > > > Hi Clément,
> > > >
> > > > please find attached
On Sun, Dec 08, 2019 at 11:08:31PM +0200, Martin Storsjö wrote:
> On Sun, 8 Dec 2019, Clément Bœsch wrote:
>
> > On Wed, Dec 04, 2019 at 05:24:46PM -0600, Sebastian Pop wrote:
> > > Hi Clément,
> > >
> > > please find attached the updated patch addressing all your comments.
> > > Let me know if
On Sun, 8 Dec 2019, Clément Bœsch wrote:
On Wed, Dec 04, 2019 at 05:24:46PM -0600, Sebastian Pop wrote:
Hi Clément,
please find attached the updated patch addressing all your comments.
Let me know if there is anything else that I missed and that I need to address.
I can't test but patch
On Wed, Dec 04, 2019 at 05:24:46PM -0600, Sebastian Pop wrote:
> Hi Clément,
>
> please find attached the updated patch addressing all your comments.
> Let me know if there is anything else that I missed and that I need to
> address.
>
I can't test but patch LGTM. Aside from the commit
Hi Clément,
please find attached the updated patch addressing all your comments.
Let me know if there is anything else that I missed and that I need to address.
Thanks,
Sebastian
On Sun, Dec 1, 2019 at 3:01 PM Martin Storsjö wrote:
>
> On Sun, 1 Dec 2019, Clément Bœsch wrote:
>
> > On Wed, Nov
On Sun, 1 Dec 2019, Clément Bœsch wrote:
On Wed, Nov 27, 2019 at 12:30:35PM -0600, Sebastian Pop wrote:
[...]
From 9ecaa99fab4b8bedf3884344774162636eaa5389 Mon Sep 17 00:00:00 2001
From: Sebastian Pop
Date: Sun, 17 Nov 2019 14:13:13 -0600
Subject: [PATCH] [aarch64] use FMA and increase vector
On Wed, Nov 27, 2019 at 12:30:35PM -0600, Sebastian Pop wrote:
[...]
> From 9ecaa99fab4b8bedf3884344774162636eaa5389 Mon Sep 17 00:00:00 2001
> From: Sebastian Pop
> Date: Sun, 17 Nov 2019 14:13:13 -0600
> Subject: [PATCH] [aarch64] use FMA and increase vector factor to 4
>
> This patch
Hi,
On Thu, Nov 28, 2019 at 2:08 AM Ronald S. Bultje wrote:
> Hi,
>
> On Wed, Nov 27, 2019 at 3:28 PM Sebastian Pop wrote:
>
>> On Wed, Nov 27, 2019 at 2:13 PM Clément Bœsch wrote:
>> > Yeah I will by the end of the week. I wrote that a few years ago so I
>> need
>> > to take some time to get
Hi,
On Wed, Nov 27, 2019 at 3:28 PM Sebastian Pop wrote:
> On Wed, Nov 27, 2019 at 2:13 PM Clément Bœsch wrote:
> > Yeah I will by the end of the week. I wrote that a few years ago so I
> need
> > to take some time to get back in the context.
>
> Thanks Clément for your help.
>
> >
> > BTW,
On Wed, Nov 27, 2019 at 2:13 PM Clément Bœsch wrote:
> Yeah I will by the end of the week. I wrote that a few years ago so I need
> to take some time to get back in the context.
Thanks Clément for your help.
>
> BTW, that's quite a huge speed improvement you're bringing in, are you
> sure you
On Wed, Nov 27, 2019 at 07:36:01PM +, Pop, Sebastian wrote:
> Thanks Jean-Baptiste for your review and suggestions on how to improve my
> patch submission.
> From the git logs I found out that Clément Bœsch wrote the original aarch64
> vectorization for that function.
> Maybe Clément could
On Wed, Nov 27, 2019, at 19:46, Sebastian Pop wrote:
> On Wed, Nov 27, 2019 at 12:37 PM Jean-Baptiste Kempf
> wrote:
> > > Please let me know if I can make the patch better.
> >
> > Remove the commented lines.
>
> Attached the updated patch.
OK for me.
Cannot comment on the content.
--
On Wed, Nov 27, 2019 at 12:37 PM Jean-Baptiste Kempf wrote:
> > Please let me know if I can make the patch better.
>
> Remove the commented lines.
Attached the updated patch.
Thank you,
Sebastian
0001-aarch64-use-FMA-and-increase-vector-factor-to-4.patch
Description: Binary data
On Mon, Nov 25, 2019 at 11:20 PM Jean-Baptiste Kempf wrote:
> > Is there a coding rule in ffmpeg that restricts the use of intrinsics?
>
> Yes. See doc/optimization.txt.
> Use external asm (nasm/yasm) or inline asm (__asm__()), do not use intrinsics.
Thanks for the pointer.
> Also, here, you're
Hello,
On Wed, Nov 27, 2019, at 19:30, Sebastian Pop wrote:
> Please find attached a patch that improves the existing code in
> aarch64/hscale.S
> Performance test with gcc and clang shows that the patch improves
> performance by 34% on Graviton A1 instances:
So, that is better than the
On Tue, Nov 26, 2019, at 05:51, Sebastian Pop wrote:
> On Mon, Nov 25, 2019 at 4:18 PM Jean-Baptiste Kempf wrote:
> > Why adding a new version, in intrinsics, instead of changing the existing
> > implementation?
> >
>
> Personal preference: I like to read C code instead of asm.
> Also I find it
On Mon, Nov 25, 2019 at 4:18 PM Jean-Baptiste Kempf wrote:
> Why adding a new version, in intrinsics, instead of changing the existing
> implementation?
>
Personal preference: I like to read C code instead of asm.
Also I find it much easier to experiment by changing C code rather than asm.
Is
Hello,
On Mon, Nov 25, 2019, at 22:59, Sebastian Pop wrote:
> This patch implements ff_hscale_8_to_15_neon with NEON fused multiply-accumulate
> and bumps the vectorization factor from 2 to 4. I have seen speedups of up to 15%
> on Graviton A1 instances based on Cortex-A72 CPUs.
Why adding a new
Hi,
This patch implements ff_hscale_8_to_15_neon with NEON fused multiply-accumulate
and bumps the vectorization factor from 2 to 4. I have seen speedups of up to 15%
on Graviton A1 instances based on Cortex-A72 CPUs.
$ ffmpeg -nostats -f lavfi -i testsrc2=4k:d=2 -vf