Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments

2021-02-20 Thread Andriy Gelman
Hi Reimar,

On Sat, 20. Feb 12:34, Reimar Döffinger wrote:
> Hi!
> 
> > On 24 Jan 2021, at 17:35, Andriy Gelman  wrote:
> > 
> > On Sat, 07. Nov 16:31, Andriy Gelman wrote:
> >> On Sat, 31. Oct 10:17, Andriy Gelman wrote:
> >>> On Fri, 16. Oct 00:02, Andriy Gelman wrote:
>  On Fri, 09. Oct 20:17, Andriy Gelman wrote:
> > From: Chip Kerchner 
> > 
> > ---
> > libswscale/ppc/yuv2rgb_altivec.c | 10 ++
> > 1 file changed, 10 insertions(+)
> > 
> > diff --git a/libswscale/ppc/yuv2rgb_altivec.c 
> > b/libswscale/ppc/yuv2rgb_altivec.c
> > index 536545293d..930ef6b98f 100644
> > --- a/libswscale/ppc/yuv2rgb_altivec.c
> > +++ b/libswscale/ppc/yuv2rgb_altivec.c
> > @@ -283,6 +283,16 @@ static inline void cvtyuvtoRGB(SwsContext *c, 
> > vector signed short Y,
> >  * 
> > --
> >  */
> > 
> > +#if !HAVE_VSX
> > +static inline vector unsigned char vec_xl(signed long long offset, 
> > const ubyte *addr)
> > +{
> > +const vector unsigned char *v_addr = (const vector unsigned char 
> > *) (addr + offset);
> > +vector unsigned char align_perm = vec_lvsl(offset, addr);
> > +
> > +return (vector unsigned char) vec_perm(v_addr[0], v_addr[1], 
> > align_perm);
> > +}
> > +#endif /* !HAVE_VSX */
> > +
> > #define DEFCSP420_CVT(name, out_pixels) 
> >   \
> > static int altivec_ ## name(SwsContext *c, const unsigned char **in,
> >   \
> > int *instrides, int srcSliceY, int 
> > srcSliceH, \
>  
>  Ping.
>  This patch fixes PPC64 build on patchwork.
>  
> >>> 
> >>> ping
> >>> 
> >> 
> >> ping 
> >> 
> > 
> > I asked Reimar to have a look at the patch. He didn't have a link to 
> > original
> > post, so I'm forwarding his feedback: 
> > 
> > - a few comments sure would be good
> > - the function likely should be always_inline
> > - the offset is always 0 in the code, so that could be optimized
> > - I am not sure if the addresses even can be unaligned (the whole magic 
> > code is about fixing up unaligned loads since altivec simply ignores the 
> > low bits of the address, but it looks like the x86 asm also assumes 
> > unaligned)
> > - the extra load needed to fix the alignment can read outside the bounds of 
> > the buffer I think? Especially for chroma and if the load is aligned. 
> > Though chroma seems to have an issue already today...
> > 

> 
> I had a look at the code before the vec_xl was introduced, and besides the 
> unnecessary extra offset it seems it was pretty much like this.
> So worst case this patch would restore the behaviour to before the vec_xl 
> patch, which I
> guess is a reasonable argument to merge it whether it’s perfect or not.
> Personally I would have added a vecload (dropping the offset argument) or 
> similar function
> that either does vec_xl or this, instead of introducing a “fake” vec_xl 
> function,
> but that is just nitpicking I guess…
> 

Thanks for looking into this. 
Then I'll apply the patch in a few days if no one objects. 

I'll slightly reword the title/commit message:

lsws/ppc/yuv2rgb_altivec: Fix build in non-VSX environments

Fixes #8750
vec_xl intrinsic is only available on POWER 7 or higher. Add inline function to
fix builds if VSX is not supported.

-- 
Andriy
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments

2021-02-20 Thread Reimar Döffinger
Hi!

> On 24 Jan 2021, at 17:35, Andriy Gelman  wrote:
> 
> On Sat, 07. Nov 16:31, Andriy Gelman wrote:
>> On Sat, 31. Oct 10:17, Andriy Gelman wrote:
>>> On Fri, 16. Oct 00:02, Andriy Gelman wrote:
 On Fri, 09. Oct 20:17, Andriy Gelman wrote:
> From: Chip Kerchner 
> 
> ---
> libswscale/ppc/yuv2rgb_altivec.c | 10 ++
> 1 file changed, 10 insertions(+)
> 
> diff --git a/libswscale/ppc/yuv2rgb_altivec.c 
> b/libswscale/ppc/yuv2rgb_altivec.c
> index 536545293d..930ef6b98f 100644
> --- a/libswscale/ppc/yuv2rgb_altivec.c
> +++ b/libswscale/ppc/yuv2rgb_altivec.c
> @@ -283,6 +283,16 @@ static inline void cvtyuvtoRGB(SwsContext *c, vector 
> signed short Y,
>  * 
> --
>  */
> 
> +#if !HAVE_VSX
> +static inline vector unsigned char vec_xl(signed long long offset, const 
> ubyte *addr)
> +{
> +const vector unsigned char *v_addr = (const vector unsigned char *) 
> (addr + offset);
> +vector unsigned char align_perm = vec_lvsl(offset, addr);
> +
> +return (vector unsigned char) vec_perm(v_addr[0], v_addr[1], 
> align_perm);
> +}
> +#endif /* !HAVE_VSX */
> +
> #define DEFCSP420_CVT(name, out_pixels)   
> \
> static int altivec_ ## name(SwsContext *c, const unsigned char **in,  
> \
> int *instrides, int srcSliceY, int srcSliceH, 
> \
 
 Ping.
 This patch fixes PPC64 build on patchwork.
 
>>> 
>>> ping
>>> 
>> 
>> ping 
>> 
> 
> I asked Reimar to have a look at the patch. He didn't have a link to original
> post, so I'm forwarding his feedback: 
> 
> - a few comments sure would be good
> - the function likely should be always_inline
> - the offset is always 0 in the code, so that could be optimized
> - I am not sure if the addresses even can be unaligned (the whole magic code 
> is about fixing up unaligned loads since altivec simply ignores the low bits 
> of the address, but it looks like the x86 asm also assumes unaligned)
> - the extra load needed to fix the alignment can read outside the bounds of 
> the buffer I think? Especially for chroma and if the load is aligned. Though 
> chroma seems to have an issue already today...
> 

I had a look at the code before the vec_xl was introduced, and besides the 
unnecessary extra offset it seems it was pretty much like this.
So worst case this patch would restore the behaviour to before the vec_xl 
patch, which I
guess is a reasonable argument to merge it whether it’s perfect or not.
Personally I would have added a vecload (dropping the offset argument) or 
similar function
that either does vec_xl or this, instead of introducing a “fake” vec_xl 
function,
but that is just nitpicking I guess…

Best regards,
Reimar

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments

2021-01-24 Thread Carl Eugen Hoyos
Am Sa., 10. Okt. 2020 um 02:44 Uhr schrieb Andriy Gelman
:
>
> From: Chip Kerchner 
>
> ---
>  libswscale/ppc/yuv2rgb_altivec.c | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/libswscale/ppc/yuv2rgb_altivec.c 
> b/libswscale/ppc/yuv2rgb_altivec.c
> index 536545293d..930ef6b98f 100644
> --- a/libswscale/ppc/yuv2rgb_altivec.c
> +++ b/libswscale/ppc/yuv2rgb_altivec.c
> @@ -283,6 +283,16 @@ static inline void cvtyuvtoRGB(SwsContext *c, vector 
> signed short Y,
>   * 
> --
>   */
>
> +#if !HAVE_VSX
> +static inline vector unsigned char vec_xl(signed long long offset, const 
> ubyte *addr)
> +{
> +const vector unsigned char *v_addr = (const vector unsigned char *) 
> (addr + offset);
> +vector unsigned char align_perm = vec_lvsl(offset, addr);
> +
> +return (vector unsigned char) vec_perm(v_addr[0], v_addr[1], align_perm);
> +}
> +#endif /* !HAVE_VSX */

Is there a speed impact if this function is used?

Carl Eugen
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments

2021-01-24 Thread Andriy Gelman
On Sat, 07. Nov 16:31, Andriy Gelman wrote:
> On Sat, 31. Oct 10:17, Andriy Gelman wrote:
> > On Fri, 16. Oct 00:02, Andriy Gelman wrote:
> > > On Fri, 09. Oct 20:17, Andriy Gelman wrote:
> > > > From: Chip Kerchner 
> > > > 
> > > > ---
> > > >  libswscale/ppc/yuv2rgb_altivec.c | 10 ++
> > > >  1 file changed, 10 insertions(+)
> > > > 
> > > > diff --git a/libswscale/ppc/yuv2rgb_altivec.c 
> > > > b/libswscale/ppc/yuv2rgb_altivec.c
> > > > index 536545293d..930ef6b98f 100644
> > > > --- a/libswscale/ppc/yuv2rgb_altivec.c
> > > > +++ b/libswscale/ppc/yuv2rgb_altivec.c
> > > > @@ -283,6 +283,16 @@ static inline void cvtyuvtoRGB(SwsContext *c, 
> > > > vector signed short Y,
> > > >   * 
> > > > --
> > > >   */
> > > >  
> > > > +#if !HAVE_VSX
> > > > +static inline vector unsigned char vec_xl(signed long long offset, 
> > > > const ubyte *addr)
> > > > +{
> > > > +const vector unsigned char *v_addr = (const vector unsigned char 
> > > > *) (addr + offset);
> > > > +vector unsigned char align_perm = vec_lvsl(offset, addr);
> > > > +
> > > > +return (vector unsigned char) vec_perm(v_addr[0], v_addr[1], 
> > > > align_perm);
> > > > +}
> > > > +#endif /* !HAVE_VSX */
> > > > +
> > > >  #define DEFCSP420_CVT(name, out_pixels)
> > > >\
> > > >  static int altivec_ ## name(SwsContext *c, const unsigned char **in,   
> > > >\
> > > >  int *instrides, int srcSliceY, int 
> > > > srcSliceH, \
> > > 
> > > Ping.
> > > This patch fixes PPC64 build on patchwork.
> > > 
> > 
> > ping
> > 
> 
> ping 
> 

I asked Reimar to have a look at the patch. He didn't have a link to original
post, so I'm forwarding his feedback: 

- a few comments sure would be good
- the function likely should be always_inline
- the offset is always 0 in the code, so that could be optimized
- I am not sure if the addresses even can be unaligned (the whole magic code is 
about fixing up unaligned loads since altivec simply ignores the low bits of 
the address, but it looks like the x86 asm also assumes unaligned)
- the extra load needed to fix the alignment can read outside the bounds of the 
buffer I think? Especially for chroma and if the load is aligned. Though chroma 
seems to have an issue already today...

-- 
Andriy
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments

2020-11-07 Thread Andriy Gelman
On Sat, 31. Oct 10:17, Andriy Gelman wrote:
> On Fri, 16. Oct 00:02, Andriy Gelman wrote:
> > On Fri, 09. Oct 20:17, Andriy Gelman wrote:
> > > From: Chip Kerchner 
> > > 
> > > ---
> > >  libswscale/ppc/yuv2rgb_altivec.c | 10 ++
> > >  1 file changed, 10 insertions(+)
> > > 
> > > diff --git a/libswscale/ppc/yuv2rgb_altivec.c 
> > > b/libswscale/ppc/yuv2rgb_altivec.c
> > > index 536545293d..930ef6b98f 100644
> > > --- a/libswscale/ppc/yuv2rgb_altivec.c
> > > +++ b/libswscale/ppc/yuv2rgb_altivec.c
> > > @@ -283,6 +283,16 @@ static inline void cvtyuvtoRGB(SwsContext *c, vector 
> > > signed short Y,
> > >   * 
> > > --
> > >   */
> > >  
> > > +#if !HAVE_VSX
> > > +static inline vector unsigned char vec_xl(signed long long offset, const 
> > > ubyte *addr)
> > > +{
> > > +const vector unsigned char *v_addr = (const vector unsigned char *) 
> > > (addr + offset);
> > > +vector unsigned char align_perm = vec_lvsl(offset, addr);
> > > +
> > > +return (vector unsigned char) vec_perm(v_addr[0], v_addr[1], 
> > > align_perm);
> > > +}
> > > +#endif /* !HAVE_VSX */
> > > +
> > >  #define DEFCSP420_CVT(name, out_pixels)  
> > >  \
> > >  static int altivec_ ## name(SwsContext *c, const unsigned char **in, 
> > >  \
> > >  int *instrides, int srcSliceY, int 
> > > srcSliceH, \
> > 
> > Ping.
> > This patch fixes PPC64 build on patchwork.
> > 
> 
> ping
> 

ping 

> -- 
> Andriy
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments

2020-10-31 Thread Andriy Gelman
On Fri, 16. Oct 00:02, Andriy Gelman wrote:
> On Fri, 09. Oct 20:17, Andriy Gelman wrote:
> > From: Chip Kerchner 
> > 
> > ---
> >  libswscale/ppc/yuv2rgb_altivec.c | 10 ++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/libswscale/ppc/yuv2rgb_altivec.c 
> > b/libswscale/ppc/yuv2rgb_altivec.c
> > index 536545293d..930ef6b98f 100644
> > --- a/libswscale/ppc/yuv2rgb_altivec.c
> > +++ b/libswscale/ppc/yuv2rgb_altivec.c
> > @@ -283,6 +283,16 @@ static inline void cvtyuvtoRGB(SwsContext *c, vector 
> > signed short Y,
> >   * 
> > --
> >   */
> >  
> > +#if !HAVE_VSX
> > +static inline vector unsigned char vec_xl(signed long long offset, const 
> > ubyte *addr)
> > +{
> > +const vector unsigned char *v_addr = (const vector unsigned char *) 
> > (addr + offset);
> > +vector unsigned char align_perm = vec_lvsl(offset, addr);
> > +
> > +return (vector unsigned char) vec_perm(v_addr[0], v_addr[1], 
> > align_perm);
> > +}
> > +#endif /* !HAVE_VSX */
> > +
> >  #define DEFCSP420_CVT(name, out_pixels)
> >\
> >  static int altivec_ ## name(SwsContext *c, const unsigned char **in,   
> >\
> >  int *instrides, int srcSliceY, int srcSliceH,  
> >\
> 
> Ping.
> This patch fixes PPC64 build on patchwork.
> 

ping

-- 
Andriy
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments

2020-10-15 Thread Andriy Gelman
On Fri, 09. Oct 20:17, Andriy Gelman wrote:
> From: Chip Kerchner 
> 
> ---
>  libswscale/ppc/yuv2rgb_altivec.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/libswscale/ppc/yuv2rgb_altivec.c 
> b/libswscale/ppc/yuv2rgb_altivec.c
> index 536545293d..930ef6b98f 100644
> --- a/libswscale/ppc/yuv2rgb_altivec.c
> +++ b/libswscale/ppc/yuv2rgb_altivec.c
> @@ -283,6 +283,16 @@ static inline void cvtyuvtoRGB(SwsContext *c, vector 
> signed short Y,
>   * 
> --
>   */
>  
> +#if !HAVE_VSX
> +static inline vector unsigned char vec_xl(signed long long offset, const 
> ubyte *addr)
> +{
> +const vector unsigned char *v_addr = (const vector unsigned char *) 
> (addr + offset);
> +vector unsigned char align_perm = vec_lvsl(offset, addr);
> +
> +return (vector unsigned char) vec_perm(v_addr[0], v_addr[1], align_perm);
> +}
> +#endif /* !HAVE_VSX */
> +
>  #define DEFCSP420_CVT(name, out_pixels)  
>  \
>  static int altivec_ ## name(SwsContext *c, const unsigned char **in, 
>  \
>  int *instrides, int srcSliceY, int srcSliceH,
>  \

Ping.
This patch fixes PPC64 build on patchwork.

-- 
Andriy
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

[FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments

2020-10-09 Thread Andriy Gelman
From: Chip Kerchner 

---
 libswscale/ppc/yuv2rgb_altivec.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/libswscale/ppc/yuv2rgb_altivec.c b/libswscale/ppc/yuv2rgb_altivec.c
index 536545293d..930ef6b98f 100644
--- a/libswscale/ppc/yuv2rgb_altivec.c
+++ b/libswscale/ppc/yuv2rgb_altivec.c
@@ -283,6 +283,16 @@ static inline void cvtyuvtoRGB(SwsContext *c, vector 
signed short Y,
  * 
--
  */
 
+#if !HAVE_VSX
+static inline vector unsigned char vec_xl(signed long long offset, const ubyte 
*addr)
+{
+const vector unsigned char *v_addr = (const vector unsigned char *) (addr 
+ offset);
+vector unsigned char align_perm = vec_lvsl(offset, addr);
+
+return (vector unsigned char) vec_perm(v_addr[0], v_addr[1], align_perm);
+}
+#endif /* !HAVE_VSX */
+
 #define DEFCSP420_CVT(name, out_pixels)   \
 static int altivec_ ## name(SwsContext *c, const unsigned char **in,  \
 int *instrides, int srcSliceY, int srcSliceH, \
-- 
2.28.0

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments

2020-07-06 Thread Chip Kerchner


ffmpeg_altivec_yuv2rgb_novsx.patch
Description: Binary data
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments

2020-07-06 Thread Chip Kerchner


ffmpeg_altivec_yuv2rgb_novsx.patch
Description: Binary data
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments

2020-07-01 Thread Chip Kerchner


ffmpeg_altivec_yuv2rgb_novsx.patch
Description: Binary data
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments

2020-06-30 Thread Michael Niedermayer
On Tue, Jun 30, 2020 at 12:06:10PM +, Chip Kerchner wrote:
>  yuv2rgb_altivec.c |   10 ++
>  1 file changed, 10 insertions(+)
> 42288c448d2dadce913a969945ee36afc510d200  ffmpeg_altivec_yuv2rgb_novsx.patch
> diff --git a/libswscale/ppc/yuv2rgb_altivec.c 
> b/libswscale/ppc/yuv2rgb_altivec.c
> index 536545293d..0cd4a1db13 100644
> --- a/libswscale/ppc/yuv2rgb_altivec.c
> +++ b/libswscale/ppc/yuv2rgb_altivec.c

it seems git here doesnt like this attachment
Applying: Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX 
environments
error: corrupt patch at line 8
error: could not build fake ancestor

But the __VSX__ check looks wrong, nothing else checks for vsx that way

configure checks for vsx and there is HAVE_VSX, pleaase check if that works
too (iam not ppc maintainer and havnt tested. this suggestion is just from
looking and guessing)

Thanks

[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Opposition brings concord. Out of discord comes the fairest harmony.
-- Heraclitus


signature.asc
Description: PGP signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

[FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments

2020-06-30 Thread Chip Kerchner


ffmpeg_altivec_yuv2rgb_novsx.patch
Description: Binary data
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".