Re: [FFmpeg-devel] [PATCH] avcodec/ppc/hevcdsp: Fix build failures with powerpc-linux-gnu-gcc-4.8 with --disable-optimizations

2018-12-05 Thread Michael Niedermayer
On Wed, Dec 05, 2018 at 12:15:14AM +0100, Carl Eugen Hoyos wrote:
> 2018-12-04 18:02 GMT+01:00, Michael Niedermayer :
> > On Tue, Dec 04, 2018 at 04:33:03PM +0100, Carl Eugen Hoyos wrote:
> >> 2018-12-04 16:29 GMT+01:00, Michael Niedermayer :
> >> > The affected functions could also be changed into macros, this is the
> >> > smaller change to fix it though. And avoids (probably) less readable
> >> > macros
> >>
> >> > The extra code should be optimized out when optimizations are done as
> >> > all
> >> > values are known at build after inlining.
> >>
> >> Shouldn't this be verified?
> >
> > ive verified it with the patch below
> 
> Thank you!

will apply

thx

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Rewriting code that is poorly written but fully understood is good.
Rewriting code that one doesnt understand is a sign that one is less smart
then the original author, trying to rewrite it will not make it better.


signature.asc
Description: PGP signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH] avcodec/ppc/hevcdsp: Fix build failures with powerpc-linux-gnu-gcc-4.8 with --disable-optimizations

2018-12-04 Thread Carl Eugen Hoyos
2018-12-04 18:02 GMT+01:00, Michael Niedermayer :
> On Tue, Dec 04, 2018 at 04:33:03PM +0100, Carl Eugen Hoyos wrote:
>> 2018-12-04 16:29 GMT+01:00, Michael Niedermayer :
>> > The affected functions could also be changed into macros, this is the
>> > smaller change to fix it though. And avoids (probably) less readable
>> > macros
>>
>> > The extra code should be optimized out when optimizations are done as
>> > all
>> > values are known at build after inlining.
>>
>> Shouldn't this be verified?
>
> ive verified it with the patch below

Thank you!

Carl Eugen
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH] avcodec/ppc/hevcdsp: Fix build failures with powerpc-linux-gnu-gcc-4.8 with --disable-optimizations

2018-12-04 Thread Michael Niedermayer
On Tue, Dec 04, 2018 at 04:33:03PM +0100, Carl Eugen Hoyos wrote:
> 2018-12-04 16:29 GMT+01:00, Michael Niedermayer :
> > The affected functions could also be changed into macros, this is the
> > smaller change to fix it though. And avoids (probably) less readable macros
> 
> > The extra code should be optimized out when optimizations are done as all
> > values are known at build after inlining.
> 
> Shouldn't this be verified?

ive verified it with the patch below
only SIMD instructions are between the markers, so 
powerpc-linux-gnu-gcc-4.8 (Ubuntu 4.8.4-2ubuntu1~14.04.1) 4.8.4
seems to optimize all conditional code out.
All bets are off with a different compiler though. That could be
worse or even better after the patch

But i can do more tests if you want me to test something specific ?


> This is speed-critical code, no?

Yes, and the way this code is written depends on the compiler, if the compiler
makes a mistake the function can be alot slower.
Thats one of the reasons we use nasm/yasm on x86, that always produces
the same result.

thx





[...]

diff --git a/libavcodec/ppc/hevcdsp.c b/libavcodec/ppc/hevcdsp.c
index c1d562a409..47246ed42d 100644
--- a/libavcodec/ppc/hevcdsp.c
+++ b/libavcodec/ppc/hevcdsp.c
@@ -57,13 +57,14 @@ static av_always_inline void transform4x4(vec_s16 src_01, 
vec_s16 src_23,
 o0 = vec_msums(src_13, trans4[1], zero);
 e1 = vec_msums(src_02, trans4[2], zero);
 o1 = vec_msums(src_13, trans4[3], zero);
-
+__asm volatile ("MARK\n\t");
 switch(shift) {
 case  7: add = vec_sl(vec_splat_s32(1), vec_splat_u32( 7 - 1)); break;
 case 10: add = vec_sl(vec_splat_s32(1), vec_splat_u32(10 - 1)); break;
 case 12: add = vec_sl(vec_splat_s32(1), vec_splat_u32(12 - 1)); break;
 default: abort();
 }
+__asm volatile ("MARK-E\n\t");
 
 e0 = vec_add(e0, add);
 e1 = vec_add(e1, add);
@@ -79,13 +80,14 @@ static av_always_inline void scale(vec_s32 res[4], vec_s16 
res_packed[2],
 {
 int i;
 vec_u32 v_shift;
-
+__asm volatile ("MARK\n\t");
 switch(shift) {
 case  7: v_shift = vec_splat_u32(7) ; break;
 case 10: v_shift = vec_splat_u32(10); break;
 case 12: v_shift = vec_splat_u32(12); break;
 default: abort();
 }
+__asm volatile ("MARK-E2\n\t");
 
 for (i = 0; i < 4; i++)
 res[i] = vec_sra(res[i], v_shift);

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If you think the mosad wants you dead since a long time then you are either
wrong or dead since a long time.


signature.asc
Description: PGP signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH] avcodec/ppc/hevcdsp: Fix build failures with powerpc-linux-gnu-gcc-4.8 with --disable-optimizations

2018-12-04 Thread Carl Eugen Hoyos
2018-12-04 16:29 GMT+01:00, Michael Niedermayer :
> The affected functions could also be changed into macros, this is the
> smaller change to fix it though. And avoids (probably) less readable macros

> The extra code should be optimized out when optimizations are done as all
> values are known at build after inlining.

Shouldn't this be verified?
This is speed-critical code, no?

Carl Eugen
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


[FFmpeg-devel] [PATCH] avcodec/ppc/hevcdsp: Fix build failures with powerpc-linux-gnu-gcc-4.8 with --disable-optimizations

2018-12-04 Thread Michael Niedermayer
The affected functions could also be changed into macros, this is the
smaller change to fix it though. And avoids (probably) less readable macros
The extra code should be optimized out when optimizations are done as all values
are known at build after inlining.

Signed-off-by: Michael Niedermayer 
---
 libavcodec/ppc/hevcdsp.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/libavcodec/ppc/hevcdsp.c b/libavcodec/ppc/hevcdsp.c
index dcae43305a..c1d562a409 100644
--- a/libavcodec/ppc/hevcdsp.c
+++ b/libavcodec/ppc/hevcdsp.c
@@ -58,7 +58,13 @@ static av_always_inline void transform4x4(vec_s16 src_01, 
vec_s16 src_23,
 e1 = vec_msums(src_02, trans4[2], zero);
 o1 = vec_msums(src_13, trans4[3], zero);
 
-add = vec_sl(vec_splat_s32(1), vec_splat_u32(shift - 1));
+switch(shift) {
+case  7: add = vec_sl(vec_splat_s32(1), vec_splat_u32( 7 - 1)); break;
+case 10: add = vec_sl(vec_splat_s32(1), vec_splat_u32(10 - 1)); break;
+case 12: add = vec_sl(vec_splat_s32(1), vec_splat_u32(12 - 1)); break;
+default: abort();
+}
+
 e0 = vec_add(e0, add);
 e1 = vec_add(e1, add);
 
@@ -72,7 +78,14 @@ static av_always_inline void scale(vec_s32 res[4], vec_s16 
res_packed[2],
const int shift)
 {
 int i;
-vec_u32 v_shift = vec_splat_u32(shift);
+vec_u32 v_shift;
+
+switch(shift) {
+case  7: v_shift = vec_splat_u32(7) ; break;
+case 10: v_shift = vec_splat_u32(10); break;
+case 12: v_shift = vec_splat_u32(12); break;
+default: abort();
+}
 
 for (i = 0; i < 4; i++)
 res[i] = vec_sra(res[i], v_shift);
-- 
2.19.2

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel