On 2017-02-09 09:50:48 +0200, Martin Storsjö wrote:
> On Thu, 9 Feb 2017, Janne Grunau wrote:
> 
> >On 2017-02-05 14:05:49 +0200, Martin Storsjö wrote:
> >>On Sun, 5 Feb 2017, Janne Grunau wrote:
> >>
> >>>> // out1 = in1 + in2
> >>>> // out2 = in1 - in2
> >>>> .macro butterfly_8h out1, out2, in1, in2
> >>>>@@ -463,7 +510,7 @@ function idct16x16_dc_add_neon
> >>>>         ret
> >>>> endfunc
> >>>>
> >>>>-function idct16
> >>>>+.macro idct16_full
> >>>>         dmbutterfly0    v16, v24, v16, v24, v2, v3, v4, v5, v6, v7 // v16 = t0a,  v24 = t1a
> >>>>         dmbutterfly     v20, v28, v0.h[1], v0.h[2], v2, v3, v4, v5 // v20 = t2a,  v28 = t3a
> >>>>         dmbutterfly     v18, v30, v0.h[3], v0.h[4], v2, v3, v4, v5 // v18 = t4a,  v30 = t7a
> >>>>@@ -485,7 +532,10 @@ function idct16
> >>>>         dmbutterfly0    v22, v26, v22, v26, v2, v3, v18, v19, v30, v31        // v22 = t6a,  v26 = t5a
> >>>>         dmbutterfly     v23, v25, v0.h[1], v0.h[2], v18, v19, v30, v31        // v23 = t9a,  v25 = t14a
> >>>>         dmbutterfly     v27, v21, v0.h[1], v0.h[2], v18, v19, v30, v31, neg=1 // v27 = t13a, v21 = t10a
> >>>>+        idct16_end
> >>>
> >>>I think it would be clearer if idct16_end is used directly from the macro.
> >>>It would probably also make sense to move idct16_end and avoid the
> >>>idct16_full macro. The patch might be smaller and it is immediately
> >>>obvious that there is no code change, but the resulting code is more
> >>>complicated than it needs to be. The same applies to arm if we go with
> >>>alternative 1.
> >>
> >>Ok, so you mean like this?
> >>
> >>function idct16
> >>        dmbutterfly...
> >>        ....
> >>        idct16_end
> >>endfunc
> >
> >That would be one option; the other would be to move the idct_end
> >instructions out of the existing idct16 function into a macro and use it
> >as a macro. That would make the full idct structurally identical to the half
> >and quarter versions and avoid a macro that is only used once.
> 
> I'm not really following what you're suggesting here - can you outline it
> with a code sample like mine above?

Sorry, it seems I wasn't fully awake; I misread your code snippet. To
avoid any confusion, here is what I meant, outlined as a pseudo patch:

@@
+.macro idct16_end
+    [code from the existing idct16 function]
+.endm
+
 function idct16
@@ ...
 
+    idct16_end
-    [code moved to the idct16_end macro]
 endfunc
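
Applied, the file would end up laid out roughly as below. This is only a
sketch: the bodies are left out on purpose, and the comment marks where the
code that currently ends idct16 would go, unchanged.

.macro idct16_end
        // tail of the existing idct16 function, moved here as-is
.endm

function idct16
        dmbutterfly...
        ....
        idct16_end
endfunc

That way idct16 stays a plain function and no idct16_full macro is needed.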

Janne