On 2017-02-09 09:50:48 +0200, Martin Storsjö wrote:
> On Thu, 9 Feb 2017, Janne Grunau wrote:
>
> >On 2017-02-05 14:05:49 +0200, Martin Storsjö wrote:
> >>On Sun, 5 Feb 2017, Janne Grunau wrote:
> >>
> >>>> // out1 = in1 + in2
> >>>> // out2 = in1 - in2
> >>>> .macro butterfly_8h out1, out2, in1, in2
> >>>>@@ -463,7 +510,7 @@ function idct16x16_dc_add_neon
> >>>>         ret
> >>>> endfunc
> >>>>
> >>>>-function idct16
> >>>>+.macro idct16_full
> >>>>         dmbutterfly0    v16, v24, v16, v24, v2, v3, v4, v5, v6, v7 // v16 = t0a, v24 = t1a
> >>>>         dmbutterfly     v20, v28, v0.h[1], v0.h[2], v2, v3, v4, v5 // v20 = t2a, v28 = t3a
> >>>>         dmbutterfly     v18, v30, v0.h[3], v0.h[4], v2, v3, v4, v5 // v18 = t4a, v30 = t7a
> >>>>@@ -485,7 +532,10 @@ function idct16
> >>>>         dmbutterfly0    v22, v26, v22, v26, v2, v3, v18, v19, v30, v31 // v22 = t6a, v26 = t5a
> >>>>         dmbutterfly     v23, v25, v0.h[1], v0.h[2], v18, v19, v30, v31 // v23 = t9a, v25 = t14a
> >>>>         dmbutterfly     v27, v21, v0.h[1], v0.h[2], v18, v19, v30, v31, neg=1 // v27 = t13a, v21 = t10a
> >>>>+        idct16_end
> >>>
> >>>I think it would be clearer if idct16_end is used directly from the macro.
> >>>it would probably also make sense to move idct16_end and avoid the
> >>>idct16_full macro. The patch might be smaller and it is immediately
> >>>obvious that there is no code change but the resulting code is more
> >>>comlicated than it needs to be. same applies to arm if we go with
> >>>alternative 1.
> >>
> >>Ok, so you mean like this?
> >>
> >>function idct16
> >>        dmbutterfly...
> >>        ....
> >>        idct16_end
> >>endfunc
> >
> >that would be one option, the other would be to move the idct_end
> >instructions as a macro out of the the existing idct16 function and use it
> >as macro. That would make the full idct structural identical to the half
> >and quarter version and avoid a macro only used once.
>
> I'm not really following what you're suggesting here - can you outline it
> with a code sample like mine above?

Sorry, it seems I wasn't fully awake. I misread your code snippet. To avoid
any confusion, here is what I meant, outlined as a pseudo patch:

@@
+.macro idct16_end
+        [code from the existing idct16 function]
+.endm
+
 function idct16
@@
         ...
+        idct16_end
-        [code moved to the idct16_end macro]
 endfunc

Janne
