On Sat, Dec 1, 2012 at 2:59 AM, Christophe Gisquet <[email protected]> wrote:
> 2012/11/30 Loren Merritt <[email protected]>:
>> cpu is more relevant than os.
>
> Will amend commit message, but then I may as well put in each commit both,
> then.
>
>>> +; r0q=Y r1q=s_m r2q=q_filt r3q=noise r4q=max_m
>>> +cglobal hf_apply_noise_main
>>
>> You can invoke DEFINE_ARGS even if not generating a prologue.
>
> I didn't know about DEFINE_ARGS, will use.
>
>>> +    movh        m3, [r1q + r4q]
>>> +    movh        m4, [r1q + r4q + 8]
>>
>> Can these be a single aligned load?
>
> Yes, but then I'm probably missing a trick here, because altering the
> above and following code like that:
>     movu      m3, [s_mq + max_mq]
>     mova      m4, m3
>     unpcklps  m3, m3
>     unpckhps  m4, m4
> is slower. (movhlps/unpcklps is even slower)
> Is there a way to do that in 3 insns then?

movu doesn't look like an aligned load to me...

Jason
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
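
For illustration only, here is a rough C sketch of the single-aligned-load variant discussed above, written with SSE intrinsics rather than the x264asm macros used in the patch. The helper name and signature are hypothetical, and it assumes the source pointer is 16-byte aligned, which the thread does not establish for s_mq + max_mq. _mm_load_ps corresponds to an aligned load (movaps, i.e. mova), whereas _mm_loadu_ps would be the movu form quoted above.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Hypothetical helper, not taken from the patch.
 * Assumption: src is 16-byte aligned and points at 4 floats.
 * Writes {s0,s0,s1,s1} to lo and {s2,s2,s3,s3} to hi. */
static void dup_pairs_aligned(const float *src, float *lo, float *hi)
{
    __m128 v = _mm_load_ps(src);               /* one aligned 16-byte load */
    _mm_storeu_ps(lo, _mm_unpacklo_ps(v, v));  /* unpcklps: duplicate low pair */
    _mm_storeu_ps(hi, _mm_unpackhi_ps(v, v));  /* unpckhps: duplicate high pair */
}
```

Whether this is faster than two 8-byte loads depends on the CPU, so it would need benchmarking like the variants mentioned in the thread.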
