On Sat, Dec 1, 2012 at 2:59 AM, Christophe Gisquet
<[email protected]> wrote:
> 2012/11/30 Loren Merritt <[email protected]>:
>> cpu is more relevant than os.
>
> Will amend the commit message, but then I may as well put both in each
> commit.
>
>>> +; r0q=Y   r1q=s_m   r2q=q_filt   r3q=noise  r4q=max_m
>>> +cglobal hf_apply_noise_main
>>
>> You can invoke DEFINE_ARGS even if not generating a prologue.
>
> I didn't know about DEFINE_ARGS, will use.
>
>>> +  movh       m3, [r1q + r4q]
>>> +  movh       m4, [r1q + r4q + 8]
>>
>> Can these be a single aligned load?
>
> Yes, but then I'm probably missing a trick here, because altering the
> above and following code like this:
>     movu       m3, [s_mq + max_mq]
>     mova       m4, m3
>     unpcklps   m3, m3
>     unpckhps   m4, m4
> is slower. (movhlps/unpcklps is even slower)
> Is there a way to do that in 3 insns then?

movu doesn't look like an aligned load to me...
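
For reference, here is a rough C intrinsics sketch of the two variants being
compared. It is only an illustration under assumptions: the function names
are made up, and I am assuming the code that follows the two movh loads
duplicates each float with unpcklps (the quoted hunk does not show it). In
x86inc terms, mova corresponds to the aligned load (_mm_load_ps here) and
movu to the unaligned one (_mm_loadu_ps).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <xmmintrin.h>  /* SSE: __m128, _mm_load_ps, _mm_unpacklo_ps, ... */

/* Variant A (assumed shape of the original): two 8-byte loads, each
 * followed by unpcklps with itself to duplicate the two floats. */
static void dup_two_half_loads(const float *src, float lo[4], float hi[4])
{
    __m128 a = _mm_loadl_pi(_mm_setzero_ps(), (const __m64 *)src);
    __m128 b = _mm_loadl_pi(_mm_setzero_ps(), (const __m64 *)(src + 2));
    _mm_storeu_ps(lo, _mm_unpacklo_ps(a, a));  /* s0 s0 s1 s1 */
    _mm_storeu_ps(hi, _mm_unpacklo_ps(b, b));  /* s2 s2 s3 s3 */
}

/* Variant B: one aligned 16-byte load (the mova equivalent), then
 * unpcklps/unpckhps. src must be 16-byte aligned. */
static void dup_one_aligned_load(const float *src, float lo[4], float hi[4])
{
    assert(((uintptr_t)src & 15) == 0);
    __m128 v = _mm_load_ps(src);
    _mm_storeu_ps(lo, _mm_unpacklo_ps(v, v));  /* s0 s0 s1 s1 */
    _mm_storeu_ps(hi, _mm_unpackhi_ps(v, v));  /* s2 s2 s3 s3 */
}

/* Returns 1 if both variants produce identical output for a sample input. */
static int variants_agree(void)
{
    /* The union with __m128 forces 16-byte alignment of the floats. */
    union { __m128 v; float f[4]; } src;
    float lo_a[4], hi_a[4], lo_b[4], hi_b[4];

    src.f[0] = 1.0f; src.f[1] = 2.0f; src.f[2] = 3.0f; src.f[3] = 4.0f;

    dup_two_half_loads(src.f, lo_a, hi_a);
    dup_one_aligned_load(src.f, lo_b, hi_b);

    return memcmp(lo_a, lo_b, sizeof lo_a) == 0 &&
           memcmp(hi_a, hi_b, sizeof hi_a) == 0;
}
```

Both variants yield the same two registers; whether the single aligned load
is actually faster on a given CPU is something only a benchmark can tell.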

Jason
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel