Hi,

On Sat, Oct 15, 2011 at 2:53 AM, Loren Merritt <[email protected]> wrote:
> On Fri, 14 Oct 2011, Ronald S. Bultje wrote:
>
>> +    packusdw        m0, m1
>> +    packusdw        m2, m3
>
> sse4

Ah, that's why Kieran's assembly was marked sse4. I'll also make an
sse2 version then, which additionally needs a pmaxsw x, zero.
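
As a rough illustration of why the extra pmaxsw is needed, here is a
scalar C model of one lane of each instruction. It assumes the sse2
version would pack with packssdw (signed saturation), which is not
stated above; the function names are made up for this sketch. The two
paths only agree while the input does not exceed 32767.

```c
#include <assert.h>
#include <stdint.h>

/* One lane of packusdw (SSE4.1): signed 32-bit to unsigned 16-bit,
 * saturating to [0, 65535]. */
uint16_t packusdw_lane(int32_t v)
{
    if (v < 0)     return 0;
    if (v > 65535) return 65535;
    return (uint16_t)v;
}

/* One lane of packssdw (SSE2): signed 32-bit to signed 16-bit,
 * saturating to [-32768, 32767]. */
int16_t packssdw_lane(int32_t v)
{
    if (v < -32768) return -32768;
    if (v >  32767) return  32767;
    return (int16_t)v;
}

/* One lane of "pmaxsw x, zero": signed 16-bit maximum against 0. */
int16_t pmaxsw_zero_lane(int16_t v)
{
    return v < 0 ? 0 : v;
}

/* Assumed sse2 replacement: packssdw followed by pmaxsw x, zero. */
uint16_t sse2_fallback_lane(int32_t v)
{
    return (uint16_t)pmaxsw_zero_lane(packssdw_lane(v));
}

/* Checks that both paths agree for every input up to 32767. */
void check_lanes_agree(void)
{
    for (int32_t v = -100000; v <= 32767; v++)
        assert(packusdw_lane(v) == sse2_fallback_lane(v));
}
```

Above 32767 the fallback clamps to 32767 where packusdw would go up to
65535, so this substitution is only safe if the values are known to fit
in 15 bits.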

> Are things usually unaligned?

No, I'm a little too pessimistic in this patch. In fact, the src in
this function is always aligned, so these should be mova. I'm not sure
about dest: in my tests it tends to be aligned, but I don't think the
API guarantees that. I can test for alignment at function start and
split the loop into two copies, one for aligned dest and one for
unaligned dest.
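
The alignment test and loop split could look roughly like the C sketch
below. It is only a model: copy_pixels, is_aligned16 and the store_path
enum are hypothetical names, the memcpy calls stand in for the 16-byte
mova/movu stores, and the real function does more than copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum store_path { STORE_ALIGNED, STORE_UNALIGNED };

/* Returns 1 if p is 16-byte aligned, which is what mova requires. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15) == 0;
}

/* Hypothetical kernel: copies n 16-bit values and reports which loop
 * was chosen. The alignment of dst is tested once, at function start. */
enum store_path copy_pixels(uint16_t *dst, const uint16_t *src, size_t n)
{
    enum store_path path;
    size_t i = 0;

    assert(dst != NULL && src != NULL);
    path = is_aligned16(dst) ? STORE_ALIGNED : STORE_UNALIGNED;

    if (path == STORE_ALIGNED) {
        /* Aligned dest: the asm loop would store with mova here. */
        for (; i + 8 <= n; i += 8)
            memcpy(dst + i, src + i, 16);
    } else {
        /* Unaligned dest: the asm loop would store with movu here. */
        for (; i + 8 <= n; i += 8)
            memcpy(dst + i, src + i, 16);
    }

    /* Scalar tail for the last n % 8 values. */
    for (; i < n; i++)
        dst[i] = src[i];

    return path;
}
```

The branch is taken once per call rather than once per iteration, so
the aligned case pays nothing for supporting the unaligned one.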

Ronald
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
