Alex Converse <[email protected]> writes:

> On Mon, Mar 19, 2012 at 1:51 PM, Jason Garrett-Glaser <[email protected]>wrote:
>
>> On Mon, Mar 19, 2012 at 1:43 PM, Ronald S. Bultje <[email protected]>
>> wrote:
>> > Hi,
>> >
>> > On Mon, Mar 19, 2012 at 1:12 PM, Jason Garrett-Glaser <[email protected]>
>> wrote:
>> >> On Mon, Mar 19, 2012 at 12:25 PM, Ronald S. Bultje <[email protected]>
>> wrote:
>> >>> On Mon, Mar 19, 2012 at 12:03 PM, Jason Garrett-Glaser <[email protected]>
>> wrote:
>> >>>> On Mon, Mar 19, 2012 at 10:14 AM, Ronald S. Bultje <
>> [email protected]> wrote:
>> >>>>> On Mon, Mar 19, 2012 at 9:48 AM, Jason Garrett-Glaser <
>> [email protected]> wrote:
>> >>>>>> On Sat, Mar 17, 2012 at 9:34 AM, Ronald S. Bultje <
>> [email protected]> wrote:
>> >>>>>>> ---
>> >>>>>>>  libavcodec/x86/cabac.h |   17 ++++++++++-------
>> >>>>>>>  1 files changed, 10 insertions(+), 7 deletions(-)
>> >>>>>>>
>> >>>>>>> diff --git a/libavcodec/x86/cabac.h b/libavcodec/x86/cabac.h
>> >>>>>>> index 3c3652d..c4832c3 100644
>> >>>>>>> --- a/libavcodec/x86/cabac.h
>> >>>>>>> +++ b/libavcodec/x86/cabac.h
>> >>>>>>> @@ -105,8 +105,8 @@ static av_always_inline int
>> get_cabac_bypass_sign_x86(CABACContext *c, int val)
>> >>>>>>>  {
>> >>>>>>>     x86_reg tmp;
>> >>>>>>>     __asm__ volatile(
>> >>>>>>> -        "movl %4, %k1                           \n\t"
>> >>>>>>> -        "movl %2, %%eax                         \n\t"
>> >>>>>>> +        "movl %c5(%2), %k1                      \n\t"
>> >>>>>>> +        "movl %c3(%2), %%eax                    \n\t"
>> >>>>>>>         "shl $17, %k1                           \n\t"
>> >>>>>>>         "add %%eax, %%eax                       \n\t"
>> >>>>>>>         "sub %k1, %%eax                         \n\t"
>> >>>>>>> @@ -117,7 +117,7 @@ static av_always_inline int
>> get_cabac_bypass_sign_x86(CABACContext *c, int val)
>> >>>>>>>         "sub %%edx, %%ecx                       \n\t"
>> >>>>>>>         "test %%ax, %%ax                        \n\t"
>> >>>>>>>         " jnz 1f                                \n\t"
>> >>>>>>> -        "mov  %3, %1                            \n\t"
>> >>>>>>> +        "mov  %c4(%2), %1                       \n\t"
>> >>>>>>>         "subl $0xFFFF, %%eax                    \n\t"
>> >>>>>>>         "movzwl (%1), %%edx                     \n\t"
>> >>>>>>>         "bswap %%edx                            \n\t"
>> >>>>>>> @@ -126,11 +126,14 @@ static av_always_inline int
>> get_cabac_bypass_sign_x86(CABACContext *c, int val)
>> >>>>>>>         "addl %%edx, %%eax                      \n\t"
>> >>>>>>>         "mov  %1, %3                            \n\t"
>> >>>>>>>         "1:                                     \n\t"
>> >>>>>>> -        "movl %%eax, %2                         \n\t"
>> >>>>>>> +        "movl %%eax, %c4(%2)                    \n\t"
>> >>>>>>>
>> >>>>>>> -        :"+c"(val), "=&r"(tmp), "+m"(c->low), "+m"(c->bytestream)
>> >>>>>>> -        :"m"(c->range)
>> >>>>>>> -        : "%eax", "%edx"
>> >>>>>>> +        : "+c"(val), "=&r"(tmp)
>> >>>>>>> +        : "r"(c),
>> >>>>>>> +          "i"(offsetof(CABACContext, low)),
>> >>>>>>> +          "i"(offsetof(CABACContext, bytestream)),
>> >>>>>>> +          "i"(offsetof(CABACContext, range))
>> >>>>>>> +        : "%eax", "%edx", "memory"
>> >>>>>>
>> >>>>>> IMO clobbering memory looks very very hacky, and I don't like it.
>>  If
>> >>>>>> you need to clobber something, it'd be much better if we could
>> clobber
>> >>>>>> exactly what needs clobbering, and nothing more.
>> >>>>>
>> >>>>> Well, I don't think inline assembly supports explicitely clobbering
>> >>>>> variables without marking them as "+m" or "+r", which messes up the
>> >>>>> register allocator for at least gcc-4.2.1 (it uses a different
>> >>>>> register for each "m"(c->...), thus running out of registers; yes,
>> >>>>> there's many things wrong there).
>> >>>>
>> >>>> You can clobber a memory location without referencing it in the asm,
>> >>>> and thus without allocating a register for it.
>> >>>
>> >>> That sounds useful, how do I do that?
>> >>
>> >> Just add +m arguments and don't use them, that's all.
>> >>
>> >> Here's an example from an unfinished patch of mine:
>> >>
>> >> +static ALWAYS_INLINE void x264_cabac_encode_decision( x264_cabac_t
>> >> *cb, int i_ctx, int b )
>> >> +{
>> >> +    asm(
>> >> +        "call %P8\n"
>> >> +        :"+S"(i_ctx),"+d"(b), "+D"(cb->i_range), "+m"(cb->i_low),
>> >> "+m"(cb->i_queue), "+m"(cb->i_bytes_outstanding), "+m"(cb->p)
>> >> +        :"a"(cb),"X"(x264_cabac_encode_decision_asm)
>> >> +        :"%ecx"
>> >> +    );
>> >> +}
>> >
>> > That's how we got here in the first place. gcc-4.2.1 and clang-2.9
>> > allocate a register for "+m"(struct->val) pairs, causing the compiler
>> > to run out of registers. It simply won't compile, as silly as that
>> > sounds.
>>
>> That's a compiler bug, make them fix it.

Does it do that even if the operand is never used explicitly?

> From what I understand this is the system compiler on recent versions of OS
> X. If this were a fringe compiler

>From my point of view, OSX is fringe.  Besides, we have plain C code too.
The speed difference is not staggering.

-- 
Måns Rullgård
[email protected]
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel

Reply via email to