Alex Converse <[email protected]> writes: > On Mon, Mar 19, 2012 at 1:51 PM, Jason Garrett-Glaser <[email protected]>wrote: > >> On Mon, Mar 19, 2012 at 1:43 PM, Ronald S. Bultje <[email protected]> >> wrote: >> > Hi, >> > >> > On Mon, Mar 19, 2012 at 1:12 PM, Jason Garrett-Glaser <[email protected]> >> wrote: >> >> On Mon, Mar 19, 2012 at 12:25 PM, Ronald S. Bultje <[email protected]> >> wrote: >> >>> On Mon, Mar 19, 2012 at 12:03 PM, Jason Garrett-Glaser <[email protected]> >> wrote: >> >>>> On Mon, Mar 19, 2012 at 10:14 AM, Ronald S. Bultje < >> [email protected]> wrote: >> >>>>> On Mon, Mar 19, 2012 at 9:48 AM, Jason Garrett-Glaser < >> [email protected]> wrote: >> >>>>>> On Sat, Mar 17, 2012 at 9:34 AM, Ronald S. Bultje < >> [email protected]> wrote: >> >>>>>>> --- >> >>>>>>> libavcodec/x86/cabac.h | 17 ++++++++++------- >> >>>>>>> 1 files changed, 10 insertions(+), 7 deletions(-) >> >>>>>>> >> >>>>>>> diff --git a/libavcodec/x86/cabac.h b/libavcodec/x86/cabac.h >> >>>>>>> index 3c3652d..c4832c3 100644 >> >>>>>>> --- a/libavcodec/x86/cabac.h >> >>>>>>> +++ b/libavcodec/x86/cabac.h >> >>>>>>> @@ -105,8 +105,8 @@ static av_always_inline int >> get_cabac_bypass_sign_x86(CABACContext *c, int val) >> >>>>>>> { >> >>>>>>> x86_reg tmp; >> >>>>>>> __asm__ volatile( >> >>>>>>> - "movl %4, %k1 \n\t" >> >>>>>>> - "movl %2, %%eax \n\t" >> >>>>>>> + "movl %c5(%2), %k1 \n\t" >> >>>>>>> + "movl %c3(%2), %%eax \n\t" >> >>>>>>> "shl $17, %k1 \n\t" >> >>>>>>> "add %%eax, %%eax \n\t" >> >>>>>>> "sub %k1, %%eax \n\t" >> >>>>>>> @@ -117,7 +117,7 @@ static av_always_inline int >> get_cabac_bypass_sign_x86(CABACContext *c, int val) >> >>>>>>> "sub %%edx, %%ecx \n\t" >> >>>>>>> "test %%ax, %%ax \n\t" >> >>>>>>> " jnz 1f \n\t" >> >>>>>>> - "mov %3, %1 \n\t" >> >>>>>>> + "mov %c4(%2), %1 \n\t" >> >>>>>>> "subl $0xFFFF, %%eax \n\t" >> >>>>>>> "movzwl (%1), %%edx \n\t" >> >>>>>>> "bswap %%edx \n\t" >> >>>>>>> @@ -126,11 +126,14 @@ static av_always_inline int >> get_cabac_bypass_sign_x86(CABACContext *c, int val) >> >>>>>>> "addl %%edx, %%eax \n\t" >> >>>>>>> "mov %1, %3 \n\t" >> >>>>>>> "1: \n\t" >> >>>>>>> - "movl %%eax, %2 \n\t" >> >>>>>>> + "movl %%eax, %c4(%2) \n\t" >> >>>>>>> >> >>>>>>> - :"+c"(val), "=&r"(tmp), "+m"(c->low), "+m"(c->bytestream) >> >>>>>>> - :"m"(c->range) >> >>>>>>> - : "%eax", "%edx" >> >>>>>>> + : "+c"(val), "=&r"(tmp) >> >>>>>>> + : "r"(c), >> >>>>>>> + "i"(offsetof(CABACContext, low)), >> >>>>>>> + "i"(offsetof(CABACContext, bytestream)), >> >>>>>>> + "i"(offsetof(CABACContext, range)) >> >>>>>>> + : "%eax", "%edx", "memory" >> >>>>>> >> >>>>>> IMO clobbering memory looks very very hacky, and I don't like it. >> If >> >>>>>> you need to clobber something, it'd be much better if we could >> clobber >> >>>>>> exactly what needs clobbering, and nothing more. >> >>>>> >> >>>>> Well, I don't think inline assembly supports explicitely clobbering >> >>>>> variables without marking them as "+m" or "+r", which messes up the >> >>>>> register allocator for at least gcc-4.2.1 (it uses a different >> >>>>> register for each "m"(c->...), thus running out of registers; yes, >> >>>>> there's many things wrong there). >> >>>> >> >>>> You can clobber a memory location without referencing it in the asm, >> >>>> and thus without allocating a register for it. >> >>> >> >>> That sounds useful, how do I do that? >> >> >> >> Just add +m arguments and don't use them, that's all. >> >> >> >> Here's an example from an unfinished patch of mine: >> >> >> >> +static ALWAYS_INLINE void x264_cabac_encode_decision( x264_cabac_t >> >> *cb, int i_ctx, int b ) >> >> +{ >> >> + asm( >> >> + "call %P8\n" >> >> + :"+S"(i_ctx),"+d"(b), "+D"(cb->i_range), "+m"(cb->i_low), >> >> "+m"(cb->i_queue), "+m"(cb->i_bytes_outstanding), "+m"(cb->p) >> >> + :"a"(cb),"X"(x264_cabac_encode_decision_asm) >> >> + :"%ecx" >> >> + ); >> >> +} >> > >> > That's how we got here in the first place. gcc-4.2.1 and clang-2.9 >> > allocate a register for "+m"(struct->val) pairs, causing the compiler >> > to run out of registers. It simply won't compile, as silly as that >> > sounds. >> >> That's a compiler bug, make them fix it.
Does it do that even if the operand is never used explicitly? > From what I understand this is the system compiler on recent versions of OS > X. If this were a fringe compiler >From my point of view, OSX is fringe. Besides, we have plain C code too. The speed difference is not staggering. -- Måns Rullgård [email protected] _______________________________________________ libav-devel mailing list [email protected] https://lists.libav.org/mailman/listinfo/libav-devel
