http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49611
Summary: Inline asm should support input/output of flags
Product: gcc
Version: 4.5.2
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: inline-asm
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: scov...@gmail.com

The main reason I find myself writing inline asm is to do "clever" things with the flags register, especially in conjunction with unusual instructions. Some examples:

1. Using the sparc brz instruction if the compiler doesn't emit it (e.g. bug #40067).

2. Using the carry flag on x86 to determine whether an unsigned comparison a != b was greater or less than, using subtract-with-borrow (seen in gnu libc): "sbb %eax, %eax; sbb $-1, %eax" leaves %eax containing -1 if a < b and +1 if a > b.

3. AMD's "Advanced Synchronization Facility", which proposes a jmp-like instruction for starting hardware transactions. Its effect is similar to fork(): the first time past, it sets flags and eax to zero; a transaction failure resumes from the same PC, but with eax and flags set to reflect an error code.

4. In my experience, the main reason people want asm goto to allow outputs is that they can't export flags (otherwise the goto could become control flow in C/C++).

In all of these cases the inline asm becomes needlessly long simply because uses of the flags generated within the asm block will only work reliably within that asm block (including branches, loops, etc.).
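To make the arithmetic behind the subtract-with-borrow example easier to follow, here is a minimal portable sketch that emulates the two sbb instructions in plain C++. The helper names (SbbResult, sbb, sbb_sign, compare_unequal) are hypothetical and exist only for illustration; the sketch models just the destination value and the carry flag, not the other flags a real sbb updates.

```cpp
#include <cassert>
#include <cstdint>

// Result of one emulated x86 `sbb` (AT&T order: dst = dst - src - CF).
struct SbbResult {
    std::uint32_t value;  // new destination value
    bool carry;           // new carry flag: set when the subtraction borrows
};

// Hypothetical helper: emulates `sbb src, dst` on 32-bit operands.
SbbResult sbb(std::uint32_t dst, std::uint32_t src, bool carry_in) {
    // Widen to 64 bits so the borrow can be detected without overflow.
    std::uint64_t subtrahend = std::uint64_t{src} + (carry_in ? 1u : 0u);
    SbbResult r;
    r.value = static_cast<std::uint32_t>(std::uint64_t{dst} - subtrahend);
    r.carry = std::uint64_t{dst} < subtrahend;
    return r;
}

// Emulates "sbb %eax, %eax; sbb $-1, %eax" given the carry flag left by a
// preceding unsigned compare (carry set means a < b).
std::int32_t sbb_sign(std::uint32_t eax, bool carry) {
    SbbResult first = sbb(eax, eax, carry);  // 0 or 0xFFFFFFFF, carry kept
    SbbResult second = sbb(first.value, 0xFFFFFFFFu, first.carry);
    return static_cast<std::int32_t>(second.value);
}

// Orders two unequal unsigned values the way the asm sequence does.
std::int32_t compare_unequal(std::uint32_t a, std::uint32_t b) {
    assert(a != b);      // the trick is only applied once a != b is known
    bool carry = a < b;  // what an unsigned compare of a against b leaves in CF
    return sbb_sign(a, carry);  // the initial eax value does not matter
}
```

The first sbb turns the carry flag into 0 or -1 while preserving the carry; the second then computes eax + 1 - CF, which gives -1 when the carry was set and +1 when it was clear.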
Consider the following concrete example:

    #define EOL "\n"
    #define EOLT EOL "\t"

    long pstrcmp(unsigned char const* a, unsigned char const* b,
                 long* pout, long pin=0)
    {
        long delta, tmp;
        asm("#" EOL
            "1:" EOLT
            "movzb (%[a], %[n]), %k[tmp]" EOLT
            "movzb (%[b], %[n]), %k[delta]" EOLT
            "cmpb %b[delta], %b[tmp]" EOLT
            "jnz 2f" EOLT
            "testb %b[tmp], %b[tmp]" EOLT
            "jz 3f" EOLT
            "sub %[m1], %[n]" EOLT
            "jmp 1b" EOL
            "2:" EOLT
            "sbb %[delta], %[delta]" EOLT
            "sbb %[m1], %[delta]" EOL
            "3:"
            : [a] "+r"(a), [b] "+r"(b), [n] "+r"(pin),
              [delta] "=&q"(delta), [tmp] "=&q"(tmp)
            : [m1] "i"(-1)
            );
        *pout = pin;
        return delta;
    }

With inline asm support for flags it would look more like this:

    long pstrcmp(unsigned char const* a, unsigned char const* b,
                 long* pout, long pin=0)
    {
        long delta, tmp;
    again:
        if (a[pin] == b[pin]) {
            if (a[pin] != 0) {
                pin++;
                goto again;
            }
            else {
                delta = b[pin];
            }
        }
        else {
            asm("sbb %[delta], %[delta]" EOLT
                "sbb %[m1], %[delta]"
                : [delta] "=&r"(delta)
                : [m1] "i"(-1), "flags"(a[pin] != b[pin])
                );
        }
        *pout = pin;
        return delta;
    }

The intent is that the "flags" input specifier tells the compiler to arrange for flags to be set at entry to the asm block as if the expression passed to it had just completed (the compiler would warn/error if the effect that evaluating the expression would have on flags were unclear). In theory the optimizer should be able to eliminate common expressions and shuffle code to avoid materializing the flags at all.

Using flags as output (perhaps to pass as input to another inline asm block) might look like this:

    asm("cmp %0, %1" : "=flags"(flags) : "r"(a), "r"(b));
    ...
    asm("jz 1f" : : "flags"(flags));

The flags should probably take type 'int' in C. Ideally, the compiler could even recognize and optimize patterns like this:

    asm("cmp %0, %1" : "=flags"(flags) : "r"(a), "r"(b));
    enum { CF=1, ZF=0x40 };
    if (flags & CF) {
        ...
    }
    else if (flags & ZF) {
        ...
    }
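To illustrate what the last pattern is meant to express, here is a small portable sketch in which an ordinary function stands in for the asm block with a "=flags" output. Everything in it is an assumption made for illustration: cmp_flags and three_way are hypothetical names, the bit values follow the x86 EFLAGS layout (CF is bit 0, ZF is bit 6), the compare is assumed to compute a - b, and only CF and ZF are modelled.

```cpp
#include <cassert>
#include <cstdint>

// x86 EFLAGS bit values for the two flags used in the pattern.
enum : int { CF = 0x01, ZF = 0x40 };

// Hypothetical stand-in for an asm block with a "=flags" output: returns the
// CF and ZF bits that an unsigned x86 compare computing a - b would produce.
// Other flags (SF, OF, PF, AF) are left out of this sketch.
constexpr int cmp_flags(std::uint32_t a, std::uint32_t b) {
    return (a < b ? CF : 0) | (a == b ? ZF : 0);
}

// The branch pattern from the report: -1 if a < b, 0 if a == b, +1 if a > b.
int three_way(std::uint32_t a, std::uint32_t b) {
    int flags = cmp_flags(a, b);
    if (flags & CF) {
        return -1;
    } else if (flags & ZF) {
        return 0;
    }
    assert(flags == 0);  // neither borrow nor equality, so a > b
    return 1;
}
```

With real flag outputs the compiler would ideally turn the two tests into conditional branches on the flags left by the compare, without ever materializing the integer.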