[Bug inline-asm/40124] New: Inline asm should support limited control flow

scovich at gmail dot com Tue, 12 May 2009 09:01:19 -0700

Right now gcc does not officially support local control flow of any kind in
inline asm. The user is free to jump around within an asm block using local
labels, but this is cumbersome and bug-prone. Jumping out of an asm block in
any way is (un?)officially verboten.


This RFE is for support of jumping from an asm block to a label defined
somewhere in the enclosing function. It does not deal with jumps between asm
blocks or non-local control flow like function calls (both of which are
understandably nasty).

Rationale:

1. There is demand for this feature. MSVC supports it (see
http://msdn.microsoft.com/en-us/library/aa279405(VS.60).aspx), and people
currently kludge it in gcc by defining assembler labels in other asm blocks and
jumping to them, even though the gcc docs explicitly forbid it (e.g.
http://stackoverflow.com/questions/744055/gcc-inline-assembly-jump-to-label-outside-block).
Jumping to a C label is less error-prone, more general, and significantly
easier to implement than asm-asm jumps (see point 4).

2. It's already implemented. The assembler label corresponding to a C label is
accessible like this -- asm("jmp %l0" ::"i"(&&label)) -- where "%l" emits the
label "with the syntax expected of a jump target" (Brennan's inline asm guide).
All that's left is for the compiler to mark such labels as
used-by-computed-goto so the optimizer plays nice; even that can be faked with
judicious use of computed gotos (see example code, below).

3. Its effect is basically the same as for a computed goto, which is already
supported. I suspect that every argument that jumping from asm to C is
inherently unsafe or difficult to implement applies equally well to computed
gotos (e.g. Bug #37722).

4. It would encourage smaller asm blocks because internal control flow would
seldom be necessary any more. The resulting code would be easier to write and
debug, more portable, and less fragile. Further, the example below suggests
that the compiler may generate better code as well. 
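
For reference, the computed-goto mechanism that points 2 and 3 lean on looks
roughly like this in plain GNU C. This is only a minimal sketch, not code from
the report: pick_path is a hypothetical name, and the increment stands in for
a real slow-path handler.

```c
/* GNU C "labels as values" extension (gcc, clang); not ISO C.
 * pick_path is a hypothetical helper used only for illustration. */
int pick_path(int overflowed) {
    /* &&label yields the address of a label in the current function. */
    static void *const targets[] = { &&no_overflow, &&overflow };
    int rval = 0;

    /* Computed goto: jump to whichever label the table selects. */
    goto *targets[overflowed != 0];

 overflow:
    rval += 1;   /* stand-in for the slow-path handler */
 no_overflow:
    return rval;
}
```

Because the label addresses are taken with && and reached through a goto *,
the compiler already has to treat both labels as possible jump targets, which
is the property an asm-to-label jump would also need.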

Example: Efficient use of condition codes

Currently the only way to use condition codes generated within an asm block is
to either convert them to an output value (which requires extra instructions to
create and test) or to embed the if/else inside the asm block (forcing the user
to write extra assembly code even if it was otherwise portable). Support for
jumping from asm to local labels would allow a better idiom:

void handle_overflow(int a, int b, int* dest);

int ideal_foo(int a, int b) {
    int rval = a;
    asm("adc    %[src], %[dest]\n\t"
        "jno    %l[no_overflow]"
        : [dest] "+r"(rval)
        : [src] "r"(b), [no_overflow] "i"(&&done));
    handle_overflow(a, b, &rval);
 done:
    return rval;
}

int current_foo(int a, int b) {
    int overflow = 0;
    int rval = a;
    asm("adc    %[src], %[dest]\n\t"
        "cmovo  %[true], %[overflow]"
        : [dest] "+r"(rval), [overflow] "+r"(overflow)
        : [src] "r"(b), [true] "r"(1));
    if(overflow)
        handle_overflow(a, b, &rval);
    return rval;
}

int hacked_foo(int a, int b) {
    static void* volatile labels[] = {&&overflow, &&no_overflow};
    int rval=a;
    asm("adc    %[src], %[dest]\n\t"
        "jno    %l[no_overflow]\n\t"
        "jmp    %l[overflow]"
        : [dest] "=r"(rval)
        : [src] "r"(b), "[dest]"(rval),
        [no_overflow] "i"(&&no_overflow), [overflow] "i"( &&overflow));
    goto *labels[0];
 overflow:
    handle_overflow(a, b, &rval);
 no_overflow:
    return rval;
}


Extracts of the x86_64 assembler output for "x86_64-unknown-linux-gnu-gcc-4.2.2 -S
-O3" follow. Notice that ideal_foo is clearly the best, *except* that the
optimizer broke it by moving .L9 to the top of the function (it should be just
above the addq). If it were optimized better it would take only four
instructions on the short path. hacked_foo would also beat current_foo by quite
a bit, except the compiler decided to set up a stack frame for it:


ideal_foo:
.L9:
        subq    $16, %rsp
        movl    %edi, %eax
        leaq    12(%rsp), %rdx
        movl    %edi, 12(%rsp)
#APP
        adc     %esi, %eax
        jno     .L9
#NO_APP
        movl    %eax, 12(%rsp)
        call    handle_overflow
        movl    12(%rsp), %eax
        addq    $16, %rsp
        ret


current_foo:
        pushq   %rbx
        xorl    %eax, %eax
        movl    $1, %edx
        movl    %eax, %ebx
        movl    %edi, %ecx
        subq    $16, %rsp
#APP
        adc     %esi, %ecx
        cmovo   %edx, %ebx
#NO_APP
        testl   %ebx, %ebx
        movl    %edi, 12(%rsp)
        movl    %ecx, %eax
        movl    %ecx, 12(%rsp)
        je      .L12
        leaq    12(%rsp), %rdx
        call    handle_overflow
        movl    12(%rsp), %eax
.L12:
        addq    $16, %rsp
        popq    %rbx
        ret


hacked_foo:
        movq    %rbx, -16(%rsp)
        movq    %rbp, -8(%rsp)
        subq    $32, %rsp
        movl    %edi, 12(%rsp)
        movl    %edi, %eax
#APP
        adc     %esi, %eax
        jno     .L4
        jmp     .L5
#NO_APP
        movl    %eax, 12(%rsp)
        movq    labels.1894(%rip), %rax
        jmp     *%rax
        .p2align 4,,7
.L5:
        leaq    12(%rsp), %rdx
        call    handle_overflow
.L4:
        movl    12(%rsp), %eax
        movq    16(%rsp), %rbx
        movq    24(%rsp), %rbp
        addq    $32, %rsp
        ret


-- 
           Summary: Inline asm should support limited control flow
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: inline-asm
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: scovich at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40124
