Hi,

Can a gatekeeper review the attached patch related to stdcall for me
please? Thank you very much.

Background
----------
The problem involves a gcc extension, the stdcall function attribute. It was
exposed while using GNU m4 as a test case. According to the gcc
documentation:

         stdcall
    On the Intel 386, the stdcall attribute causes the compiler to assume
    that the called function will pop off the stack space used to pass
    arguments, unless it takes a variable number of arguments.

For example, the following code

int  __attribute__ ((stdcall)) foo(int a, int b) {
    int j; j = a + b; return j;
}

is compiled into (with gcc -m32 -O0):

foo:
    pushl   %ebp
    movl    %esp, %ebp
    subl    $16, %esp
    movl    12(%ebp), %eax
    addl    8(%ebp), %eax
    movl    %eax, -4(%ebp)
    movl    -4(%ebp), %eax
    leave
    ret $8                         <== note $8 here: the space needed for
                                       passing the actual parameters; the
                                       callee adjusts sp

and the callsite is compiled into:

    call    foo
    subl    $8, %esp               <== generated only when
                                       __attribute__ ((stdcall)) is
                                       declared on `foo'


Bug 1: m4 runtime failure when compiled with `-O2' or `-Os' (-m32)
------------------------------------------------------------------
This bug is caused by tail call optimization in CG. In some cases, CG decides
whether a call from B to C can be transformed into a jump instruction.
In m4, we have the following kind of code:

static reg_errcode_t __attribute__ ((regparm (3), stdcall))
get_subexp_sub (re_match_context_t *mctx, const re_sub_match_top_t *sub_top,
                re_sub_match_last_t *sub_last, Idx bkref_node, Idx bkref_str)
{
  ...
  return clean_state_log_if_needed (mctx, to_idx);
}


static reg_errcode_t __attribute__ ((regparm (3), stdcall))
clean_state_log_if_needed (re_match_context_t *mctx, Idx next_state_log_idx);


For function `get_subexp_sub', `regparm(3)' causes the first three arguments to
be passed via registers and the last two via stack. So, the stack size needed
for the actual parameters is 8. Therefore, you will see a `ret $8' instruction
at the end of this function. On the other hand, all the parameters for
`clean_state_log_if_needed' are passed via registers because of the
`regparm(3)' declaration. So, you will see a `ret' instruction at the end
instead (no sp adjustment needed). Because the `sp' adjustment is different for
these two functions, the call to `clean_state_log_if_needed' cannot be
converted into a `jump' instruction in this case. The original code does not
check whether the stack adjustments are the same.
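
The missing check can be sketched as below. This is only an illustration of
the condition described above, not the code in the patch: `call_conv_t',
`callee_pop_bytes' and `tail_call_pop_compatible' are hypothetical names, and
equal pop sizes are a necessary condition for the transformation, not a
sufficient one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a function's calling convention.
   This is an illustrative type, not an Open64 data structure. */
typedef struct {
    bool   is_stdcall;      /* callee pops its own stack arguments */
    bool   is_variadic;     /* variadic functions never pop in the callee */
    size_t stack_arg_bytes; /* bytes of arguments passed on the stack */
} call_conv_t;

/* Bytes the function removes from the stack on return (the N in `ret $N'). */
static size_t callee_pop_bytes(const call_conv_t *f)
{
    assert(f != NULL);
    return (f->is_stdcall && !f->is_variadic) ? f->stack_arg_bytes : 0;
}

/* A call may be turned into a jump only if the target pops exactly what the
   enclosing function would have popped on its own return. */
static bool tail_call_pop_compatible(const call_conv_t *caller,
                                     const call_conv_t *callee)
{
    return callee_pop_bytes(caller) == callee_pop_bytes(callee);
}
```

With the numbers from the m4 case, `get_subexp_sub' pops 8 bytes and
`clean_state_log_if_needed' pops 0, so the check rejects the transformation.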

Bug 2: m4 runtime failure when compiled with `-Ofast' (-m32)
------------------------------------------------------------
This bug is also caused by `stdcall'. In function `Initialize_Stack_Frame',
`Calc_Formal_Area' is called to compute the stack area from types of the formal
parameters. On the other hand, when processing a callsite, `Calc_Actual_Area'
is called to compute the stack area from types of the *actual* parameters.
Ideally, the results should be the same for the same function. The compiler
also assumes so, since it uses `arg_area_size_array' to store the computation
result (regardless of whether it comes from `Calc_Formal_Area' or
`Calc_Actual_Area'), so that only one such computation is needed for each
function type.

In m4 code, we have the following two functions:

/* calls set_regs */
static reg_errcode_t __attribute__ ((regparm (3), stdcall))
re_search_internal (const regex_t *preg, const char *string, Idx length,
                    Idx start, Idx last_start, Idx stop, size_t nmatch,
                    regmatch_t pmatch[], int eflags)

static reg_errcode_t __attribute__ ((regparm (3), stdcall))
set_regs (const regex_t *preg, const re_match_context_t *mctx, size_t nmatch,
          regmatch_t *pmatch, _Bool fl_backtrack)

For `set_regs' function, because of `regparm(3)', the first 3 arguments are
passed via registers. So, for the call to `set_regs', only the last two
arguments need to be passed via stack. The last parameter of `set_regs' is of
_Bool type, which is 1 byte in size. However, the actual parameter occupies at
least 4 bytes (I4) on the stack, even for _Bool. So, the stack size computed
by `Calc_Actual_Area' is 8 (4 + 4), while `Calc_Formal_Area' returns 5 (4 + 1).
The generated code therefore looks like:

re_search_internal:
           call set_regs
           sub $8, sp

set_regs:
         ...
         ret $5

Again, this is caused by an inconsistency in the sp adjustment. The fix is to
account for stack-alignment padding in `Calc_Formal_Area' as well.
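
A minimal sketch of the arithmetic, assuming a 4-byte stack slot for -m32.
The names `round_up', `stack_area' and `STACK_SLOT_BYTES' are made up for
this example and do not reflect how `Calc_Formal_Area' is actually written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed stack slot size for -m32; hypothetical constant. */
#define STACK_SLOT_BYTES 4u

/* Round n up to the next multiple of align. */
static size_t round_up(size_t n, size_t align)
{
    assert(align > 0);
    return (n + align - 1) / align * align;
}

/* Sum the sizes of the stack-passed parameters. When pad is true, each
   parameter is first rounded up to a whole stack slot. */
static size_t stack_area(const size_t *param_sizes, size_t count, bool pad)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += pad ? round_up(param_sizes[i], STACK_SLOT_BYTES)
                     : param_sizes[i];
    return total;
}
```

For the two stack-passed parameters of `set_regs' (a 4-byte pointer and a
1-byte _Bool), the unpadded sum is 5 and the padded sum is 8, which matches
what the callsite computes.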

Additional Notes
----------------
1. Bug 2 does not happen in `-O3' because the compiler sees calls to `set_regs'
before compiling `set_regs'. Therefore, the stack size computed is 8. Then,
when `set_regs' function is being compiled, because the `arg_area_size_array'
has already been filled for this function type, `Calc_Formal_Area' does not
re-compute. So you get `ret $8' at the end of `set_regs' function.
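
The ordering effect in note 1 can be modelled with a small first-writer-wins
cache. This is a simplified sketch: `area_cache', `area_cache_reset' and
`area_lookup_or_store' are hypothetical stand-ins for `arg_area_size_array',
and it models only the `-O3' ordering described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table size and "not yet computed" marker. */
#define MAX_FUNC_TYPES 16u
#define AREA_UNSET (-1)

/* One cached stack-area size per function type. */
static int area_cache[MAX_FUNC_TYPES];

/* Mark every entry as not yet computed. */
static void area_cache_reset(void)
{
    for (size_t i = 0; i < MAX_FUNC_TYPES; i++)
        area_cache[i] = AREA_UNSET;
}

/* Return the cached size for type_id; if none is cached yet, store the
   freshly computed value first. Later computations are ignored. */
static int area_lookup_or_store(size_t type_id, int computed)
{
    assert(type_id < MAX_FUNC_TYPES);
    if (area_cache[type_id] == AREA_UNSET)
        area_cache[type_id] = computed;
    return area_cache[type_id];
}
```

If the callsite stores 8 first, the later formal-area query for the same
function type also returns 8, even though it would have computed 5 itself.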


-- Feng

Attachment: stdcall.patch
Description: Binary data
