Hi, Can a gatekeeper review the attached patch related to stdcall for me please? Thank you very much.
Background ---------- The problem is related to gcc's extension, namely the stdcall function attributes. It is exposed while using gnu m4 as the test case. According to gcc, stdcall On the Intel 386, the stdcall attribute causes the compiler to assume that the called function will pop off the stack space used to pass arguments, unless it takes a variable number of arguments. For example, the following code int __attribute__ ((stdcall)) foo(int a, int b) { int j; j = a + b; return j; } is compiled into (with gcc -m32 -O0): foo: pushl %ebp movl %esp, %ebp subl $16, %esp movl 12(%ebp), %eax addl 8(%ebp), %eax movl %eax, -4(%ebp) movl -4(%ebp), %eax leave ret $8 <== note $8 here, which is the space needed for passing actual parameters, callee adjusts sp and the callsite is compiled into: call foo subl $8, %esp <== generated only when __attribute__ ((stdcall)) is declared on `foo' Bug 1: m4 runtime failure when compiled with `-O2' or `-Os' (-m32) ------------------------------------------------------------------ This bug is caused by tail call optimization in CG. Basically, in some cases, CG decides if it can just transform a call of B --> C into a jump instruction. In m4, we have the following kind of code: static reg_errcode_t __attribute__ ((regparm (3), stdcall)) get_subexp_sub (re_match_context_t *mctx, const re_sub_match_top_t *sub_top, re_sub_match_last_t *sub_last, Idx bkref_node, Idx bkref_str) { ... return clean_state_log_if_needed (mctx, to_idx); } static reg_errcode_t __attribute__ ((regparm (3), stdcall)) clean_state_log_if_needed (re_match_context_t *mctx, Idx next_state_log_idx) For function `get_subexp_sub', `regparm(3)' causes the first three arguments to be passed via registers and the last two via stack. So, the stack size needed for the actual parameters is 8. Therefore, you will see a `ret $8' instruction at the end of this function. On the other hand, all the parameters for `clean_state_log_if_needed' are passed via registers because of the `regparm(3)' declaration. So, you will see a `ret' instruction at the end instead (no sp adjustment needed). Because the `sp' adjustment is different for these two functions, the call to `clean_state_log_if_needed' cannot be converted into a `jump' instruction in this case. The original code fails to check to see if the stack adjustments are same. Bug 2: m4 runtime failure when compiled with `-Ofast' (-m32) ------------------------------------------------------------ This bug is also caused by `stdcall'. In function `Initialize_Stack_Frame', `Calc_Formal_Area' is called to compute the stack area from types of the formal parameters. On the other hand, when processing a callsite, `Calc_Actual_Area' is called to compute the stack area from types of the *actual* parameters. Ideally, the results should be same for the same function. The compiler also assumes so, since it uses `arg_area_size_array' to store the computation result (regardless whether it is from `Calc_Formal_Area' or `Calc_Actual_Area'), so that only one such computation is needed for each function type. In m4 code, we have the following two functions: static reg_errcode_t __attribute__ ((regparm (3), stdcall)) re_search_internal (const regex_t *preg, const char *string, Idx length, Idx start, Idx last_start, Idx stop, size_t nmatch, regmatch_t pmatch[], int eflags) //calls set_regs static reg_errcode_t __attribute__ ((regparm (3), stdcall)) set_regs (const regex_t *preg, const re_match_context_t *mctx, size_t nmatch, regmatch_t *pmatch, _Bool fl_backtrack) For `set_regs' function, because of `regparm(3)', the first 3 arguments are passed via registers. So, for the call to `set_regs', only the last two arguments need to be passed via stack. The last parameter of `set_regs' is of _Bool type, which is of size 1 byte. However, the actual parameter needs to be at least 4 byte (I4) even for _Bool. So, the stack size computed from `Calc_Actual_Area' is 8 while it is 5 from `Calc_Formal_Area'. So, now what you have in the generated code looks like: re_search_internal: call set_regs sub $8, sp set_regs: ... ret $5 Again, this is caused inconsistency in sp adjustment. The fix is to take account of padding in `Calc_Formal_Area' as well because of stack alignment. Additional Notes ---------------- 1. Bug 2 does not happen in `-O3' because the compiler sees calls to `set_regs' before compiling `set_regs'. Therefore, the stack size computed is 8. Then, when `set_regs' function is being compiled, because the `arg_area_size_array' has already been filled for this function type, `Calc_Formal_Area' does not re-compute. So you get `ret $8' at the end of `set_regs' function. -- Feng
stdcall.patch
Description: Binary data
------------------------------------------------------------------------------ WhatsUp Gold - Download Free Network Management Software The most intuitive, comprehensive, and cost-effective network management toolset available today. Delivers lowest initial acquisition cost and overall TCO of any competing solution. http://p.sf.net/sfu/whatsupgold-sd
_______________________________________________ Open64-devel mailing list Open64-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/open64-devel