This patch looks fine to me. Please go ahead.
2011/5/4 Feng Zhou <fengzho...@gmail.com>
> Hi,
>
> Can a gatekeeper review the attached patch related to stdcall for me
> please? Thank you very much.
>
> Background
> ----------
> The problem is related to gcc's extension, namely the stdcall function
> attributes. It is exposed while using gnu m4 as the test case.
> According to gcc,
>
> stdcall
> On the Intel 386, the stdcall attribute causes the compiler to assume
> that
> the called function will pop off the stack space used to pass
> arguments, unless
> it takes a variable number of arguments.
>
> For example, the following code
>
> int __attribute__ ((stdcall)) foo(int a, int b) {
> int j; j = a + b; return j;
> }
>
> is compiled into (with gcc -m32 -O0):
>
> foo:
> pushl %ebp
> movl %esp, %ebp
> subl $16, %esp
> movl 12(%ebp), %eax
> addl 8(%ebp), %eax
> movl %eax, -4(%ebp)
> movl -4(%ebp), %eax
> leave
> ret $8 <== note $8 here, which is the space
> needed
> for passing actual parameters, callee adjusts sp
>
> and the callsite is compiled into:
>
> call foo
> subl $8, %esp <== generated only when __attribute__
> ((stdcall)) is declared on `foo'
>
>
> Bug 1: m4 runtime failure when compiled with `-O2' or `-Os' (-m32)
> ------------------------------------------------------------------
> This bug is caused by tail call optimization in CG. Basically, in some
> cases,
> CG decides if it can just transform a call of B --> C into a jump
> instruction.
> In m4, we have the following kind of code:
>
> static reg_errcode_t __attribute__ ((regparm (3), stdcall)) get_subexp_sub
> (re_match_context_t *mctx, const re_sub_match_top_t *sub_top,
> re_sub_match_last_t *sub_last, Idx bkref_node, Idx bkref_str) {
> ...
> return clean_state_log_if_needed (mctx, to_idx); }
>
>
> static reg_errcode_t __attribute__ ((regparm (3), stdcall))
> clean_state_log_if_needed (re_match_context_t *mctx, Idx
> next_state_log_idx)
>
>
> For function `get_subexp_sub', `regparm(3)' causes the first three
> arguments to
> be passed via registers and the last two via stack. So, the stack size
> needed
> for the actual parameters is 8. Therefore, you will see a `ret $8'
> instruction
> at the end of this function. On the other hand, all the parameters for
> `clean_state_log_if_needed' are passed via registers because of the
> `regparm(3)' declaration. So, you will see a `ret' instruction at the end
> instead (no sp adjustment needed). Because the `sp' adjustment is different
> for
> these two functions, the call to `clean_state_log_if_needed' cannot be
> converted into a `jump' instruction in this case. The original code fails
> to
> check to see if the stack adjustments are same.
>
> Bug 2: m4 runtime failure when compiled with `-Ofast' (-m32)
> ------------------------------------------------------------
> This bug is also caused by `stdcall'. In function `Initialize_Stack_Frame',
> `Calc_Formal_Area' is called to compute the stack area from types of the
> formal
> parameters. On the other hand, when processing a callsite,
> `Calc_Actual_Area'
> is called to compute the stack area from types of the *actual* parameters.
> Ideally, the results should be same for the same function. The compiler
> also
> assumes so, since it uses `arg_area_size_array' to store the computation
> result
> (regardless whether it is from `Calc_Formal_Area' or `Calc_Actual_Area'),
> so
> that only one such computation is needed for each function type.
>
> In m4 code, we have the following two functions:
>
> static reg_errcode_t __attribute__ ((regparm (3), stdcall))
> re_search_internal (const regex_t *preg, const char *string, Idx length,
> Idx start, Idx last_start, Idx stop, size_t nmatch, regmatch_t
> pmatch[],
> int eflags) //calls set_regs
>
> static reg_errcode_t __attribute__ ((regparm (3), stdcall))
> set_regs (const regex_t *preg, const re_match_context_t *mctx, size_t
> nmatch,
> regmatch_t *pmatch, _Bool fl_backtrack)
>
> For `set_regs' function, because of `regparm(3)', the first 3 arguments are
> passed via registers. So, for the call to `set_regs', only the last two
> arguments need to be passed via stack. The last parameter of `set_regs' is
> of
> _Bool type, which is of size 1 byte. However, the actual parameter needs to
> be
> at least 4 byte (I4) even for _Bool. So, the stack size computed from
> `Calc_Actual_Area' is 8 while it is 5 from `Calc_Formal_Area'. So, now what
> you
> have in the generated code looks like:
>
> re_search_internal:
> call set_regs
> sub $8, sp
>
> set_regs:
> ...
> ret $5
>
> Again, this is caused inconsistency in sp adjustment. The fix is to take
> account of padding in `Calc_Formal_Area' as well because of stack
> alignment.
>
> Additional Notes
> ----------------
> 1. Bug 2 does not happen in `-O3' because the compiler sees calls to
> `set_regs'
> before compiling `set_regs'. Therefore, the stack size computed is 8. Then,
> when `set_regs' function is being compiled, because the
> `arg_area_size_array'
> has already been filled for this function type, `Calc_Formal_Area' does not
> re-compute. So you get `ret $8' at the end of `set_regs' function.
>
>
> -- Feng
>
>
> ------------------------------------------------------------------------------
> WhatsUp Gold - Download Free Network Management Software
> The most intuitive, comprehensive, and cost-effective network
> management toolset available today. Delivers lowest initial
> acquisition cost and overall TCO of any competing solution.
> http://p.sf.net/sfu/whatsupgold-sd
> _______________________________________________
> Open64-devel mailing list
> Open64-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/open64-devel
>
>
--
Regards,
Lai Jian-Xin
------------------------------------------------------------------------------
Achieve unprecedented app performance and reliability
What every C/C++ and Fortran developer should know.
Learn how Intel has extended the reach of its next-generation tools
to help boost performance applications - inlcuding clusters.
http://p.sf.net/sfu/intel-dev2devmay
_______________________________________________
Open64-devel mailing list
Open64-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/open64-devel