Michael Veksler <[EMAIL PROTECTED]> writes:

> I would like to find out how to force gcc (and its optimizers):
> 
>    1. Not to move stuff across in_profiler=*  assignments, for all
>       optimizers. This was measured to skew profiling by 10-20% on x86
>       Linux, and more than x2 on tiny functions on PPC-AIX (adding an
>       __asm__ register+memory barrier lowered this to 1-4% on x86, but
>       not for PPC-AIX's ABI)
>    2. Instrument start_profile_function as early as possible in the
>       profiled function (if possible before most of the preamble).
>    3. Instrument finalize_profile_function as late as possible in the
>       profiled function (if possible after most of the epilogue).
>    4. Take exceptions into account
>    5. Extract the __FUNCTION__ information from inside
>       start_profile_function() instead of passing it explicitly (debug
>       info?)
>          1. If __FUNCTION__ can be extracted from stack then maybe ditch
>             in_profiler=* altogether (this way the profiler will be
>             billed correctly for its time, without such variables).
>          2. Use specialized ABI to call *_profile_function, saving up
>             the need for register save/restore for profiler's function
>             calls.
> 
> It took me about a week to implement it in C++ using STL (using RAII
> and manual instrumentation). I hope that it would take me no more than
> two weeks doing this for C. Anything more than that will shift my
> benefit/cost ratio toward finding a weaker solution.

It's really a lot easier to do this as a source code modification than
as a compiler change.  Unless you already have a lot of experience
with the compiler, I think you'd be lucky or very good to get it done
in two weeks.
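
As a rough illustration of the source-level approach, here is a minimal
sketch in C.  It assumes GCC's inline asm syntax for the compiler barrier.
The names start_profile_function, finalize_profile_function and in_profiler
come from the quoted message, but their bodies, the PROFILE_ENTER and
PROFILE_EXIT macros, and the bookkeeping variables are hypothetical.  The
barrier only constrains the compiler; it is not a hardware fence.

```c
#include <assert.h>
#include <time.h>

#define MAX_DEPTH 64                 /* hypothetical nesting limit */

static clock_t prof_start[MAX_DEPTH]; /* CPU time at each active entry */
static const char *prof_name[MAX_DEPTH];
static int prof_depth;                /* current nesting depth */
static unsigned long prof_calls;      /* completed enter/exit pairs */
static clock_t prof_ticks;            /* time summed over all exits */
static volatile int in_profiler;      /* nonzero while profiler code runs */

/* Compiler-only barrier (GCC syntax): the optimizer may not move memory
   accesses across this point.  It emits no machine instruction. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

static void start_profile_function(const char *name)
{
    in_profiler = 1;
    COMPILER_BARRIER();
    assert(prof_depth < MAX_DEPTH);
    prof_name[prof_depth] = name;
    prof_start[prof_depth] = clock();
    prof_depth++;
    COMPILER_BARRIER();
    in_profiler = 0;
}

static void finalize_profile_function(void)
{
    in_profiler = 1;
    COMPILER_BARRIER();
    assert(prof_depth > 0);
    prof_depth--;
    /* Nested calls are counted in their callers' time as well. */
    prof_ticks += clock() - prof_start[prof_depth];
    prof_calls++;
    COMPILER_BARRIER();
    in_profiler = 0;
}

/* __func__ is standard C99, so the name is passed explicitly. */
#define PROFILE_ENTER() start_profile_function(__func__)
#define PROFILE_EXIT()  finalize_profile_function()

/* Example of a manually instrumented function. */
static int square(int x)
{
    int result;
    PROFILE_ENTER();
    result = x * x;
    PROFILE_EXIT();
    return result;
}
```

Every return path needs its own PROFILE_EXIT(), which is the main cost of
doing this by hand in C rather than with RAII in C++.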

For the prologue changes look at FUNCTION_PROFILER and friends in the
internal documentation.  There is currently no profiling support in
the epilogue.  It could be added along the same lines as
FUNCTION_PROFILER.

If I were you I wouldn't bother with __FUNCTION__ and would just deal
with PC addresses.  Use a post-processing pass to convert those back
into function names, using, e.g., addr2line.
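
One way to collect PC addresses without touching the compiler is GCC's
-finstrument-functions option, which calls __cyg_profile_func_enter and
__cyg_profile_func_exit around each function.  This is a substitute for the
FUNCTION_PROFILER route, not the same mechanism.  The sketch below assumes
that option; the event buffer, record() and dump_events() are hypothetical
helpers.

```c
#include <stdio.h>

#define MAX_EVENTS 1024          /* hypothetical fixed-size trace buffer */

struct pc_event {
    void *fn;                    /* address of the function entered or left */
    int   enter;                 /* 1 = entry, 0 = exit */
};

static struct pc_event events[MAX_EVENTS];
static int n_events;

/* The hooks and their helpers must not be instrumented themselves,
   otherwise they would recurse. */
__attribute__((no_instrument_function))
static void record(void *fn, int enter)
{
    if (n_events < MAX_EVENTS) {     /* silently drop events when full */
        events[n_events].fn = fn;
        events[n_events].enter = enter;
        n_events++;
    }
}

/* Called by code compiled with -finstrument-functions on function entry. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)call_site;
    record(this_fn, 1);
}

/* Called by code compiled with -finstrument-functions on function exit. */
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)call_site;
    record(this_fn, 0);
}

/* Write one line per event: 'E' or 'X' followed by the raw address. */
__attribute__((no_instrument_function))
void dump_events(FILE *out)
{
    int i;
    for (i = 0; i < n_events; i++)
        fprintf(out, "%c %p\n", events[i].enter ? 'E' : 'X', events[i].fn);
}
```

Build the program with -g -finstrument-functions, call dump_events() before
exit, and feed each address to "addr2line -f -e ./prog ADDRESS" to recover
the function name.  For a position-independent executable the recorded
addresses would first have to be adjusted by the load base.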

Hope this helps.

Ian
