Hi:
   Thanks for your advice. I will study the fast trace code this weekend. I 
hope it helps.


Regards
chenggang
3x

At 2014-11-26 17:21:59, "Lassi Tuura" <[email protected]> wrote:

Howdy,


So that sounds like you are doing out-of-process stack walking. I have only 
ever measured in-process, so I am not sure I can usefully help you. In theory 
x86-64 fast trace could be used to speed up out-of-process stack walking, but 
that's just a theory - I don't have any data on where the time is spent. You 
might want to profile your profiler with another tool to get an idea where the 
time goes.


The patches I mentioned added unw_backtrace() API, which for in-process tracing 
should automatically use fast trace if it can. If you have a self-coded loop 
using unw_step(), it would be slower. Maybe try using the fast trace 
optimisations for out-of-process tracing too?


Cheers,
Lassi


On Wed, Nov 26, 2014 at 5:42 AM, Chenggang <[email protected]> wrote:

Hi, Lassi:
    The unwind cost is a major problem for me right now. Let me explain my 
system in detail.
    I am writing a profiling service for a cloud system. It samples all CPUs 
on each machine, and the cloud system has thousands of machines. The sampling 
frequency is 10 Hz. Every machine runs RHEL 5u7, the x86_64 version. 
libunwind is the latest version (I got it from git).
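
As a rough sanity check on what this sampling rate costs, the fraction of each 
CPU spent unwinding is the sample rate times the per-sample unwind time. A 
sketch, assuming the unwinding runs on the sampled machine and that every 
sample costs the 3.8 ms figure from the first message in this thread; the 
function name is illustrative:

```c
/* Fraction of one CPU's time spent unwinding, given the sampling rate
   in Hz and the cost of one unwind in milliseconds. */
double unwind_overhead_fraction(double sample_hz, double unwind_ms)
{
    /* sample_hz unwinds per second, each taking unwind_ms / 1000 seconds */
    return sample_hz * unwind_ms / 1000.0;
}
```

At 10 Hz and 3.8 ms per unwind this comes to about 3.8% of each CPU, if every 
sampled stack were as deep as the 130-frame case.
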
    All programs are written in C/C++ and compiled with GCC/G++ 4.1.2.
    I collect the samples the way perf does: the stack and registers are saved 
in the kernel, and the stack is then unwound by external (remote) walking. The 
APIs I use are:


unw_create_addr_space()
unw_init_remote()


static unw_accessors_t accessors = {
    .find_proc_info     = find_proc_info,
    .put_unwind_info    = put_unwind_info,
    .get_dyn_info_list_addr = get_dyn_info_list_addr,
    .access_mem     = access_mem,
    .access_reg     = access_reg,
    .access_fpreg       = access_fpreg,
    .resume         = resume,
    .get_proc_name      = get_proc_name,
};
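
With remote unwinding, libunwind calls the access_mem callback every time it 
needs a word of target memory, so in a perf-style setup the reads are served 
from the stack copy saved at sample time. The following is only a sketch of 
that lookup, not the code used in this system: struct stack_snapshot and 
snapshot_read_word() are hypothetical names, and 64-bit words are assumed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical snapshot of a sampled thread's stack, as copied out at
   sample time: `data` holds `size` bytes that started at address `sp`. */
struct stack_snapshot {
    uint64_t       sp;    /* stack pointer at sample time */
    const uint8_t *data;  /* copied stack bytes */
    size_t         size;  /* number of bytes copied */
};

/* Read one 64-bit word at target address `addr` from the snapshot.
   Returns 0 on success, -1 if the word is not fully inside the snapshot. */
int snapshot_read_word(const struct stack_snapshot *snap,
                       uint64_t addr, uint64_t *val)
{
    if (addr < snap->sp)
        return -1;                      /* below the saved region */
    uint64_t offset = addr - snap->sp;
    if (snap->size < sizeof(*val) || offset > snap->size - sizeof(*val))
        return -1;                      /* word would run past the end */
    memcpy(val, snap->data + offset, sizeof(*val));
    return 0;
}
```

An access_mem callback could pass the snapshot through its `arg` pointer and 
delegate reads to a helper like this, returning a negative libunwind error code 
when the helper fails. Addresses outside the saved stack (for example unwind 
tables in mapped binaries) would need a separate source.
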


    I assume this is a slow method.
    I read this email:
https://lists.nongnu.org/archive/html/libunwind-devel/2011-03/msg00042.html
    It looks like a faster method. Does it use different APIs?


Regards Chenggang
3x

At 2014-11-22 21:16:11, "Lassi Tuura" <[email protected]> wrote:

That doesn't sound normal to me, but how exactly are you doing the walking? 
What operating system are you using, is it 32- or 64-bit, which library 
version, how did you build it, are you using external (ptrace) or in-process 
(UNW_LOCAL_ONLY) walking, what exact API are you calling to walk, what language 
and compiler did you use for your program, etc.?


Here are some reference numbers from another profiling tool (igprof) using 
libunwind a few years back:


https://lists.nongnu.org/archive/html/libunwind-devel/2011-03/msg00042.html

https://lists.nongnu.org/archive/html/libunwind-devel/2011-03/msg00064.html

https://lists.nongnu.org/archive/html/libunwind-devel/2011-03/msg00079.html



The time to walk about 30 stack frames on average was in the ballpark of 2500 
clock cycles for very frequent walks (3M/sec), and 70000 cycles for less 
frequent setitimer interrupts at 200/sec (one interrupt every ~5 ms).
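
To put these figures next to the 3.8 ms reported in the original question (130 
frames on a 2.20 GHz Xeon), the wall-clock time can be converted to cycles. A 
minimal sketch, assuming the core runs at its nominal 2.20 GHz; the helper 
names are illustrative and not part of libunwind:

```c
/* Convert a wall-clock duration to CPU cycles at a nominal clock rate.
   Assumes the core really runs at that rate (no turbo, no scaling). */
double ms_to_cycles(double milliseconds, double ghz)
{
    /* 1 GHz = 1e9 cycles per second = 1e6 cycles per millisecond */
    return milliseconds * ghz * 1e6;
}

/* Average cost of unwinding one frame, in cycles. */
double cycles_per_frame(double total_cycles, int frames)
{
    return total_cycles / (double)frames;
}
```

With these inputs, 3.8 ms is about 8.36 million cycles, or roughly 64,000 
cycles per frame, against roughly 83 (2500/30) and 2,300 (70000/30) cycles per 
frame for the in-process numbers above.
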


On Sat, Nov 22, 2014 at 10:05 AM, Chenggang <[email protected]> wrote:

Hi:
     I am a user of libunwind. I am developing a profiling system, "Bianque".
     I use libunwind to unwind the stack on the target machine, but the time 
cost is too high.
     When the call chain is 130 frames deep and the stack size is 1 MB, it 
takes 3.8 milliseconds to unwind.
     My CPU is Xeon(R) CPU E5-2430 0 @ 2.20GHz.
     Is this cost normal?


Regards
Chenggang
     



_______________________________________________
Libunwind-devel mailing list
[email protected]
https://lists.nongnu.org/mailman/listinfo/libunwind-devel