Howdy,

So that sounds like you are doing out-of-process stack walking. I have only
ever measured in-process, so I am not sure I can usefully help you. In theory
the x86-64 fast trace could be used to speed up out-of-process stack walking,
but that's just a theory - I don't have any data on where the time is spent.
You might want to profile your profiler with another tool to get an idea of
where the time goes.

The patches I mentioned added the unw_backtrace() API, which for in-process
tracing should automatically use fast trace if it can. If you have a
self-coded loop using unw_step(), it would be slower. Maybe try using the
fast trace optimisations for out-of-process tracing too?

Cheers,
Lassi

On Wed, Nov 26, 2014 at 5:42 AM, Chenggang <[email protected]> wrote:

> Hi, Lassi:
> The unwind cost is a major problem that bothers me now. Let me explain my
> system in detail.
> I am writing a profiling service for a cloud system. It samples all CPUs
> in a machine, and the cloud system has thousands of machines. The sampling
> frequency is 10 Hz. All machines run rhel5u7, the x86_64 version. The
> libunwind is the latest version (I got it using git).
> All programs are written in C/C++; the compiler is GCC/G++, version 4.1.2.
> I collect the sampling information the way perf does. The stack and
> registers are saved in the kernel, and then external walking is used to
> unwind the stack. The APIs I use are like:
>
> unw_create_addr_space()
> unw_init_remote()
>
> static unw_accessors_t accessors = {
>     .find_proc_info = find_proc_info,
>     .put_unwind_info = put_unwind_info,
>     .get_dyn_info_list_addr = get_dyn_info_list_addr,
>     .access_mem = access_mem,
>     .access_reg = access_reg,
>     .access_fpreg = access_fpreg,
>     .resume = resume,
>     .get_proc_name = get_proc_name,
> };
>
> It must be a slow method.
> I read the email:
> https://lists.nongnu.org/archive/html/libunwind-devel/2011-03/msg00042.html
> It looks like a faster method. Does it use different APIs?
>
> Regards Chenggang
> 3x
>
> At 2014-11-22 21:16:11, "Lassi Tuura" <[email protected]> wrote:
>
> That doesn't sound normal to me, but how exactly are you doing the walking?
> What operating system are you using, is it 32- or 64-bit, which library
> version, how did you build it, are you using external (ptrace) or
> in-process (UNW_LOCAL_ONLY) walking, what exact API are you calling to
> walk, what language and compiler did you use for your program, etc.?
>
> Here are some reference numbers from another profiling tool (igprof) using
> libunwind a few years back:
>
> https://lists.nongnu.org/archive/html/libunwind-devel/2011-03/msg00042.html
> https://lists.nongnu.org/archive/html/libunwind-devel/2011-03/msg00064.html
> https://lists.nongnu.org/archive/html/libunwind-devel/2011-03/msg00079.html
>
> The time in clock cycles to walk on average 30-ish stack frames was in the
> ballpark of 2500 for very frequent walks (3M/sec), and 70000 for less
> frequent setitimer interrupts at 200/sec (~5 ms interrupt).
>
> On Sat, Nov 22, 2014 at 10:05 AM, Chenggang <[email protected]> wrote:
>
>> Hi:
>> I am a user of libunwind. I am developing a profiling system, "Bianque".
>> I use libunwind to unwind the stack on the target machine, but the time
>> cost is too high.
>> When the call chain is 130 frames deep and the stack size is 1MB, we need
>> 3.8 milliseconds to unwind it.
>> My CPU is Xeon(R) CPU E5-2430 0 @ 2.20GHz.
>> Is this cost normal?
>>
>> Regards
>> Chenggang
>>
>> _______________________________________________
>> Libunwind-devel mailing list
>> [email protected]
>> https://lists.nongnu.org/mailman/listinfo/libunwind-devel
