Hello,

I'm working on a multithreaded embedded application running on Linux. It has an infinite 'for' loop that keeps the main thread alive. Each iteration of this loop takes a different amount of time to execute, and every now and then an iteration takes far longer than usual, which shows up as a spike in execution time. I'm trying to get a consistent execution time by finding the cause of these spikes, so I decided to use a profiler to see which functions take too long to execute. I tried gprof, strace, perf and others, but none of them gave me the profiling report I expected.
*Question 1: My expectation from profilers*

I want to see the time consumed by each user-space function of my application. Many of these functions invoke system calls, so I also want to know the time consumed by each system call and which function invokes the time-consuming ones. Is this possible with callgrind?

I followed these steps to generate profiling data with callgrind:

1. I limited the infinite 'for' loop to a few thousand iterations and returned from main(), so that the callgrind output gets generated.
2. I compiled the program with these compiler flags: *-O0 -g -fno-inline-functions*
3. I ran my application with this command:
   *valgrind --tool=callgrind -q --collect-systime=yes --trace-children=yes taskset 0x1 application_name*
4. Around 150 callgrind.out.X files were generated, with different values of 'X'.
5. I took the callgrind.out.X file with the lowest value of X, assuming that it holds the profiling data of the main thread. (When I checked the other files, main() did not appear in their profile data.)
6. I opened the output file with kcachegrind: *kcachegrind callgrind.out.X*

After checking the result, the following points made me doubt the correctness of the profiling data:

· One function called inside the 'for' loop of my application is known to take a lot of time (it makes ioctl() calls on every invocation, and I confirmed by testing that it is slow). But the callgrind output shows that it takes very little time to execute.
· I also added test code to one function: a 'for' loop that spins for a while on every call and consumes a significant amount of time. With gprof I confirmed that this function consumes a lot of time after the change. But according to callgrind, it takes very little time.

*Question 2:* Please let me know where I am going wrong, or whether I need to do anything more to get correct profiling data from callgrind.
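To check which callgrind.out.X file actually contains main(), rather than assuming it is the one with the lowest X, and to see which cost events each file recorded, a minimal sketch along the following lines might help. It assumes the files are in the text Callgrind profile format, with `pid:`, `cmd:`, `events:` and `summary:`/`totals:` lines plus `fn=` lines for function names. The function name `summarize` is hypothetical, and the event names `sysCount` and `sysTime` in the comment are what I would expect from *--collect-systime=yes*, not something I have verified on this setup.

```python
import glob
import re

# Matches "fn=main" or the name-compressed form "fn=(12) main".
MAIN_FN = re.compile(r"^fn=(?:\(\d+\)\s+)?main$")


def summarize(path):
    """Collect pid, command, event names, totals and main() presence
    from one Callgrind output file (hypothetical helper)."""
    info = {"path": path, "pid": None, "cmd": None,
            "events": [], "totals": [], "has_main": False}
    with open(path, errors="replace") as f:
        for raw in f:
            line = raw.strip()
            if line.startswith("pid:"):
                info["pid"] = line.split(":", 1)[1].strip()
            elif line.startswith("cmd:"):
                info["cmd"] = line.split(":", 1)[1].strip()
            elif line.startswith("events:"):
                # Assumed example: "events: Ir sysCount sysTime"
                info["events"] = line.split(":", 1)[1].split()
            elif line.startswith(("summary:", "totals:")):
                # A later "totals:" line overrides an earlier "summary:".
                values = line.split(":", 1)[1].split()
                info["totals"] = [int(v) for v in values if v.isdigit()]
            elif MAIN_FN.match(line):
                info["has_main"] = True
    return info


if __name__ == "__main__":
    # Summarize every output file in the current directory.
    for path in sorted(glob.glob("callgrind.out.*")):
        s = summarize(path)
        costs = dict(zip(s["events"], s["totals"]))
        print(f'{s["path"]}: pid={s["pid"]} main={s["has_main"]} '
              f'cmd={s["cmd"]} costs={costs}')
```

Run from the directory holding the output files, this prints one line per file. The `cmd:` column shows which command produced each file, and a file whose events list has no system-time entry cannot show the ioctl() time, whichever viewer is used.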

*Question 3:* Why are so many *callgrind.out.X* files generated? How can I identify which file belongs to the main() thread? How can I get only a single output file, as with gprof?

Thank you

*Best Regards,*
Pavankumar S V
_______________________________________________
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users