Hello,

I’m working on a multithreaded embedded application running on Linux. It
has an infinite 'for' loop that keeps the main thread alive. Each iteration
of this loop takes a different amount of time to execute. Some iterations
take far too long, so there are occasional spikes in the execution time.
I’m trying to improve the performance (get a consistent execution time) by
finding the reason for these spikes. So I decided to try profilers to
understand which functions take too long to execute. I tried gprof, strace,
perf, etc., but none of them gave me the profiling report I expected.
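
For context, the per-iteration time and its spikes can be measured roughly
like this. This is only a minimal sketch, not my actual code: the names
iter_stats, elapsed_ns, iter_stats_add and time_iteration are hypothetical,
and the spike threshold is an assumed parameter chosen by the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical running statistics over the loop iterations. */
struct iter_stats {
    int64_t  min_ns;    /* shortest iteration seen so far */
    int64_t  max_ns;    /* longest iteration seen so far */
    int64_t  total_ns;  /* sum of all iteration times */
    uint64_t count;     /* number of iterations recorded */
    uint64_t spikes;    /* iterations longer than the threshold */
};

/* Nanoseconds between two clock readings (end is assumed to be later). */
int64_t elapsed_ns(const struct timespec *start, const struct timespec *end)
{
    return (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL
         + (int64_t)(end->tv_nsec - start->tv_nsec);
}

void iter_stats_init(struct iter_stats *s)
{
    assert(s != NULL);
    s->min_ns = INT64_MAX;
    s->max_ns = 0;
    s->total_ns = 0;
    s->count = 0;
    s->spikes = 0;
}

/* Record one iteration time; count it as a spike if above the threshold. */
void iter_stats_add(struct iter_stats *s, int64_t ns, int64_t threshold_ns)
{
    assert(s != NULL);
    if (ns < s->min_ns) s->min_ns = ns;
    if (ns > s->max_ns) s->max_ns = ns;
    s->total_ns += ns;
    s->count++;
    if (ns > threshold_ns) s->spikes++;
}

/* Time one call of body() with CLOCK_MONOTONIC and record the result. */
int64_t time_iteration(void (*body)(void), struct iter_stats *s,
                       int64_t threshold_ns)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    body();
    clock_gettime(CLOCK_MONOTONIC, &t1);
    int64_t ns = elapsed_ns(&t0, &t1);
    iter_stats_add(s, ns, threshold_ns);
    return ns;
}
```

Inside the 'for' loop, time_iteration() would wrap the work of one
iteration, and the statistics could be printed once the loop ends.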

*Question1: My expectation from profilers:* I want to see the time consumed
by each user-space function of my application. Many of these functions
invoke system calls, so I also want to know the time consumed by each
system call and which function invokes the time-consuming ones. Is this
possible with callgrind?

I followed these steps to generate profiling data with callgrind:

   1. I limited the infinite 'for' loop to a few thousand iterations and
   return from main() afterwards, so that the callgrind output gets
   generated.
   2. I compiled the program with these compiler flags:     *-O0 -g
   -fno-inline-functions*
   3. I ran my application with this command:

*valgrind --tool=callgrind -q --collect-systime=yes --trace-children=yes
taskset 0x1 application_name*

   4. Around 150 callgrind.out.X files are generated, each with a different
   value of ‘X’.
   5. I take the callgrind.out.X file with the lowest value of X, assuming
   that it holds the profiling data of the main thread. (When I checked
   the other files, main() did not appear in their profiling data.)
   6. I open that output file with kcachegrind:      *kcachegrind
   callgrind.out.X*

After checking the output, the following points made me doubt the
correctness of the profiling data:

· There is a function called inside the 'for' loop of my application which
I know takes a lot of time (it makes ioctl() calls every time, and I
confirmed by testing that it takes too long). But the callgrind output file
shows that it takes very little time to execute.

· I also added test code to one function (a ‘for’ loop that spins for a
while on every call and consumes a significant amount of time). With gprof
I confirmed that this function, after adding the test code, consumes a lot
of time. But according to callgrind, this function takes very little time.
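
The test code from the second point looks roughly like the sketch below.
This is a simplified stand-in, not the exact code: the name burn_cpu and
its iteration-count parameter are hypothetical. The volatile accumulator is
there only so that the loop is not optimized away.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the test code: spin in a 'for' loop for the
 * given number of iterations so that the calling function burns CPU time.
 */
uint64_t burn_cpu(long iterations)
{
    assert(iterations >= 0);

    /* volatile forces a real load and store on every pass of the loop */
    volatile uint64_t sink = 0;

    for (long i = 0; i < iterations; i++)
        sink += (uint64_t)i;

    return sink;
}
```

Calling burn_cpu() with a large enough count from the function under test
makes that function show up clearly in the gprof report.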


*Question2:*  Please let me know where I'm going wrong, or whether I should
do anything more to get correct profiling data from callgrind.


*Question3:*  Why are so many *callgrind.out.X* files generated? How can I
identify which file belongs to the main thread? How can I get only one
output file, as with gprof?


Thank you



*Best Regards,*

Pavankumar S V
_______________________________________________
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users
