On Tue, Mar 14, 2017 at 11:57:21PM -0700, Stephane Eranian wrote:
> This patch significantly improves the execution time of
> perf_event__synthesize_mmap_events() when running perf record
> on systems where processes have lots of threads. It just happens
> that the kernel code behind /proc/pid/maps uses an O(N^2) algorithm to
> generate the map lines in the maps file.  If you have 1000 threads, then you
> necessarily have 1000 stacks.  For each vma, you need to check if it corresponds
> to a thread's stack.  With a large number of threads, this can take a very
> long time. I have seen latencies >> 10 minutes.
> 
> As of today, perf does not use the fact that a mapping is a stack,
> therefore we can work around the issue by using /proc/pid/task/pid/maps.
> This entry does not try to map a vma to a stack and is thus much
> faster with no loss of functionality.
> 
> The proc-map-timeout logic is kept in case users still want an upper limit.
> 
> Signed-off-by: Stephane Eranian <[email protected]>
> ---
>  tools/perf/util/event.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
> index 4ea7ce7..b137566 100644
> --- a/tools/perf/util/event.c
> +++ b/tools/perf/util/event.c
> @@ -255,8 +255,8 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool,
>       if (machine__is_default_guest(machine))
>               return 0;
>  
> -     snprintf(filename, sizeof(filename), "%s/proc/%d/maps",
> -              machine->root_dir, pid);
> +     snprintf(filename, sizeof(filename), "%s/proc/%d/task/%d/maps",
> +              machine->root_dir, pid, pid);
>  
>       fp = fopen(filename, "r");
>       if (fp == NULL) {
> -- 
> 2.5.0
> 
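
For anyone who wants to check the per-task entry on their own box, here is a
minimal standalone sketch, not the perf code itself. It assumes the procfs
per-thread directory is /proc/<pid>/task/<tid>/ (singular "task") and that the
main thread's tid equals the pid. The helper names build_task_maps_path() and
count_map_lines() are made up for illustration and do not exist in
tools/perf.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of perf): build the per-task maps path
 * for the main thread of @pid, the same way the patched snprintf() does.
 * Returns 0 on success, -1 if the path did not fit in @buf.
 */
static int build_task_maps_path(char *buf, size_t len,
                                const char *root_dir, pid_t pid)
{
	int n = snprintf(buf, len, "%s/proc/%d/task/%d/maps",
			 root_dir, (int)pid, (int)pid);

	if (n < 0 || (size_t)n >= len)
		return -1;	/* output error or truncated path */
	return 0;
}

/*
 * Hypothetical helper: count the lines in a maps file so the two
 * entries can be compared. Returns -1 if the file cannot be opened.
 */
static long count_map_lines(const char *path)
{
	FILE *fp = fopen(path, "r");
	long lines = 0;
	int c;

	if (fp == NULL)
		return -1;

	while ((c = fgetc(fp)) != EOF)
		if (c == '\n')
			lines++;

	fclose(fp);
	return lines;
}
```

Calling build_task_maps_path() with an empty root_dir and getpid(), then
passing the result to count_map_lines(), should report roughly the same number
of mappings as /proc/<pid>/maps for the same process; only the stack
annotation is expected to differ.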

nice..

Acked-by: Jiri Olsa <[email protected]>

thanks,
jirka
