Re: [PATCH 2/2] trace-cmd: Use tracecmd_peek_next_data() in fgraph_ent_handler

2016-07-12 Thread Steven Rostedt
On Tue, 12 Jul 2016 09:17:25 +0900
Namhyung Kim  wrote:

> > Hmm, but what happens if the next data is just some random event on
> > another CPU. Do we want to break it up just because there's data on
> > another cpu?  
> 
> Yes, I think we should break.  Isn't it natural to show an event in
> the middle of a function if it occurred before returning from the
> function?  It would be more acccurate output IMHO.  I guess most leaf
> functions are small so the end result would almost same.

OK, that sounds fine then.

> 
> 
> > 
> > I wonder if we should grab a record from the same cpu and if it isn't
> > the return, then try another cpu?  
> 
> But in this case, it's a problem even if it's the return of the same
> function.  The task can be migrated to another cpu during the
> function, and then can be migrated back to the original cpu while
> calling same function again.  The entry of the first invocation would
> match to the exit of the Nth invocation..

I'll keep your patch as is then.

Thanks,

-- Steve


Re: [PATCH 2/2] trace-cmd: Use tracecmd_peek_next_data() in fgraph_ent_handler

2016-07-12 Thread Steven Rostedt
On Tue, 12 Jul 2016 09:17:25 +0900
Namhyung Kim  wrote:

> > Hmm, but what happens if the next data is just some random event on
> > another CPU. Do we want to break it up just because there's data on
> > another cpu?  
> 
> Yes, I think we should break.  Isn't it natural to show an event in
> the middle of a function if it occurred before returning from the
> function?  It would be more acccurate output IMHO.  I guess most leaf
> functions are small so the end result would almost same.

OK, that sounds fine then.

> 
> 
> > 
> > I wonder if we should grab a record from the same cpu and if it isn't
> > the return, then try another cpu?  
> 
> But in this case, it's a problem even if it's the return of the same
> function.  The task can be migrated to another cpu during the
> function, and then can be migrated back to the original cpu while
> calling same function again.  The entry of the first invocation would
> match to the exit of the Nth invocation..

I'll keep your patch as is then.

Thanks,

-- Steve


Re: [PATCH 2/2] trace-cmd: Use tracecmd_peek_next_data() in fgraph_ent_handler

2016-07-11 Thread Namhyung Kim
Hi Steve,

On Mon, Jul 11, 2016 at 12:55:22PM -0400, Steven Rostedt wrote:
> On Fri,  8 Jul 2016 14:56:12 +0900
> Namhyung Kim  wrote:
> 
> > When a task was migrated to other cpu in the middle of a function, the
> > fgraph_exit record will be in a different cpu than the enter record.
> > But currently fgraph_ent_handler() only peeks at the same cpu so it
> > could read an incorrect record.
> > 
> > For example, please see following raw records:
> > 
> >   bash-10478 [007]73.454273: funcgraph_entry:   func=0x8123bf90 
> > depth=0
> >   bash-10478 [000]73.454650: funcgraph_exit:func=0x8123bf90 
> > calltime=0x111a37483c rettime=0x111a3d0285 overrun=0x0 depth=0
> >   bash-10478 [000]74.456383: funcgraph_entry:   func=0x8123bf90 
> > depth=0
> >   bash-10478 [000]74.456655: funcgraph_exit:func=0x8123bf90 
> > calltime=0x1155f24337 rettime=0x1155f66559 overrun=0x0 depth=0
> >   bash-10478 [000]75.458517: funcgraph_entry:   func=0x8123bf90 
> > depth=0
> >   bash-10478 [001]75.458849: funcgraph_exit:func=0x8123bf90 
> > calltime=0x1191ad9de0 rettime=0x1191b2a6aa overrun=0x0 depth=0
> >   bash-10478 [001]76.460482: funcgraph_entry:   func=0x8123bf90 
> > depth=0
> >   bash-10478 [000]76.460679: funcgraph_exit:func=0x8123bf90 
> > calltime=0x11cd6662b4 rettime=0x11cd695e03 overrun=0x0 depth=0
> >   bash-10478 [000]77.462334: funcgraph_entry:   func=0x8123bf90 
> > depth=0
> >   bash-10478 [004]77.462564: funcgraph_exit:func=0x8123bf90 
> > calltime=0x12091d71c4 rettime=0x120920e977 overrun=0x0 depth=0
> >   bash-10478 [004]78.464315: funcgraph_entry:   func=0x8123bf90 
> > depth=0
> >   bash-10478 [001]78.464644: funcgraph_exit:func=0x8123bf90 
> > calltime=0x1244d674de rettime=0x1244db7329 overrun=0x0 depth=0
> >   bash-10478 [001]79.466018: funcgraph_entry:   func=0x8123bf90 
> > depth=0
> >   bash-10478 [004]79.466326: funcgraph_exit:func=0x8123bf90 
> > calltime=0x12808b3940 rettime=0x12808fe819 overrun=0x0 depth=0
> >   bash-10478 [004]80.468005: funcgraph_entry:   func=0x8123bf90 
> > depth=0
> >   bash-10478 [002]80.468291: funcgraph_exit:func=0x8123bf90 
> > calltime=0x12bc44551f rettime=0x12bc48ac9a overrun=0x0 depth=0
> >   bash-10478 [002]81.469718: funcgraph_entry:   func=0x8123bf90 
> > depth=0
> >   bash-10478 [007]81.470088: funcgraph_exit:func=0x8123bf90 
> > calltime=0x12f7f945b8 rettime=0x12f7fee028 overrun=0x0 depth=0
> > 
> > The first entry was call to cma_alloc function, it was on cpu 7 but the
> > task was migrated to cpu 0 before returning from the function.
> > Currently trace-cmd shows like below:
> > 
> >   bash-10478 [007]73.454273: funcgraph_entry:  ! 367.216 us |  
> > cma_alloc();
> >   bash-10478 [000]73.454650: funcgraph_exit:   ! 375.369 us |  }
> >   bash-10478 [000]74.456383: funcgraph_entry:  ! 270.882 us |  
> > cma_alloc();
> >   bash-10478 [000]75.458517: funcgraph_entry:  ! 195.407 us |  
> > cma_alloc();
> >   bash-10478 [001]75.458849: funcgraph_exit:   ! 329.930 us |  }
> >   bash-10478 [001]76.460482: funcgraph_entry:  ! 327.243 us |  
> > cma_alloc();
> >   bash-10478 [000]77.462334: funcgraph_entry:  ! 293.465 us |  
> > cma_alloc();
> >   bash-10478 [004]77.462564: funcgraph_exit:   ! 227.251 us |  }
> >   bash-10478 [004]78.464315: funcgraph_entry:  ! 306.905 us |  
> > cma_alloc();
> >   bash-10478 [001]79.466018: funcgraph_entry:  ! 303.196 us |  
> > cma_alloc();
> >   bash-10478 [004]80.468005: funcgraph_entry:   |  
> > cma_alloc() {
> >   bash-10478 [002]80.468291: funcgraph_exit:   ! 284.539 us |  }
> >   bash-10478 [002]81.469718: funcgraph_entry:  ! 323.215 us |  
> > cma_alloc();
> > 
> > This is because the first funcgraph_entry on cpu 7 matched to the last
> > funcgraph_exit on cpu 7.  And second funcgraph_exit on cpu 0 was shown
> > alone.  We need to match record from all cpu rather than the same cpu.
> > In this case, entry on cpu 7 should be paired with exit on cpu 0.
> > 
> > With this patch, the output look like below:
> > 
> >   bash-10478 [007]73.454273: funcgraph_entry:  ! 375.369 us |  
> > cma_alloc();
> >   bash-10478 [000]74.456383: funcgraph_entry:  ! 270.882 us |  
> > cma_alloc();
> >   bash-10478 [000]75.458517: funcgraph_entry:  ! 329.930 us |  
> > cma_alloc();
> >   bash-10478 [001]76.460482: funcgraph_entry:  ! 195.407 us |  
> > cma_alloc();
> >   bash-10478 [000]77.462334: funcgraph_entry:  ! 227.251 us |  
> > cma_alloc();
> >   bash-10478 [004]78.464315: funcgraph_entry:  ! 327.243 us |  
> > cma_alloc();
> >   bash-10478 [001]79.466018: funcgraph_entry:  ! 306.905 us |  
> > cma_alloc();
> > 

Re: [PATCH 2/2] trace-cmd: Use tracecmd_peek_next_data() in fgraph_ent_handler

2016-07-11 Thread Namhyung Kim
Hi Steve,

On Mon, Jul 11, 2016 at 12:55:22PM -0400, Steven Rostedt wrote:
> On Fri,  8 Jul 2016 14:56:12 +0900
> Namhyung Kim  wrote:
> 
> > When a task was migrated to other cpu in the middle of a function, the
> > fgraph_exit record will be in a different cpu than the enter record.
> > But currently fgraph_ent_handler() only peeks at the same cpu so it
> > could read an incorrect record.
> > 
> > For example, please see following raw records:
> > 
> >   bash-10478 [007]73.454273: funcgraph_entry:   func=0x8123bf90 
> > depth=0
> >   bash-10478 [000]73.454650: funcgraph_exit:func=0x8123bf90 
> > calltime=0x111a37483c rettime=0x111a3d0285 overrun=0x0 depth=0
> >   bash-10478 [000]74.456383: funcgraph_entry:   func=0x8123bf90 
> > depth=0
> >   bash-10478 [000]74.456655: funcgraph_exit:func=0x8123bf90 
> > calltime=0x1155f24337 rettime=0x1155f66559 overrun=0x0 depth=0
> >   bash-10478 [000]75.458517: funcgraph_entry:   func=0x8123bf90 
> > depth=0
> >   bash-10478 [001]75.458849: funcgraph_exit:func=0x8123bf90 
> > calltime=0x1191ad9de0 rettime=0x1191b2a6aa overrun=0x0 depth=0
> >   bash-10478 [001]76.460482: funcgraph_entry:   func=0x8123bf90 
> > depth=0
> >   bash-10478 [000]76.460679: funcgraph_exit:func=0x8123bf90 
> > calltime=0x11cd6662b4 rettime=0x11cd695e03 overrun=0x0 depth=0
> >   bash-10478 [000]77.462334: funcgraph_entry:   func=0x8123bf90 
> > depth=0
> >   bash-10478 [004]77.462564: funcgraph_exit:func=0x8123bf90 
> > calltime=0x12091d71c4 rettime=0x120920e977 overrun=0x0 depth=0
> >   bash-10478 [004]78.464315: funcgraph_entry:   func=0x8123bf90 
> > depth=0
> >   bash-10478 [001]78.464644: funcgraph_exit:func=0x8123bf90 
> > calltime=0x1244d674de rettime=0x1244db7329 overrun=0x0 depth=0
> >   bash-10478 [001]79.466018: funcgraph_entry:   func=0x8123bf90 
> > depth=0
> >   bash-10478 [004]79.466326: funcgraph_exit:func=0x8123bf90 
> > calltime=0x12808b3940 rettime=0x12808fe819 overrun=0x0 depth=0
> >   bash-10478 [004]80.468005: funcgraph_entry:   func=0x8123bf90 
> > depth=0
> >   bash-10478 [002]80.468291: funcgraph_exit:func=0x8123bf90 
> > calltime=0x12bc44551f rettime=0x12bc48ac9a overrun=0x0 depth=0
> >   bash-10478 [002]81.469718: funcgraph_entry:   func=0x8123bf90 
> > depth=0
> >   bash-10478 [007]81.470088: funcgraph_exit:func=0x8123bf90 
> > calltime=0x12f7f945b8 rettime=0x12f7fee028 overrun=0x0 depth=0
> > 
> > The first entry was call to cma_alloc function, it was on cpu 7 but the
> > task was migrated to cpu 0 before returning from the function.
> > Currently trace-cmd shows like below:
> > 
> >   bash-10478 [007]73.454273: funcgraph_entry:  ! 367.216 us |  
> > cma_alloc();
> >   bash-10478 [000]73.454650: funcgraph_exit:   ! 375.369 us |  }
> >   bash-10478 [000]74.456383: funcgraph_entry:  ! 270.882 us |  
> > cma_alloc();
> >   bash-10478 [000]75.458517: funcgraph_entry:  ! 195.407 us |  
> > cma_alloc();
> >   bash-10478 [001]75.458849: funcgraph_exit:   ! 329.930 us |  }
> >   bash-10478 [001]76.460482: funcgraph_entry:  ! 327.243 us |  
> > cma_alloc();
> >   bash-10478 [000]77.462334: funcgraph_entry:  ! 293.465 us |  
> > cma_alloc();
> >   bash-10478 [004]77.462564: funcgraph_exit:   ! 227.251 us |  }
> >   bash-10478 [004]78.464315: funcgraph_entry:  ! 306.905 us |  
> > cma_alloc();
> >   bash-10478 [001]79.466018: funcgraph_entry:  ! 303.196 us |  
> > cma_alloc();
> >   bash-10478 [004]80.468005: funcgraph_entry:   |  
> > cma_alloc() {
> >   bash-10478 [002]80.468291: funcgraph_exit:   ! 284.539 us |  }
> >   bash-10478 [002]81.469718: funcgraph_entry:  ! 323.215 us |  
> > cma_alloc();
> > 
> > This is because the first funcgraph_entry on cpu 7 matched to the last
> > funcgraph_exit on cpu 7.  And second funcgraph_exit on cpu 0 was shown
> > alone.  We need to match record from all cpu rather than the same cpu.
> > In this case, entry on cpu 7 should be paired with exit on cpu 0.
> > 
> > With this patch, the output look like below:
> > 
> >   bash-10478 [007]73.454273: funcgraph_entry:  ! 375.369 us |  
> > cma_alloc();
> >   bash-10478 [000]74.456383: funcgraph_entry:  ! 270.882 us |  
> > cma_alloc();
> >   bash-10478 [000]75.458517: funcgraph_entry:  ! 329.930 us |  
> > cma_alloc();
> >   bash-10478 [001]76.460482: funcgraph_entry:  ! 195.407 us |  
> > cma_alloc();
> >   bash-10478 [000]77.462334: funcgraph_entry:  ! 227.251 us |  
> > cma_alloc();
> >   bash-10478 [004]78.464315: funcgraph_entry:  ! 327.243 us |  
> > cma_alloc();
> >   bash-10478 [001]79.466018: funcgraph_entry:  ! 306.905 us |  
> > cma_alloc();
> >   bash-10478 [004]  

Re: [PATCH 2/2] trace-cmd: Use tracecmd_peek_next_data() in fgraph_ent_handler

2016-07-11 Thread Steven Rostedt
On Fri,  8 Jul 2016 14:56:12 +0900
Namhyung Kim  wrote:

> When a task was migrated to other cpu in the middle of a function, the
> fgraph_exit record will be in a different cpu than the enter record.
> But currently fgraph_ent_handler() only peeks at the same cpu so it
> could read an incorrect record.
> 
> For example, please see following raw records:
> 
>   bash-10478 [007]73.454273: funcgraph_entry:   func=0x8123bf90 
> depth=0
>   bash-10478 [000]73.454650: funcgraph_exit:func=0x8123bf90 
> calltime=0x111a37483c rettime=0x111a3d0285 overrun=0x0 depth=0
>   bash-10478 [000]74.456383: funcgraph_entry:   func=0x8123bf90 
> depth=0
>   bash-10478 [000]74.456655: funcgraph_exit:func=0x8123bf90 
> calltime=0x1155f24337 rettime=0x1155f66559 overrun=0x0 depth=0
>   bash-10478 [000]75.458517: funcgraph_entry:   func=0x8123bf90 
> depth=0
>   bash-10478 [001]75.458849: funcgraph_exit:func=0x8123bf90 
> calltime=0x1191ad9de0 rettime=0x1191b2a6aa overrun=0x0 depth=0
>   bash-10478 [001]76.460482: funcgraph_entry:   func=0x8123bf90 
> depth=0
>   bash-10478 [000]76.460679: funcgraph_exit:func=0x8123bf90 
> calltime=0x11cd6662b4 rettime=0x11cd695e03 overrun=0x0 depth=0
>   bash-10478 [000]77.462334: funcgraph_entry:   func=0x8123bf90 
> depth=0
>   bash-10478 [004]77.462564: funcgraph_exit:func=0x8123bf90 
> calltime=0x12091d71c4 rettime=0x120920e977 overrun=0x0 depth=0
>   bash-10478 [004]78.464315: funcgraph_entry:   func=0x8123bf90 
> depth=0
>   bash-10478 [001]78.464644: funcgraph_exit:func=0x8123bf90 
> calltime=0x1244d674de rettime=0x1244db7329 overrun=0x0 depth=0
>   bash-10478 [001]79.466018: funcgraph_entry:   func=0x8123bf90 
> depth=0
>   bash-10478 [004]79.466326: funcgraph_exit:func=0x8123bf90 
> calltime=0x12808b3940 rettime=0x12808fe819 overrun=0x0 depth=0
>   bash-10478 [004]80.468005: funcgraph_entry:   func=0x8123bf90 
> depth=0
>   bash-10478 [002]80.468291: funcgraph_exit:func=0x8123bf90 
> calltime=0x12bc44551f rettime=0x12bc48ac9a overrun=0x0 depth=0
>   bash-10478 [002]81.469718: funcgraph_entry:   func=0x8123bf90 
> depth=0
>   bash-10478 [007]81.470088: funcgraph_exit:func=0x8123bf90 
> calltime=0x12f7f945b8 rettime=0x12f7fee028 overrun=0x0 depth=0
> 
> The first entry was call to cma_alloc function, it was on cpu 7 but the
> task was migrated to cpu 0 before returning from the function.
> Currently trace-cmd shows like below:
> 
>   bash-10478 [007]73.454273: funcgraph_entry:  ! 367.216 us |  
> cma_alloc();
>   bash-10478 [000]73.454650: funcgraph_exit:   ! 375.369 us |  }
>   bash-10478 [000]74.456383: funcgraph_entry:  ! 270.882 us |  
> cma_alloc();
>   bash-10478 [000]75.458517: funcgraph_entry:  ! 195.407 us |  
> cma_alloc();
>   bash-10478 [001]75.458849: funcgraph_exit:   ! 329.930 us |  }
>   bash-10478 [001]76.460482: funcgraph_entry:  ! 327.243 us |  
> cma_alloc();
>   bash-10478 [000]77.462334: funcgraph_entry:  ! 293.465 us |  
> cma_alloc();
>   bash-10478 [004]77.462564: funcgraph_exit:   ! 227.251 us |  }
>   bash-10478 [004]78.464315: funcgraph_entry:  ! 306.905 us |  
> cma_alloc();
>   bash-10478 [001]79.466018: funcgraph_entry:  ! 303.196 us |  
> cma_alloc();
>   bash-10478 [004]80.468005: funcgraph_entry:   |  
> cma_alloc() {
>   bash-10478 [002]80.468291: funcgraph_exit:   ! 284.539 us |  }
>   bash-10478 [002]81.469718: funcgraph_entry:  ! 323.215 us |  
> cma_alloc();
> 
> This is because the first funcgraph_entry on cpu 7 matched to the last
> funcgraph_exit on cpu 7.  And second funcgraph_exit on cpu 0 was shown
> alone.  We need to match record from all cpu rather than the same cpu.
> In this case, entry on cpu 7 should be paired with exit on cpu 0.
> 
> With this patch, the output look like below:
> 
>   bash-10478 [007]73.454273: funcgraph_entry:  ! 375.369 us |  
> cma_alloc();
>   bash-10478 [000]74.456383: funcgraph_entry:  ! 270.882 us |  
> cma_alloc();
>   bash-10478 [000]75.458517: funcgraph_entry:  ! 329.930 us |  
> cma_alloc();
>   bash-10478 [001]76.460482: funcgraph_entry:  ! 195.407 us |  
> cma_alloc();
>   bash-10478 [000]77.462334: funcgraph_entry:  ! 227.251 us |  
> cma_alloc();
>   bash-10478 [004]78.464315: funcgraph_entry:  ! 327.243 us |  
> cma_alloc();
>   bash-10478 [001]79.466018: funcgraph_entry:  ! 306.905 us |  
> cma_alloc();
>   bash-10478 [004]80.468005: funcgraph_entry:  ! 284.539 us |  
> cma_alloc();
>   bash-10478 [002]81.469718: funcgraph_entry:  ! 367.216 us |  
> cma_alloc();
> 
> Maybe we can separate enter and exit if they happened on different
> cpu.  Anyway the 

Re: [PATCH 2/2] trace-cmd: Use tracecmd_peek_next_data() in fgraph_ent_handler

2016-07-11 Thread Steven Rostedt
On Fri,  8 Jul 2016 14:56:12 +0900
Namhyung Kim  wrote:

> When a task was migrated to other cpu in the middle of a function, the
> fgraph_exit record will be in a different cpu than the enter record.
> But currently fgraph_ent_handler() only peeks at the same cpu so it
> could read an incorrect record.
> 
> For example, please see following raw records:
> 
>   bash-10478 [007]73.454273: funcgraph_entry:   func=0x8123bf90 
> depth=0
>   bash-10478 [000]73.454650: funcgraph_exit:func=0x8123bf90 
> calltime=0x111a37483c rettime=0x111a3d0285 overrun=0x0 depth=0
>   bash-10478 [000]74.456383: funcgraph_entry:   func=0x8123bf90 
> depth=0
>   bash-10478 [000]74.456655: funcgraph_exit:func=0x8123bf90 
> calltime=0x1155f24337 rettime=0x1155f66559 overrun=0x0 depth=0
>   bash-10478 [000]75.458517: funcgraph_entry:   func=0x8123bf90 
> depth=0
>   bash-10478 [001]75.458849: funcgraph_exit:func=0x8123bf90 
> calltime=0x1191ad9de0 rettime=0x1191b2a6aa overrun=0x0 depth=0
>   bash-10478 [001]76.460482: funcgraph_entry:   func=0x8123bf90 
> depth=0
>   bash-10478 [000]76.460679: funcgraph_exit:func=0x8123bf90 
> calltime=0x11cd6662b4 rettime=0x11cd695e03 overrun=0x0 depth=0
>   bash-10478 [000]77.462334: funcgraph_entry:   func=0x8123bf90 
> depth=0
>   bash-10478 [004]77.462564: funcgraph_exit:func=0x8123bf90 
> calltime=0x12091d71c4 rettime=0x120920e977 overrun=0x0 depth=0
>   bash-10478 [004]78.464315: funcgraph_entry:   func=0x8123bf90 
> depth=0
>   bash-10478 [001]78.464644: funcgraph_exit:func=0x8123bf90 
> calltime=0x1244d674de rettime=0x1244db7329 overrun=0x0 depth=0
>   bash-10478 [001]79.466018: funcgraph_entry:   func=0x8123bf90 
> depth=0
>   bash-10478 [004]79.466326: funcgraph_exit:func=0x8123bf90 
> calltime=0x12808b3940 rettime=0x12808fe819 overrun=0x0 depth=0
>   bash-10478 [004]80.468005: funcgraph_entry:   func=0x8123bf90 
> depth=0
>   bash-10478 [002]80.468291: funcgraph_exit:func=0x8123bf90 
> calltime=0x12bc44551f rettime=0x12bc48ac9a overrun=0x0 depth=0
>   bash-10478 [002]81.469718: funcgraph_entry:   func=0x8123bf90 
> depth=0
>   bash-10478 [007]81.470088: funcgraph_exit:func=0x8123bf90 
> calltime=0x12f7f945b8 rettime=0x12f7fee028 overrun=0x0 depth=0
> 
> The first entry was call to cma_alloc function, it was on cpu 7 but the
> task was migrated to cpu 0 before returning from the function.
> Currently trace-cmd shows like below:
> 
>   bash-10478 [007]73.454273: funcgraph_entry:  ! 367.216 us |  
> cma_alloc();
>   bash-10478 [000]73.454650: funcgraph_exit:   ! 375.369 us |  }
>   bash-10478 [000]74.456383: funcgraph_entry:  ! 270.882 us |  
> cma_alloc();
>   bash-10478 [000]75.458517: funcgraph_entry:  ! 195.407 us |  
> cma_alloc();
>   bash-10478 [001]75.458849: funcgraph_exit:   ! 329.930 us |  }
>   bash-10478 [001]76.460482: funcgraph_entry:  ! 327.243 us |  
> cma_alloc();
>   bash-10478 [000]77.462334: funcgraph_entry:  ! 293.465 us |  
> cma_alloc();
>   bash-10478 [004]77.462564: funcgraph_exit:   ! 227.251 us |  }
>   bash-10478 [004]78.464315: funcgraph_entry:  ! 306.905 us |  
> cma_alloc();
>   bash-10478 [001]79.466018: funcgraph_entry:  ! 303.196 us |  
> cma_alloc();
>   bash-10478 [004]80.468005: funcgraph_entry:   |  
> cma_alloc() {
>   bash-10478 [002]80.468291: funcgraph_exit:   ! 284.539 us |  }
>   bash-10478 [002]81.469718: funcgraph_entry:  ! 323.215 us |  
> cma_alloc();
> 
> This is because the first funcgraph_entry on cpu 7 matched to the last
> funcgraph_exit on cpu 7.  And second funcgraph_exit on cpu 0 was shown
> alone.  We need to match record from all cpu rather than the same cpu.
> In this case, entry on cpu 7 should be paired with exit on cpu 0.
> 
> With this patch, the output look like below:
> 
>   bash-10478 [007]73.454273: funcgraph_entry:  ! 375.369 us |  
> cma_alloc();
>   bash-10478 [000]74.456383: funcgraph_entry:  ! 270.882 us |  
> cma_alloc();
>   bash-10478 [000]75.458517: funcgraph_entry:  ! 329.930 us |  
> cma_alloc();
>   bash-10478 [001]76.460482: funcgraph_entry:  ! 195.407 us |  
> cma_alloc();
>   bash-10478 [000]77.462334: funcgraph_entry:  ! 227.251 us |  
> cma_alloc();
>   bash-10478 [004]78.464315: funcgraph_entry:  ! 327.243 us |  
> cma_alloc();
>   bash-10478 [001]79.466018: funcgraph_entry:  ! 306.905 us |  
> cma_alloc();
>   bash-10478 [004]80.468005: funcgraph_entry:  ! 284.539 us |  
> cma_alloc();
>   bash-10478 [002]81.469718: funcgraph_entry:  ! 367.216 us |  
> cma_alloc();
> 
> Maybe we can separate enter and exit if they happened on different
> cpu.  Anyway the time duration has 

[PATCH 2/2] trace-cmd: Use tracecmd_peek_next_data() in fgraph_ent_handler

2016-07-07 Thread Namhyung Kim
When a task was migrated to other cpu in the middle of a function, the
fgraph_exit record will be in a different cpu than the enter record.
But currently fgraph_ent_handler() only peeks at the same cpu so it
could read an incorrect record.

For example, please see following raw records:

  bash-10478 [007]73.454273: funcgraph_entry:   func=0x8123bf90 
depth=0
  bash-10478 [000]73.454650: funcgraph_exit:func=0x8123bf90 
calltime=0x111a37483c rettime=0x111a3d0285 overrun=0x0 depth=0
  bash-10478 [000]74.456383: funcgraph_entry:   func=0x8123bf90 
depth=0
  bash-10478 [000]74.456655: funcgraph_exit:func=0x8123bf90 
calltime=0x1155f24337 rettime=0x1155f66559 overrun=0x0 depth=0
  bash-10478 [000]75.458517: funcgraph_entry:   func=0x8123bf90 
depth=0
  bash-10478 [001]75.458849: funcgraph_exit:func=0x8123bf90 
calltime=0x1191ad9de0 rettime=0x1191b2a6aa overrun=0x0 depth=0
  bash-10478 [001]76.460482: funcgraph_entry:   func=0x8123bf90 
depth=0
  bash-10478 [000]76.460679: funcgraph_exit:func=0x8123bf90 
calltime=0x11cd6662b4 rettime=0x11cd695e03 overrun=0x0 depth=0
  bash-10478 [000]77.462334: funcgraph_entry:   func=0x8123bf90 
depth=0
  bash-10478 [004]77.462564: funcgraph_exit:func=0x8123bf90 
calltime=0x12091d71c4 rettime=0x120920e977 overrun=0x0 depth=0
  bash-10478 [004]78.464315: funcgraph_entry:   func=0x8123bf90 
depth=0
  bash-10478 [001]78.464644: funcgraph_exit:func=0x8123bf90 
calltime=0x1244d674de rettime=0x1244db7329 overrun=0x0 depth=0
  bash-10478 [001]79.466018: funcgraph_entry:   func=0x8123bf90 
depth=0
  bash-10478 [004]79.466326: funcgraph_exit:func=0x8123bf90 
calltime=0x12808b3940 rettime=0x12808fe819 overrun=0x0 depth=0
  bash-10478 [004]80.468005: funcgraph_entry:   func=0x8123bf90 
depth=0
  bash-10478 [002]80.468291: funcgraph_exit:func=0x8123bf90 
calltime=0x12bc44551f rettime=0x12bc48ac9a overrun=0x0 depth=0
  bash-10478 [002]81.469718: funcgraph_entry:   func=0x8123bf90 
depth=0
  bash-10478 [007]81.470088: funcgraph_exit:func=0x8123bf90 
calltime=0x12f7f945b8 rettime=0x12f7fee028 overrun=0x0 depth=0

The first entry was call to cma_alloc function, it was on cpu 7 but the
task was migrated to cpu 0 before returning from the function.
Currently trace-cmd shows like below:

  bash-10478 [007]73.454273: funcgraph_entry:  ! 367.216 us |  
cma_alloc();
  bash-10478 [000]73.454650: funcgraph_exit:   ! 375.369 us |  }
  bash-10478 [000]74.456383: funcgraph_entry:  ! 270.882 us |  
cma_alloc();
  bash-10478 [000]75.458517: funcgraph_entry:  ! 195.407 us |  
cma_alloc();
  bash-10478 [001]75.458849: funcgraph_exit:   ! 329.930 us |  }
  bash-10478 [001]76.460482: funcgraph_entry:  ! 327.243 us |  
cma_alloc();
  bash-10478 [000]77.462334: funcgraph_entry:  ! 293.465 us |  
cma_alloc();
  bash-10478 [004]77.462564: funcgraph_exit:   ! 227.251 us |  }
  bash-10478 [004]78.464315: funcgraph_entry:  ! 306.905 us |  
cma_alloc();
  bash-10478 [001]79.466018: funcgraph_entry:  ! 303.196 us |  
cma_alloc();
  bash-10478 [004]80.468005: funcgraph_entry:   |  
cma_alloc() {
  bash-10478 [002]80.468291: funcgraph_exit:   ! 284.539 us |  }
  bash-10478 [002]81.469718: funcgraph_entry:  ! 323.215 us |  
cma_alloc();

This is because the first funcgraph_entry on cpu 7 matched to the last
funcgraph_exit on cpu 7.  And second funcgraph_exit on cpu 0 was shown
alone.  We need to match record from all cpu rather than the same cpu.
In this case, entry on cpu 7 should be paired with exit on cpu 0.

With this patch, the output look like below:

  bash-10478 [007]73.454273: funcgraph_entry:  ! 375.369 us |  
cma_alloc();
  bash-10478 [000]74.456383: funcgraph_entry:  ! 270.882 us |  
cma_alloc();
  bash-10478 [000]75.458517: funcgraph_entry:  ! 329.930 us |  
cma_alloc();
  bash-10478 [001]76.460482: funcgraph_entry:  ! 195.407 us |  
cma_alloc();
  bash-10478 [000]77.462334: funcgraph_entry:  ! 227.251 us |  
cma_alloc();
  bash-10478 [004]78.464315: funcgraph_entry:  ! 327.243 us |  
cma_alloc();
  bash-10478 [001]79.466018: funcgraph_entry:  ! 306.905 us |  
cma_alloc();
  bash-10478 [004]80.468005: funcgraph_entry:  ! 284.539 us |  
cma_alloc();
  bash-10478 [002]81.469718: funcgraph_entry:  ! 367.216 us |  
cma_alloc();

Maybe we can separate enter and exit if they happened on different
cpu.  Anyway the time duration has correct value now.

Reported-by: Joonsoo Kim 
Signed-off-by: Namhyung Kim 
---
 trace-ftrace.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/trace-ftrace.c b/trace-ftrace.c
index 

[PATCH 2/2] trace-cmd: Use tracecmd_peek_next_data() in fgraph_ent_handler

2016-07-07 Thread Namhyung Kim
When a task was migrated to other cpu in the middle of a function, the
fgraph_exit record will be in a different cpu than the enter record.
But currently fgraph_ent_handler() only peeks at the same cpu so it
could read an incorrect record.

For example, please see following raw records:

  bash-10478 [007]73.454273: funcgraph_entry:   func=0x8123bf90 
depth=0
  bash-10478 [000]73.454650: funcgraph_exit:func=0x8123bf90 
calltime=0x111a37483c rettime=0x111a3d0285 overrun=0x0 depth=0
  bash-10478 [000]74.456383: funcgraph_entry:   func=0x8123bf90 
depth=0
  bash-10478 [000]74.456655: funcgraph_exit:func=0x8123bf90 
calltime=0x1155f24337 rettime=0x1155f66559 overrun=0x0 depth=0
  bash-10478 [000]75.458517: funcgraph_entry:   func=0x8123bf90 
depth=0
  bash-10478 [001]75.458849: funcgraph_exit:func=0x8123bf90 
calltime=0x1191ad9de0 rettime=0x1191b2a6aa overrun=0x0 depth=0
  bash-10478 [001]76.460482: funcgraph_entry:   func=0x8123bf90 
depth=0
  bash-10478 [000]76.460679: funcgraph_exit:func=0x8123bf90 
calltime=0x11cd6662b4 rettime=0x11cd695e03 overrun=0x0 depth=0
  bash-10478 [000]77.462334: funcgraph_entry:   func=0x8123bf90 
depth=0
  bash-10478 [004]77.462564: funcgraph_exit:func=0x8123bf90 
calltime=0x12091d71c4 rettime=0x120920e977 overrun=0x0 depth=0
  bash-10478 [004]78.464315: funcgraph_entry:   func=0x8123bf90 
depth=0
  bash-10478 [001]78.464644: funcgraph_exit:func=0x8123bf90 
calltime=0x1244d674de rettime=0x1244db7329 overrun=0x0 depth=0
  bash-10478 [001]79.466018: funcgraph_entry:   func=0x8123bf90 
depth=0
  bash-10478 [004]79.466326: funcgraph_exit:func=0x8123bf90 
calltime=0x12808b3940 rettime=0x12808fe819 overrun=0x0 depth=0
  bash-10478 [004]80.468005: funcgraph_entry:   func=0x8123bf90 
depth=0
  bash-10478 [002]80.468291: funcgraph_exit:func=0x8123bf90 
calltime=0x12bc44551f rettime=0x12bc48ac9a overrun=0x0 depth=0
  bash-10478 [002]81.469718: funcgraph_entry:   func=0x8123bf90 
depth=0
  bash-10478 [007]81.470088: funcgraph_exit:func=0x8123bf90 
calltime=0x12f7f945b8 rettime=0x12f7fee028 overrun=0x0 depth=0

The first entry was call to cma_alloc function, it was on cpu 7 but the
task was migrated to cpu 0 before returning from the function.
Currently trace-cmd shows like below:

  bash-10478 [007]73.454273: funcgraph_entry:  ! 367.216 us |  
cma_alloc();
  bash-10478 [000]73.454650: funcgraph_exit:   ! 375.369 us |  }
  bash-10478 [000]74.456383: funcgraph_entry:  ! 270.882 us |  
cma_alloc();
  bash-10478 [000]75.458517: funcgraph_entry:  ! 195.407 us |  
cma_alloc();
  bash-10478 [001]75.458849: funcgraph_exit:   ! 329.930 us |  }
  bash-10478 [001]76.460482: funcgraph_entry:  ! 327.243 us |  
cma_alloc();
  bash-10478 [000]77.462334: funcgraph_entry:  ! 293.465 us |  
cma_alloc();
  bash-10478 [004]77.462564: funcgraph_exit:   ! 227.251 us |  }
  bash-10478 [004]78.464315: funcgraph_entry:  ! 306.905 us |  
cma_alloc();
  bash-10478 [001]79.466018: funcgraph_entry:  ! 303.196 us |  
cma_alloc();
  bash-10478 [004]80.468005: funcgraph_entry:   |  
cma_alloc() {
  bash-10478 [002]80.468291: funcgraph_exit:   ! 284.539 us |  }
  bash-10478 [002]81.469718: funcgraph_entry:  ! 323.215 us |  
cma_alloc();

This is because the first funcgraph_entry on cpu 7 matched to the last
funcgraph_exit on cpu 7.  And second funcgraph_exit on cpu 0 was shown
alone.  We need to match record from all cpu rather than the same cpu.
In this case, entry on cpu 7 should be paired with exit on cpu 0.

With this patch, the output look like below:

  bash-10478 [007]73.454273: funcgraph_entry:  ! 375.369 us |  
cma_alloc();
  bash-10478 [000]74.456383: funcgraph_entry:  ! 270.882 us |  
cma_alloc();
  bash-10478 [000]75.458517: funcgraph_entry:  ! 329.930 us |  
cma_alloc();
  bash-10478 [001]76.460482: funcgraph_entry:  ! 195.407 us |  
cma_alloc();
  bash-10478 [000]77.462334: funcgraph_entry:  ! 227.251 us |  
cma_alloc();
  bash-10478 [004]78.464315: funcgraph_entry:  ! 327.243 us |  
cma_alloc();
  bash-10478 [001]79.466018: funcgraph_entry:  ! 306.905 us |  
cma_alloc();
  bash-10478 [004]80.468005: funcgraph_entry:  ! 284.539 us |  
cma_alloc();
  bash-10478 [002]81.469718: funcgraph_entry:  ! 367.216 us |  
cma_alloc();

Maybe we can separate enter and exit if they happened on different
cpu.  Anyway the time duration has correct value now.

Reported-by: Joonsoo Kim 
Signed-off-by: Namhyung Kim 
---
 trace-ftrace.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/trace-ftrace.c b/trace-ftrace.c
index 636b08b..edc9349 100644
--- a/trace-ftrace.c
+++