Em Tue, Sep 22, 2015 at 04:25:34PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Tue, Sep 22, 2015 at 09:24:55AM +0900, Namhyung Kim escreveu:
> > When perf creates a new child to profile, the events are enabled on
> > exec().  And in this case, it doesn't synthesize any event for the
> > child since they'll be generated during exec().  But there's an window
> > between the enabling and the event generation.
> > 
> > It used to be overcome since samples are only in kernel (so we always
> > have the map) and the comm is overridden by a later COMM event.
> > However it won't work if events are processed and displayed before the
> > COMM event overrides like in 'perf script'.  This leads to those early
> > samples (like native_write_msr_safe) not having a comm but pid (like
> > ':15328').
> > 
> > So it needs to synthesize COMM event for the child explicitly before
> > enabling so that it can have a correct comm.  But at this time, the
> > comm will be "perf" since it's not exec-ed yet.
> 
> Don't we use enable-on-exec?
> 
>   # perf record usleep 1
>   [ perf record: Woken up 1 times to write data ]
>   [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
>   # perf evlist -v
>   cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
>   IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1,
>   enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2:
>   1, comm_exec: 1
>   #
> 
> Ok, but even then, if we use --show-task-events, we see that the comm
> appears only later...
> 
>   # perf script --show-task-events
>     :4429  4429 27909.079372:          1 cycles:  ffffffff8105f45a 
> native_write_msr_safe (/lib/modules/4.
>     :4429  4429 27909.079375:          1 cycles:  ffffffff8105f45a 
> native_write_msr_safe (/lib/modules/4.
>     :4429  4429 27909.079376:         10 cycles:  ffffffff8105f45a 
> native_write_msr_safe (/lib/modules/4.
>     :4429  4429 27909.079377:        223 cycles:  ffffffff8105f45a 
> native_write_msr_safe (/lib/modules/4.
>     :4429  4429 27909.079378:       6571 cycles:  ffffffff8105f45a 
> native_write_msr_safe (/lib/modules/4.
>    usleep  4429 27909.079380: PERF_RECORD_COMM exec: usleep:4429/4429
>    usleep  4429 27909.079381:     185403 cycles:  ffffffff810a72d3 
> flush_signal_handlers (/lib/modules/4.
>    usleep  4429 27909.079444:    2241110 cycles:      7fc575355be3 _dl_start 
> (/usr/lib64/ld-2.20.so)
>    usleep  4429 27909.079875: PERF_RECORD_EXIT(4429:4429):(4429:4429)
> 
> While with your patch we see:
> 
>   # perf record usleep 1
>   [ perf record: Woken up 1 times to write data ]
>   [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
>   # perf script --show-task-events
>      perf     0     0.000000: PERF_RECORD_COMM: perf:6571/6571
>      perf  6571 28378.609330:          1 cycles:  ffffffff8105f45a 
> native_write_msr_safe (/lib/modules/4.
>      perf  6571 28378.609334:          1 cycles:  ffffffff8105f45a 
> native_write_msr_safe (/lib/modules/4.
>      perf  6571 28378.609335:          7 cycles:  ffffffff8105f45a 
> native_write_msr_safe (/lib/modules/4.
>      perf  6571 28378.609336:        163 cycles:  ffffffff8105f45a 
> native_write_msr_safe (/lib/modules/4.
>      perf  6571 28378.609337:       4795 cycles:  ffffffff8105f45a 
> native_write_msr_safe (/lib/modules/4.
>      perf  6571 28378.609340:     119283 cycles:  ffffffff81189bdd 
> perf_event_comm_output (/lib/modules/4
>    usleep  6571 28378.609340: PERF_RECORD_COMM exec: usleep:6571/6571
>    usleep  6571 28378.609380:    1900804 cycles:  ffffffff810a0b40 
> find_next_iomem_res (/lib/modules/4.3.
>    usleep  6571 28378.609879: PERF_RECORD_EXIT(6571:6571):(6571:6571)
> 
> Better indeed, I'm adding this before/after so that the changeset comment
> can fully illustrate what happens.
> 
> But you see there is still a problem with that synthesized COMM, right? the 
> one
> coming from the kernel has the 6571 there, while the synthesized one 
> doesn't...
> Checking...
> 
> But I'll apply your patch.

Humm, I think it is working by accident, as you're not allocating space
for machine->id_hdr_size, please take a look at
perf_event__synthesize_thread_map().

Right now its not a problem this line in perf_event__prepare_comm():

        memset(event->comm.comm + size, 0, machine->id_hdr_size);

Because perf_event is an union and some of its elements, like mmap/mmap2
have that PATH_MAX part, but its just a matter of the id_hdr_size
becoming bigger than that and we'll have a problem...

That zero there is probably because we're not synthesizing the
PERF_SAMPLE_ part, that would be encoded in this id_hdr_size area, we're
zeroing it all and then when showing it in perf script, we see zeros.

Would be good to do that, will try if you don't do it first 8-)

But have some other stuff to do first :-)

- Arnaldo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to