Re: perf_event wakeup_events = 0

2019-09-07 Thread Theodore Dubois
On Sep 7, 2019, at 3:45 PM, Valdis Klētnieks wrote:

> So an entry is made in the buffer. It's not clear that this immediately
> triggers a signal…

I think the documentation says it does when wakeup_events is 1. The code for
perf backs this up:
https://github.com/torvalds/linux/blob/a9815a4fa2fd297cab9fa7a12161b16657290293/tools/perf/util/evsel.c#L1051-L1054
The puzzle is what happens when wakeup_events is 0. The documentation saying
"more recent kernels treat 0 the same as 1" suggests it should behave the same,
but then why would perf set it to 1 after zero-initializing it?

> So you need to look at what size mmap buffer is being allocated.  It's
> *probably* on the order of megabytes, so that you can buffer a fairly large
> number of entries and not take several user/kernel transitions on every
> single entry…

It’s 512 KiB. Each sample is 40 bytes (the sample_type is IP | TID | TIME |
PERIOD, each of which is 8 bytes, plus the 8-byte record header). 40 bytes per
sample * 4000 samples per second * 1.637 seconds is 261920 bytes, which is
almost exactly half the buffer.

So does wakeup_events = 0 mean it causes a wakeup when the buffer is half
full? I don't see anything in the man page about this.
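
The arithmetic above can be checked with a short Python sketch. The record
size, sample rate, poll delay and buffer size are the figures from this
message; the variable names are only illustrative, and the sketch assumes the
kernel hits the requested sample rate exactly.

```python
# Figures taken from the message above; names are illustrative only.
SAMPLE_BYTES = 40            # bytes per sample record, as stated above
SAMPLE_FREQ_HZ = 4000        # sample_freq requested from perf_event_open
POLL_DELAY_MS = 1637         # observed delay before poll() returned POLLIN
BUFFER_BYTES = 512 * 1024    # buffer size, as stated above

# Integer arithmetic (in milliseconds) avoids floating-point rounding noise.
bytes_written = SAMPLE_BYTES * SAMPLE_FREQ_HZ * POLL_DELAY_MS // 1000
half_buffer = BUFFER_BYTES // 2

print(bytes_written)                 # bytes produced before the wake-up
print(half_buffer)                   # half of the buffer
print(half_buffer - bytes_written)   # shortfall relative to the halfway mark
```

The shortfall is only a handful of sample records, which is consistent with
(but does not prove) a wake-up at the halfway mark.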

If you'd like to try it yourself, this is the strace command I've been using:
strace -ttTv -eperf_event_open,mmap,poll -operf.strace perf record stress --cpu 1 --timeout 1

~Theodore




Re: perf_event wakeup_events = 0

2019-09-07 Thread Valdis Klētnieks
On Sat, 07 Sep 2019 09:14:49 -0700, Theodore Dubois said:

Reading what it actually says rather than what I thought it said.. :)

   Events come in two flavors: counting and sampled.  A counting event  is
   one  that  is  used  for  counting  the aggregate number of events that
   occur.  In general, counting event results are gathered with a  read(2)
   call.   A  sampling  event periodically writes measurements to a buffer
   that can then be accessed via mmap(2).

For some reason, I was thinking counting events.  -ENOCAFFEINE. :)

> sample_freq is 4000 (and freq is 1). Here’s the man page on this field:
>
>   sample_period, sample_freq
>       A "sampling" event is one that generates an overflow notification
>       every N events, where N is given by sample_period.  A sampling
>       event has sample_period > 0.

There's this part:
>   A sampling event has sample_period > 0.  When an overflow occurs,
>   requested data is recorded in the mmap buffer.  The sample_type
>   field controls what data is recorded on each overflow.

So an entry is made in the buffer. It's not clear that this immediately triggers
a signal...

   MMAP layout
   When using perf_event_open() in sampled mode, asynchronous events (like
   counter overflow or PROT_EXEC mmap tracking) are logged  into  a  ring-
   buffer.  This ring-buffer is created and accessed through mmap(2).

   The mmap size should be 1+2^n pages, where the first page is a metadata
   page (struct perf_event_mmap_page) that contains various bits of
   information such as where the ring-buffer head is.

So you need to look at what size mmap buffer is being allocated.  It's
*probably* on the order of megabytes, so that you can buffer a fairly large
number of entries and not take several user/kernel transitions on every
single entry...
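
As a rough illustration of the 1+2^n rule quoted from the man page, here is a
small Python sketch. The helper names are hypothetical (they are not part of
any perf API), and the 4096-byte page size used in the example is an
assumption; the real page size depends on the machine.

```python
import mmap

def perf_mmap_length(n, page_size=mmap.PAGESIZE):
    """Total mmap length in bytes: one metadata page plus 2**n data pages."""
    return (1 + 2 ** n) * page_size

def is_valid_perf_mmap_length(length, page_size=mmap.PAGESIZE):
    """True if length is exactly (1 + 2**n) pages for some integer n >= 0."""
    pages, remainder = divmod(length, page_size)
    if remainder or pages < 2:
        return False
    data_pages = pages - 1  # everything after the metadata page
    return (data_pages & (data_pages - 1)) == 0  # power-of-two check

# Assumed 4 KiB pages: n = 7 gives 128 data pages (512 KiB) plus 1 metadata page.
print(perf_mmap_length(7, page_size=4096))
print(is_valid_perf_mmap_length(512 * 1024, page_size=4096))
```

Note that a length of exactly 512 KiB fails the check under this assumption,
because it leaves no room for the metadata page.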

> If I’m reading this right, this is a sampling event which overflows 4000
> times a second.

And 4,000 entries are made in the buffer per second..

> But perf then does a poll call which wakes up on this FD with POLLIN after
> 1.637 seconds, instead of 0.00025 seconds

At which point perf goes and looks at several thousand entries in the ring
buffer...
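
To put a number on "several thousand", here is a minimal Python sketch using
only the figures quoted above (4000 samples per second, poll returning after
1.637 seconds). It assumes the kernel delivers samples at exactly the
requested rate, which is an idealization.

```python
from fractions import Fraction

SAMPLE_FREQ_HZ = 4000                     # sample_freq from the quoted message
OBSERVED_DELAY_S = Fraction(1637, 1000)   # poll() returned after 1.637 s

# Time between consecutive samples, and samples accumulated before the wake-up.
interval_s = Fraction(1, SAMPLE_FREQ_HZ)
samples_buffered = OBSERVED_DELAY_S * SAMPLE_FREQ_HZ

print(float(interval_s))      # seconds between samples at this rate
print(int(samples_buffered))  # entries waiting in the ring buffer at wake-up
```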




Re: perf_event wakeup_events = 0

2019-09-07 Thread Valdis Klētnieks
On Sat, 07 Sep 2019 09:14:49 -0700, Theodore Dubois said:

> If I’m reading this right, this is a sampling event which overflows 4000
> times a second. But perf then does a poll call which wakes up on this FD with
> POLLIN after 1.637 seconds, instead of 0.00025 seconds.

No, it *takes a sample* 4,000 times a second.  For instance, number of cache
line misses since the last sample.  You get an overflow when the counter wraps
because there have been more than 2^32 events since you read the counter.

At least that's my understanding of it.

