Re: [perfmon] [PATCH] Read only PMD access fix, updated patch file

Stephane Eranian Tue, 01 May 2007 00:47:42 -0700

Carl,

On Mon, Apr 30, 2007 at 04:31:36PM -0700, Carl Love wrote:
> 
> I have a fundamental design issue with requiring a write to a read only
> register to allow you to then read the register.  The approach does work
> on Power so that is not a problem. First of all it is very counter
> intuitive that you should have to write to a read only register so you
> can then read from it.  Secondly, the fact of the matter is you actually
> don't care about the write operation at all.  You really just want is
> the side effect, i.e. registering the PMD.  It seems like there should
> be a more natural way to do that.  The write function ends up adding the
> read only PMD to the PMDs to be reset and accumulated.  Note sure of the
> value of adding the read only PMD to the list of PMDs to be reset.  
> 
I tend to agree with about the current design. I think it is clearly
counter-intuitive and I'd like to correct it. The only reason we allow
pfm_write_pmds() on read-only PMD is to track their usage, i.e., update
the used_pmds bitmasks. This is needed because we only allow pfm_read_pmds()
on registers which have been declared used used either via pfm_write_pmds()
or bcause they have been declared as being sampled (smpl_pmds) or
reset (reset_pmds).


The used_pmds bitmasks is also used on context switch to avoid saving
unused registers. Similarly, it is used on context switch in to only
restore the pmds we use. Finally, it is also used when detaching the
context to save the last HW values into the perfmon context such that
any subsequent pfm_read_pmds() will return the last known value. This
latter property applies also to read-only registers.

On context switch out, we do need to save read-only PMD because any
pfm_read_pmds() by another (controlling) process requires the monitored
thread to be stopped (i.e., switched out). On context switch in, we do
not need to restore read-only registers.

if we use your patch and only start tracking read-only PMD registers
when they are actually read (assuming they are not passed as smpl_pmds
nor reset_pmds), then we have a problem with context switches. Let me
give a short example.

Assume process A is monitored the single-threaded process B and that
register TBR is a read-only PMD. A programs the PMU, attches the
context and then starts active monitoring. TBR is not yet read or written.
Then at some point, A wants to read TBR from B. It first stops process B,
which causes it to be switched out. Only the PMD registers used by B are
saved, that does not include TBR. Process B eventually is off the CPU and
the pmf_read_pmds() can proceed. It will read from the perfmon context 
and not touch HW. The returned value will be zero simply because the kernel
did not know we were interested in TBR and it did not track it. 

One workaround for this problem would be to systematically include all read-only
registers as part of used_pmds but that would penalize the context switch out 
code.

Another way would be to force the user to issue a pfm_read_pmds() prior to 
starting monitoring to add the read-only register to used_pmds.

> The first thought is to have an explicit call to register the read only
> pmd.  It would not be necessary to register it as a pmd to be restored.

We already have a lot of sytem calls. It would be hard to justify adding one 
more
to register read-only PMDs.

> You may want it as an accumulated value on.  For Power, it is a 64 bit
> reg already so it will never overflow and need accumulating.  But that
> may not be the case for all architectures.  Obviously, that means yet
> another system call.  

Note that the kernel does not assume anything about read-only registers,
they could be counters or mostly likely something else. There cannot be
64-bit emulation on read-only register simply because they cannot be modified.


> The second option is to register reads to read only pmds when the PMD is
> read. Consider the time base register (TBR).  Currently, you must write
> the TBR, the write is ignored so nothing happens to the TBR. Then read
> the TBR to get the initial value. Run the app and then read the TBR
> again.  If we were to register the TBR on the initial read, we could
> eliminate the write which doesn't make sense anyway.  In the end, the
> user would just do the initial read and the final read and subtract them
> to get the elapsed time.  To me this make more sense.  Additionally,
> only the bit masks that are really needed would have their bits set,
> i.e. used_pmds, accumulated pmds.  You would not include it in the
> restore pmd mask.
> 
The double-read is was I suggest above but there is something I don't like
about this approach. The optimization done by the implementation is exposed
through the interface and that is bad. The interface definition must remain
disconnected from the implementation and especially from fancy optimization
schemes, e.g., lazy save.

What do you think?

--
-Stephane
_______________________________________________
perfmon mailing list
[email protected]
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/

Re: [perfmon] [PATCH] Read only PMD access fix, updated patch file

Reply via email to