Carl,

On Tue, May 01, 2007 at 07:54:30AM -0700, Carl Love wrote:
> >
> > One workaround for this problem would be to systematically include all
> > read-only registers as part of used_pmds but that would penalize the
> > context switch out code.
>
> I agree having to read all read-only regs as part of used_pmd by default
> could result in a lot of extra overhead.  I suspect in most cases there
> would only be a couple registers but it is possible an architecture
> could have a number of read only registers.  It would be a performance
> impact to save pmds when they may not get read.
>
It is hard to tell how many read-only registers could really be used. Not all
read-only PMD registers would necessarily map to HW registers. There is enough
support to have read-only virtual PMDs that map to a SW resource, e.g.,
current->pid. Those could be less costly to save.

I think that systematically including the read-only PMDs in used_pmds is a
last-resort solution and I'd like to explore others first.
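
To make the cost argument above concrete, here is a minimal sketch of a lazy
save driven by a used_pmds bitmask. It is only a model: the structure, the
register count, and every model_* name are hypothetical and are not taken from
the perfmon code. It only shows that each extra bit in used_pmds adds one
register read to the switch-out path.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NUM_PMDS 8            /* hypothetical register count */

/* Hypothetical model of a monitoring context; not the perfmon structures. */
struct model_ctx {
    uint64_t used_pmds;             /* bit i set => PMD i is saved at switch-out */
    uint64_t saved[MODEL_NUM_PMDS]; /* last saved value of each PMD */
};

/* Stand-in for the PMD registers of the current CPU. */
static uint64_t model_hw[MODEL_NUM_PMDS];

/* Declare a PMD as used so that it gets saved from now on. */
static void model_mark_used(struct model_ctx *ctx, unsigned int reg)
{
    assert(reg < MODEL_NUM_PMDS);
    ctx->used_pmds |= UINT64_C(1) << reg;
}

/*
 * Lazy save: only registers flagged in used_pmds are read.
 * Returns the number of registers read, i.e. the switch-out cost.
 */
static unsigned int model_switch_out(struct model_ctx *ctx)
{
    unsigned int reg, nsaved = 0;

    for (reg = 0; reg < MODEL_NUM_PMDS; reg++) {
        if (ctx->used_pmds & (UINT64_C(1) << reg)) {
            ctx->saved[reg] = model_hw[reg];
            nsaved++;
        }
    }
    return nsaved;
}
```

In this model, a read-only register that was never marked used keeps a saved
value of zero, which is why marking every read-only register by default trades
correctness of later reads against a longer switch-out loop.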
> > Another way would be to force the user to issue a pfm_read_pmds() prior to
> > starting monitoring to add the read-only register to used_pmds.
>
> Yes I agree, this is the only way my scheme would be guaranteed to work.
> In most cases, I think you probably would read it prior to starting
> monitoring.  If we were to require it, then again we would have a
> requirement that would not be intuitively obvious.  I can envision a
> case where someone would want to periodically sample the register and
> the first sample would not occur until monitoring started.  Again, how
> do you tell users that there is this non-intuitive requirement on read
> only registers?
>
In the context where the user is interested in deltas, it makes sense to issue
an initial read. If the context is not attached, the returned value will be
zero. If the context is attached, then either the actual HW register is read
or the last saved value is returned, and that one may be zero on first use.

The key problem is that some pfm_read_pmds() calls do not end up reading the
actual registers, either because the context is detached or because the thread
is context switched out. For those there is no way to get to the actual
register because the caller may be running on a different CPU or the read-only
register may be different from what it was at the time the monitored thread
was last switched out. One way to avoid this is to ensure that the read-only
register is marked used before the context is attached.

Note also that perfmon does not know about the relationship between config
(PMC) and data (PMD) registers. This is another reason why you cannot read a
PMD you have not previously written, i.e., declared as used. Without this, if
you were to write a PMC but not its associated PMD, you could potentially read
garbage or a stale value from another thread.

Another issue related to read-only PMDs arises if they are passed as
reset_pmds of another PMD, i.e., the user asks for them to be reset when a
counter overflows.
Of course, they cannot be reset, so I think we need to reject a
pfm_write_pmds() that specifies a read-only PMD in the reset_pmds bitmask.
I am not sure we do this today. I will check on that.

> >
> > > The first thought is to have an explicit call to register the read only
> > > pmd.  It would not be necessary to register it as a pmd to be restored.
> >
> > We already have a lot of system calls. It would be hard to justify adding
> > one more to register read-only PMDs.
> >
> > > You may want it as an accumulated value on.  For Power, it is a 64 bit
> > > reg already so it will never overflow and need accumulating.  But that
> > > may not be the case for all architectures.  Obviously, that means yet
> > > another system call.
>
> Yes, I also recognized that having yet another system call is not really
> desirable.  Hence my comment above.
>
> >
> > Note that the kernel does not assume anything about read-only registers,
> > they could be counters or most likely something else. There cannot be
> > 64-bit emulation on read-only registers simply because they cannot be
> > modified.
> >
> >
> > > The second option is to register reads to read only pmds when the PMD is
> > > read.  Consider the time base register (TBR).  Currently, you must write
> > > the TBR, the write is ignored so nothing happens to the TBR.  Then read
> > > the TBR to get the initial value.  Run the app and then read the TBR
> > > again.  If we were to register the TBR on the initial read, we could
> > > eliminate the write which doesn't make sense anyway.  In the end, the
> > > user would just do the initial read and the final read and subtract them
> > > to get the elapsed time.  To me this make more sense.  Additionally,
> > > only the bit masks that are really needed would have their bits set,
> > > i.e. used_pmds, accumulated pmds.  You would not include it in the
> > > restore pmd mask.
> > >
> > The double-read is what I suggest above but there is something I don't like
> > about this approach.
> > The optimization done by the implementation is exposed
> > through the interface and that is bad. The interface definition must remain
> > disconnected from the implementation and especially from fancy optimization
> > schemes, e.g., lazy save.
> >
> > What do you think?
> >
> > --
> > -Stephane
> > _______________________________________________
> > perfmon mailing list
> > [email protected]
> > http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/
>
> I was thinking about a bit vector that could be used to track if it was
> the first read to a read only pmr.  If it was, you would register the

That is what used_pmds is doing for all PMDs today. If the bit is not set,
then the register has never been accessed.

> use of the pmr and go to the HW to read the register rather than from
> the context.  In your example above, this would result in potentially a

As I explained above, there are many situations where you cannot go to
hardware on pfm_read_pmds(), either because you are running on the wrong CPU
or because the value would have no meaning at the time of the read compared to
the time at which monitoring for the thread was last stopped.

> long delay between when process B stopped and process A came in to read
> the register.  Hence the value particularly in the case of the TBR could
> be significantly different.  So clearly this doesn't work either.  I
> will think about this more.  It really would be nice to have an
> implementation that would be intuitively cleaner.  Thank you for your
> explanation as it helps me to understand all of the issues involved.

No problem.

--
-Stephane
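
For reference, the read rules and the reset_pmds check discussed in this
message can be sketched as follows. This is an illustrative model only: the
structure, the sketch_* names, the -1 error convention, and the
live_on_this_cpu flag are assumptions made for the example, not the perfmon
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NUM_PMDS 8            /* hypothetical register count */

/* Hypothetical context model; field names are assumptions. */
struct sketch_ctx {
    uint64_t used_pmds;              /* PMDs declared as used */
    uint64_t readonly_pmds;          /* PMDs that cannot be written or reset */
    uint64_t saved[SKETCH_NUM_PMDS]; /* values captured at last switch-out */
    bool     live_on_this_cpu;       /* attached and running on the caller's CPU */
};

/* Stand-in for the PMD registers of the current CPU. */
static uint64_t sketch_hw[SKETCH_NUM_PMDS];

/*
 * Read one PMD. A register never declared as used is refused. Otherwise
 * the hardware is consulted only when the context is live on this CPU;
 * in every other case the last saved value (possibly zero) is returned.
 */
static int sketch_read_pmd(const struct sketch_ctx *ctx, unsigned int reg,
                           uint64_t *val)
{
    assert(reg < SKETCH_NUM_PMDS);
    if (!(ctx->used_pmds & (UINT64_C(1) << reg)))
        return -1;
    *val = ctx->live_on_this_cpu ? sketch_hw[reg] : ctx->saved[reg];
    return 0;
}

/* Reject a reset_pmds bitmask that names any read-only PMD. */
static int sketch_check_reset_pmds(const struct sketch_ctx *ctx,
                                   uint64_t reset_pmds)
{
    return (reset_pmds & ctx->readonly_pmds) ? -1 : 0;
}
```

With this model, a delta measurement on a read-only register only makes sense
if the register's bit is already in used_pmds before the context is attached,
since otherwise the first read is either refused or returns a saved value that
was never captured.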
