On Mon, 2010-02-22 at 15:07 +0100, Stephane Eranian wrote: > On Thu, Feb 18, 2010 at 11:25 PM, Peter Zijlstra <pet...@infradead.org> wrote: > > On Sun, 2010-02-14 at 11:12 +0100, Peter Zijlstra wrote: > >> > >> Dealing with context switches is also going to be tricky, where we have > >> to safe and 'restore' LBR stacks for per-task counters. > > > > OK, so I poked at the LBR hardware a bit, sadly the TOS really doesn't > > count beyond the few bits it requires :-( > > > > The TOS is also a read-only MSR.
well, r/o is fine. > > I had hopes it would, since that would make it easier to share the LBR, > > simply take a TOS snapshot when you schedule the counter in, and never > > roll back further for that particular counter. > > > > As it stands we'll have to wipe the full LBR state every time we 'touch' > > it, which makes it less useful for cpu-bound counters. > > > Yes, you need to clean it up each time you snapshot it and each time > you restore it. > > The patch does not seem to handle LBR context switches. Well, it does, but sadly not in a viable way, it assumes the TOS counts more than the required bits and stops the unwind on hwc->lbr_tos snapshot. Except that the TOS doesn't work that way. This whole PEBS/LBR stuff is a massive trainwreck from a design pov. > > Also, not all hw (core and pentium-m) supports the freeze_lbrs_on_pmi > > bit, what we could do for those is stick an unconditional LBR disable > > very early in the NMI path and simply roll back the stack until we hit a > > branch into the NMI vector, that should leave a few usable LBR entries. > > > You need to be consistent across the CPUs. If a CPU does not provide > freeze_on_pmi, then I would simply not support it as a first approach. > Same thing if the LBR is less than 4-deep. I don't think you'll get anything > useful out of it. Well, if at the first branch into the NMI handler you do an unconditional LBR disable, you should still have 3 usable records. But yeah, the 1 deep LBR chips (p6 and amd) are pretty useless for this purpose and are indeed not supported. > The patch does not address the configuration options available on Intel > Nehalem/Westmere, i.e., LBR_SELECT (see Vol 3a table 16-9). We can > handle priv level separately as it can be derived from the event exclude_*. > But it you want to allow multiple events in a group to use PERF_SAMPLE_LBR > then you need to ensure LBR_SELECT is set to the same value, priv levels > included. Yes, I explicitly skipped that because of the HT thing and because like I argued in an earlier reply, I don't see much use for it, that is, it significantly complicates matters for not much (if any) benefit. As it stands LBR seems much more like a hw-breakpoint feature than a PMU feature, except for this trainwreck called PEBS. > Furthermore, LBR_SELECT is shared between HT threads. We need to either > add another field in perf_event_attr or encode this in the config > field, though it > is ugly because unrelated to the event but rather to the sample_type. > > The patch is missing the sampling part, i.e., dump of the LBR (in sequential > order) into the sampling buffer. Yes, I just hacked enough stuff together to poke at the hardware a bit, never said it was anywhere near complete. > I would also select a better name than PERF_SAMPLE_LBR. LBR is an > Intel thing. Maybe PERF_SAMPLE_TAKEN_BRANCH. Either LAST_BRANCH (suggesting a single entry), or BRANCH_STACK (suggesting >1 possible entries) seem more appropriate. Supporting only a single entry, LAST_BRANCH, seems like an attractive enough option, the use of multiple steps back seem rather pointless for interpreting the sample. ------------------------------------------------------------------------------ Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev _______________________________________________ perfmon2-devel mailing list perfmon2-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/perfmon2-devel