Dan,

On Tue, Dec 2, 2008 at 5:10 PM, Dan Terpstra <[EMAIL PROTECTED]> wrote:
>
> Are you referring to the Uncore Address/Opcode Match stuff (18.17.2.3)? I
> saw that, but wasn't quite sure how to use it. I didn't see anything in the
> PEBS stuff that looked like Data EAR. Or is this part of the Load Latency
> stuff that's described in (18.17.1.2)? Looks like part of the latency stuff
> includes a Data Address.
>
No, I was indeed referring to offcore (which is different from uncore).

Yes, that's the load latency PEBS I was talking about. It does give you
cache miss information similar to the Itanium D-EAR: you get the instruction
and data addresses, the latency, and the source of the data, in addition to
the machine state, which is quite nice.

>> You missed one thing, however, the offcore_response feature. That one
>> is tricky because
>> it uses a register that is shared per core (if I recall).
>> Perfmon handles offcore_response similarly to what is going on with
>> AMD northbridge events.
>> It enforces some form of mutual exclusion.
>>
> Yes, the off-core response stuff can be coded into any of the generic
> registers on any core, but it shares a single common configuration register.
> Exclusion logic for this guy could be fun. It looks like this takes the
> place of the SELF/ANY modifiers used in earlier Core architectures for
> events that probed shared cache?
>
Well, yes, this is tricky. The current code does the following:
  - only one system-wide session per physical core (each physical core
    has 2 threads)
  - only one per-thread session across the entire system (otherwise you
    have problems in case of migration).

> Who owns the system-wide session? First-come first-served? Can it be any
> thread or must it be a specific core? And if you restrict access to counting
> (calipers), couldn't you do per-thread access without worrying about
> overflow?
>
For uncore, the first system-wide session which asks for it gets it. The
request can come from any core/thread on the socket.

>> > I'm not sure that uncore counters should be restricted to system-wide
>> > counting only; I think it could be quite useful, as Phil described for
>> > SiCortex, to measure "what's happening to this shared resource while I'm
>> > active". That's not unlike Component PAPI measuring network activity on
>> a
>>
Counting requires an interrupt to virtualize the counters to 64 bits. They
are 48 bits if I recall.
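
To illustrate the virtualization point, here is a minimal sketch in Python.
It is not the perfmon code; the class and method names are hypothetical, and
the 48-bit width is just the figure recalled above. The idea is that the
hardware register wraps at 2^48, so an overflow interrupt handler has to add
each wrap to a software accumulator, and a read combines the two.

```python
COUNTER_BITS = 48                        # assumed hardware width ("48 bits if I recall")
COUNTER_MASK = (1 << COUNTER_BITS) - 1   # largest value the hardware register can hold
U64_MASK = (1 << 64) - 1                 # the virtual counter is kept to 64 bits


class VirtualCounter:
    """Hypothetical model of one hardware counter extended to 64 bits in software."""

    def __init__(self):
        self.hw = 0     # value the 48-bit hardware register would hold
        self.soft = 0   # wraps accumulated by the overflow handler

    def hw_tick(self, events):
        """Simulate the hardware counting `events`; call the handler once per wrap."""
        total = self.hw + events
        wraps = total >> COUNTER_BITS
        self.hw = total & COUNTER_MASK
        for _ in range(wraps):
            self.on_overflow()

    def on_overflow(self):
        """Stand-in for the overflow interrupt: account for one full 48-bit wrap."""
        self.soft = (self.soft + (1 << COUNTER_BITS)) & U64_MASK

    def read(self):
        """64-bit virtual value: accumulated wraps plus the live hardware value."""
        return (self.soft + self.hw) & U64_MASK
```

Without the interrupt, `read()` would only ever see `hw`, which is why pure
counting still needs the overflow path once the register can wrap.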
No, this is not yet supported. I think on x86, this is not that far off.

> Could you do first-person monitoring in a parent thread and spawn a daughter
> thread to measure uncore stuff? Or maybe even fork a new process?
>
No, this is currently restricted to system-wide sessions only.

_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel