> -----Original Message-----
> From: stephane eranian [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, December 02, 2008 8:45 PM
> To: Dan Terpstra
> Cc: perfmon2-devel
> Subject: Re: [perfmon2] Intel Core i7 specs available.
>
> On Wed, Dec 3, 2008 at 2:43 AM, Dan Terpstra <[EMAIL PROTECTED]> wrote:
> >> Counting requires interrupt to virtualize the counters to 64-bit.
> >> They are 48 bits if I recall.
> >>
> > Yes, they're 48-bit, but why do you need interrupt to virtualize? With a
> > multi-tasking OS, you just do it at context switch. I still think it will be
> > important to count in both domains at once...
>
> There is no context switch in system-wide mode. You program the counters
> and you let them run.
>
Didn't realize that. But it makes sense.
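
To make the point above concrete, here is a minimal toy model of how a 48-bit hardware counter could be presented as a 64-bit value by folding each overflow into a software-maintained part. It is only a sketch under stated assumptions: the class and method names (`VirtualCounter`, `on_overflow_interrupt`) are hypothetical and this is not perfmon's actual code. In system-wide mode there is no context switch at which to do the folding, so the overflow interrupt is the only hook.

```python
HW_BITS = 48                      # counter width mentioned in the thread
HW_MASK = (1 << HW_BITS) - 1
U64_MASK = (1 << 64) - 1


class VirtualCounter:
    """Toy model of a 48-bit hardware counter exposed as a 64-bit value."""

    def __init__(self):
        self.hw = 0    # what the 48-bit hardware register would hold
        self.soft = 0  # software-maintained part, always a multiple of 2**48

    def count(self, events):
        """Simulate the hardware counting `events` more events."""
        total = self.hw + events
        overflows = total >> HW_BITS      # times the 48-bit register wrapped
        self.hw = total & HW_MASK
        for _ in range(overflows):
            self.on_overflow_interrupt()

    def on_overflow_interrupt(self):
        # Hypothetical handler: fold one hardware wrap into the software part.
        self.soft = (self.soft + (1 << HW_BITS)) & U64_MASK

    def read(self):
        """Return the virtualized 64-bit count seen by tools."""
        return (self.soft + self.hw) & U64_MASK
```

A tool reading the counter only ever calls `read()` and never sees the 48-bit wrap, which is the property that matters for sampling.
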
> Also the perfmon API exports all counters as 64-bit wide. This is very
> useful for tools, especially when sampling.
>
Agreed, completely. But I still want both sets of events :)
- d

>
> > - d
> >
> >> -----Original Message-----
> >> From: stephane eranian [mailto:[EMAIL PROTECTED]
> >> Sent: Tuesday, December 02, 2008 12:54 PM
> >> To: Dan Terpstra
> >> Cc: perfmon2-devel
> >> Subject: Re: [perfmon2] Intel Core i7 specs available.
> >>
> >> Dan,
> >>
> >> On Tue, Dec 2, 2008 at 5:10 PM, Dan Terpstra <[EMAIL PROTECTED]> wrote:
> >> >
> >> > Are you referring to the Uncore Address/Opcode Match stuff (18.17.2.3)? I
> >> > saw that, but wasn't quite sure how to use it. I didn't see anything in the
> >> > PEBS stuff that looked like Data EAR. Or is this part of the Load Latency
> >> > stuff that's described in (18.17.1.2)? Looks like part of the latency stuff
> >> > includes a Data Address.
> >> >>
> >>
> >> No, I was indeed referring to offcore (which is different from uncore).
> >> Yes, that's the load latency PEBS I was talking about. It does give
> >> you the cache miss information similar to Itanium D-EAR: you get
> >> instr and data addresses, latency, source of the data in addition
> >> to the machine state, which is quite nice.
> >>
> >> >> You missed one thing, however, the offcore_response feature. That one
> >> >> is tricky because it uses a register that is shared per core (if I recall).
> >> >> Perfmon handles offcore_response similarly to what is going on with
> >> >> AMD northbridge events.
> >> >> It enforces some form of mutual exclusion.
> >> >>
> >> > Yes, the off-core response stuff can be coded into any of the generic
> >> > registers on any core, but it shares a single common configuration register.
> >> > Exclusion logic for this guy could be fun. It looks like this takes the
> >> > place of the SELF/ANY modifiers used in earlier Core architectures for
> >> > events that probed shared cache?
> >> >>
> >>
> >> Well, yes this is tricky. The current code does the following:
> >>   - only one system-wide session per physical core (each physical
> >>     core has 2 threads)
> >>   - only one per-thread session across the entire system (otherwise
> >>     you have problems in case of migration).
> >>
> >> > Who owns the system-wide session? First-come first-served? Can it be any
> >> > thread or must it be a specific core? And if you restrict access to counting
> >> > (calipers), couldn't you do per-thread access without worrying about
> >> > overflow?
> >> >>
> >>
> >> For uncore, the first system-wide session which asks for it, gets it.
> >> It can be coming from any core/threads on the socket.
> >>
> >> >> > I'm not sure that uncore counters should be restricted to system-wide
> >> >> > counting only; I think it could be quite useful, as Phil described for
> >> >> > SiCortex, to measure "what's happening to this shared resource while I'm
> >> >> > active". That's not unlike Component PAPI measuring network activity on a
> >> >>
> >> Counting requires interrupt to virtualize the counters to 64-bit. They are 48
> >> bits if I recall.
> >>
> >>
> >> >> No, this is not yet supported. I think on x86, this is not that far off.
> >> >>
> >> > Could you do first-person monitoring in a parent thread and spawn a daughter
> >> > thread to measure uncore stuff? Or maybe even fork a new process?
> >> >>
> >>
> >> No, this is currently restricted to system-wide sessions only.
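
The two session rules quoted above for offcore_response (one system-wide session per physical core, one per-thread session across the whole system) can be sketched as a small arbiter. This is only an illustration, not perfmon's implementation. Everything named here is hypothetical, including `OffcoreArbiter` and the `cpu // 2` mapping from logical CPU to physical core (real topologies need not number sibling threads that way). First-come first-served ownership is stated in the thread only for uncore, so applying it here is an assumption, as is treating the two session types independently.

```python
class OffcoreArbiter:
    """Toy arbiter for a configuration register shared per physical core."""

    def __init__(self, threads_per_core=2):
        self.threads_per_core = threads_per_core
        self.system_wide_owner = {}   # physical core id -> owning logical CPU
        self.per_thread_owner = None  # thread id of the single per-thread session

    def core_of(self, cpu):
        # Assumption: sibling hardware threads have adjacent logical CPU numbers.
        return cpu // self.threads_per_core

    def acquire_system_wide(self, cpu):
        """Grant at most one system-wide session per physical core."""
        core = self.core_of(cpu)
        if core in self.system_wide_owner:
            return False
        self.system_wide_owner[core] = cpu
        return True

    def release_system_wide(self, cpu):
        core = self.core_of(cpu)
        if self.system_wide_owner.get(core) == cpu:
            del self.system_wide_owner[core]

    def acquire_per_thread(self, tid):
        """Grant at most one per-thread session system-wide (migration-safe)."""
        if self.per_thread_owner is not None:
            return False
        self.per_thread_owner = tid
        return True

    def release_per_thread(self, tid):
        if self.per_thread_owner == tid:
            self.per_thread_owner = None
```

With this model, a request from logical CPU 1 is refused while CPU 0 (its assumed sibling) holds the core, and succeeds once CPU 0 releases it.
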
_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel