> -----Original Message-----
> From: stephane eranian [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, December 02, 2008 8:45 PM
> To: Dan Terpstra
> Cc: perfmon2-devel
> Subject: Re: [perfmon2] Intel Core i7 specs available.
> 
> On Wed, Dec 3, 2008 at 2:43 AM, Dan Terpstra <[EMAIL PROTECTED]>
> wrote:
> >> Counting requires interrupt to virtualize the counters to 64-bit. They
> >> are 48 bits if I recall.
> >>
> > Yes, they're 48-bit, but why do you need interrupt to virtualize? With a
> > multi-tasking OS, you just do it at context switch. I still think it will
> > be important to count in both domains at once...
> 
> There is no context switch in system-wide mode. You program the counters
> and you let them run.
> 
Didn't realize that. But it makes sense.

> Also the perfmon API exports all counters as 64-bit wide. This is very
> useful for tools, especially when sampling.
> 
Agreed, completely. But I still want both sets of events :)
- d
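
A minimal sketch of the counter virtualization discussed above, in case it
helps: a 48-bit hardware counter is extended to 64 bits by keeping the upper
bits in software and bumping them on each overflow interrupt. The names
SimulatedCounter and Virtual64 are hypothetical, and the "hardware" is a toy
model; this is not perfmon code, only an illustration of why system-wide
sessions (which never see a context switch) rely on the interrupt.

```python
HW_WIDTH = 48                      # counter width as recalled in the thread
HW_MASK = (1 << HW_WIDTH) - 1
MASK64 = (1 << 64) - 1


class SimulatedCounter:
    """Toy model of a 48-bit hardware counter (hypothetical)."""

    def __init__(self):
        self.value = 0
        self.overflow_pending = False

    def advance(self, events):
        # Assumption: fewer than 2**48 events per call, so the counter
        # wraps at most once between two interrupts.
        total = self.value + events
        if total > HW_MASK:
            self.overflow_pending = True   # hardware would raise an interrupt
        self.value = total & HW_MASK


class Virtual64:
    """64-bit software view of the 48-bit counter (hypothetical)."""

    def __init__(self, hw):
        self.hw = hw
        self.upper = 0                     # software-maintained high bits

    def on_overflow_interrupt(self):
        # Each wrap of the hardware counter is worth 2**48 events.
        if self.hw.overflow_pending:
            self.upper += 1 << HW_WIDTH
            self.hw.overflow_pending = False

    def read(self):
        return (self.upper + self.hw.value) & MASK64
```

In per-thread mode the same accumulation could in principle be done at
context switch, as suggested above; in system-wide mode the interrupt handler
is the only place left to do it.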

> 
> > - d
> >
> >> -----Original Message-----
> >> From: stephane eranian [mailto:[EMAIL PROTECTED]
> >> Sent: Tuesday, December 02, 2008 12:54 PM
> >> To: Dan Terpstra
> >> Cc: perfmon2-devel
> >> Subject: Re: [perfmon2] Intel Core i7 specs available.
> >>
> >> Dan,
> >>
> >> On Tue, Dec 2, 2008 at 5:10 PM, Dan Terpstra <[EMAIL PROTECTED]>
> >> wrote:
> >> >
> >> > Are you referring to the Uncore Address/Opcode Match stuff (18.17.2.3)?
> >> > I saw that, but wasn't quite sure how to use it. I didn't see anything in
> >> > the PEBS stuff that looked like Data EAR. Or is this part of the Load
> >> > Latency stuff that's described in (18.17.1.2)? Looks like part of the
> >> > latency stuff includes a Data Address.
> >> >>
> >>
> >> No, I was indeed referring to offcore (which is different from uncore).
> >> Yes, that's the load latency PEBS I was talking about. It does give
> >> you cache miss information similar to Itanium D-EAR: you get
> >> instruction and data addresses, latency, and the source of the data,
> >> in addition to the machine state, which is quite nice.
> >>
> >> >> You missed one thing, however, the offcore_response feature. That one
> >> >> is tricky because it uses a register that is shared per core (if I
> >> >> recall). Perfmon handles offcore_response similarly to what is going on
> >> >> with the AMD northbridge events.
> >> >> It enforces some form of mutual exclusion.
> >> >>
> >> > Yes, the off-core response stuff can be coded into any of the generic
> >> > registers on any core, but it shares a single common configuration
> >> > register. Exclusion logic for this guy could be fun. It looks like this
> >> > takes the place of the SELF/ANY modifiers used in earlier Core
> >> > architectures for events that probed shared cache?
> >> >>
> >> Well, yes this is tricky. The current code does the following:
> >>    - only one system-wide session per physical core (each physical
> >>      core has 2 threads)
> >>    - only one per-thread session across the entire system (otherwise
> >>      you have problems in case of migration).
> >>
> >> > Who owns the system-wide session? First-come first-served? Can it be any
> >> > thread or must it be a specific core? And if you restrict access to
> >> > counting (calipers), couldn't you do per-thread access without worrying
> >> > about overflow?
> >> >>
> >> For uncore, the first system-wide session that asks for it gets it.
> >> It can come from any core or thread on the socket.
> >>
> >> >> > I'm not sure that uncore counters should be restricted to system-wide
> >> >> > counting only; I think it could be quite useful, as Phil described for
> >> >> > SiCortex, to measure "what's happening to this shared resource while
> >> >> > I'm active". That's not unlike Component PAPI measuring network
> >> >> > activity on a
> >> >>
> >> Counting requires interrupt to virtualize the counters to 64-bit. They
> >> are 48 bits if I recall.
> >>
> >>
> >> >> No, this is not yet supported. I think on x86, this is not that far
> >> >> off.
> >> >>
> >> > Could you do first-person monitoring in a parent thread and spawn a
> >> > daughter thread to measure uncore stuff? Or maybe even fork a new process?
> >> >>
> >> No, this is currently restricted to system-wide sessions only.
> >
> >


_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
