Dan,

On Tue, Dec 2, 2008 at 5:10 PM, Dan Terpstra <[EMAIL PROTECTED]> wrote:
>
> Are you referring to the Uncore Address/Opcode Match stuff (18.17.2.3)? I
> saw that, but wasn't quite sure how to use it. I didn't see anything in the
> PEBS stuff that looked like Data EAR. Or is this part of the Load Latency
> stuff that's described in (18.17.1.2)? Looks like part of the latency stuff
> includes a Data Address.
>>

No, I was indeed referring to offcore (which is different from uncore).
Yes, that's the load latency PEBS I was talking about. It does give
you cache miss information similar to the Itanium D-EAR: you get the
instruction and data addresses, the latency, and the source of the
data, in addition to the machine state, which is quite nice.

>> You missed one thing, however, the offcore_response feature. That one
>> is tricky because
>> it uses a register that is shared per core (if I recall).
>> Perfmon handles offcore_response similarly to what is going on with
>> AMD northbridge event.
>> It enforces some form of mutual exclusion.
>>
> Yes, the off-core response stuff can be coded into any of the generic
> registers on any core, but it shares a single common configuration register.
> Exclusion logic for this guy could be fun. It looks like this takes the
> place of the SELF/ANY modifiers used in earlier Core architectures for
> events that probed shared cache?
>>
Well, yes, this is tricky. The current code does the following:
   - only one system-wide session per physical core (each physical
     core has 2 threads)
   - only one per-thread session across the entire system (otherwise
     you have problems in case of migration).
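
To make the two rules concrete, here is a minimal sketch of such an
arbiter. It is a toy model, not the actual perfmon code: the class and
method names are hypothetical, and it assumes the two rules are checked
independently of each other, which the description above does not
specify.

```python
class OffcoreArbiter:
    """Toy model of the two exclusion rules (hypothetical, not perfmon code)."""

    def __init__(self):
        # physical core id -> id of the system-wide session that owns it
        self.syswide_owner = {}
        # at most one per-thread session in the entire system
        self.per_thread_owner = None

    def acquire_syswide(self, core, session):
        """Grant the shared register on `core` to the first system-wide asker."""
        if core in self.syswide_owner:
            return False
        self.syswide_owner[core] = session
        return True

    def acquire_per_thread(self, session):
        """Grant per-thread use only if no other per-thread session holds it."""
        if self.per_thread_owner is not None:
            return False
        self.per_thread_owner = session
        return True

    def release(self, session):
        """Drop every reservation held by `session`."""
        self.syswide_owner = {
            core: owner
            for core, owner in self.syswide_owner.items()
            if owner != session
        }
        if self.per_thread_owner == session:
            self.per_thread_owner = None
```

With this model, a second system-wide request on the same physical core
is refused, while a request on a different core succeeds. A second
per-thread request is refused anywhere in the system until the first
session is released.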

> Who owns the system-wide session? First-come first-served? Can it be any
> thread or must it be a specific core? And if you restrict access to counting
> (calipers), couldn't you do per-thread access without worrying about
> overflow?
>>
For uncore, the first system-wide session that asks for it gets it.
It can come from any core/thread on the socket.

>> > I'm not sure that uncore counters should be restricted to system-wide
>> > counting only; I think it could be quite useful, as Phil described for
>> > SiCortex, to measure "what's happening to this shared resource while I'm
>> > active". That's not unlike Component PAPI measuring network activity on
>> a
>>
Counting requires interrupts to virtualize the counters to 64 bits. They
are 48 bits wide, if I recall.
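
As an illustration of why an interrupt is needed, here is a minimal
sketch of virtualizing a 48-bit counter to 64 bits. It simulates the
hardware register in software; the names are hypothetical and this is
not how perfmon itself is written. The 48-bit width is taken from the
recollection above.

```python
HW_BITS = 48                      # assumed hardware counter width
HW_MASK = (1 << HW_BITS) - 1
U64_MASK = (1 << 64) - 1


class VirtualCounter:
    """Simulated 48-bit hardware counter extended to 64 bits in software."""

    def __init__(self):
        self.hw = 0    # simulated hardware register, wraps at 2**48
        self.soft = 0  # software-maintained part, updated on each overflow

    def hw_add(self, n):
        """Simulate the hardware counting n events, raising overflows on wrap."""
        total = self.hw + n
        self.hw = total & HW_MASK
        for _ in range(total >> HW_BITS):
            self.on_overflow()

    def on_overflow(self):
        """Overflow interrupt handler: account for one full hardware wrap."""
        self.soft = (self.soft + (1 << HW_BITS)) & U64_MASK

    def read(self):
        """Return the 64-bit virtual count."""
        return (self.soft + self.hw) & U64_MASK
```

Without the overflow handler, a wrap of the hardware register would be
silently lost and read() would return a value that is too small by a
multiple of 2**48.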


>> No, this is not yet supported. I think on x86, this is not that far off.
>>
> Could you do first-person monitoring in a parent thread and spawn a daughter
> thread to measure uncore stuff? Or maybe even fork a new process?
>>
No, this is currently restricted to system-wide sessions only.

_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel