Hi,

I've tried out the offcore and uncore counters on a Xeon E7-4850
(Westmere-EX) 4-socket server and compared the output of both
counters.

Here are the results:

perf stat --per-socket --interval-print 1000 -a \
 -e "uncore_mbox_0/event=bbox_cmds_read/" \
 -e "uncore_mbox_1/event=bbox_cmds_read/" \
 -e "cpu/config=0x5301b7,config1=0x40ff/" \
 -e "cpu/config=0x5301bb,config1=0x20ff/"
   -> config1=0x40ff: all accesses to local DRAM
   -> config1=0x20ff: all accesses to remote DRAM

taskset -c 10 ./stream.1000M.1000
   -> start stream with single thread on socket 1
   -> first touch policy allocates memory on socket 1

51.200567532 S0   1     51212  uncore_mbox_0/event=bbox_cmds_read/
51.200567532 S0   1     53875  uncore_mbox_1/event=bbox_cmds_read/
51.200567532 S0  20       930  cpu/config=0x5301b7,config1=0x40ff/
51.200567532 S0  20       256  cpu/config=0x5301bb,config1=0x20ff/
51.200567532 S1   1  35026588  uncore_mbox_0/event=bbox_cmds_read/
51.200567532 S1   1  35027264  uncore_mbox_1/event=bbox_cmds_read/
51.200567532 S1  20  70051225  cpu/config=0x5301b7,config1=0x40ff/
51.200567532 S1  20     94532  cpu/config=0x5301bb,config1=0x20ff/
51.200567532 S2   1      1100  uncore_mbox_0/event=bbox_cmds_read/
51.200567532 S2   1      1313  uncore_mbox_1/event=bbox_cmds_read/
51.200567532 S2  20       502  cpu/config=0x5301b7,config1=0x40ff/
51.200567532 S2  20       543  cpu/config=0x5301bb,config1=0x20ff/
51.200567532 S3   1      1837  uncore_mbox_0/event=bbox_cmds_read/
51.200567532 S3   1      1995  uncore_mbox_1/event=bbox_cmds_read/
51.200567532 S3  20       422  cpu/config=0x5301b7,config1=0x40ff/
51.200567532 S3  20       937  cpu/config=0x5301bb,config1=0x20ff/

Observation: uncore_mbox_0 + uncore_mbox_1 = offcore_response_0 (config1=0x40ff)

taskset -pc 20 $(pgrep stream.1000M.10)
   -> move process to socket 2

58.372255828 S0   1     34562  uncore_mbox_0/event=bbox_cmds_read/
58.372255828 S0   1     36453  uncore_mbox_1/event=bbox_cmds_read/
58.372255828 S0  20      1076  cpu/config=0x5301b7,config1=0x40ff/
58.372255828 S0  20       419  cpu/config=0x5301bb,config1=0x20ff/
58.372255828 S1   1  27712533  uncore_mbox_0/event=bbox_cmds_read/
58.372255828 S1   1  27713447  uncore_mbox_1/event=bbox_cmds_read/
58.372255828 S1  20        98  cpu/config=0x5301b7,config1=0x40ff/
58.372255828 S1  20       490  cpu/config=0x5301bb,config1=0x20ff/
58.372255828 S2   1     17692  uncore_mbox_0/event=bbox_cmds_read/
58.372255828 S2   1     18255  uncore_mbox_1/event=bbox_cmds_read/
58.372255828 S2  20     34914  cpu/config=0x5301b7,config1=0x40ff/
58.372255828 S2  20  55478954  cpu/config=0x5301bb,config1=0x20ff/
58.372255828 S3   1      1734  uncore_mbox_0/event=bbox_cmds_read/
58.372255828 S3   1      2057  uncore_mbox_1/event=bbox_cmds_read/
58.372255828 S3  20       407  cpu/config=0x5301b7,config1=0x40ff/
58.372255828 S3  20      1110  cpu/config=0x5301bb,config1=0x20ff/

Observation: uncore_mbox_0 + uncore_mbox_1 = offcore_response_1 (config1=0x20ff)
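
The two observations can be checked with a bit of arithmetic. The following is only a minimal sketch: the counts are copied from the perf stat output above, while the names `measurements` and `relative_difference` are made up for illustration.

```python
# Read counts copied from the perf stat output above.
# In both runs the memory lives on socket S1, so the mbox counts are taken from S1.
measurements = {
    "thread on S1, config1=0x40ff (local DRAM)": {
        "mbox": (35026588, 35027264),  # uncore_mbox_0, uncore_mbox_1 on S1
        "offcore": 70051225,           # cpu/config=0x5301b7,config1=0x40ff/ on S1
    },
    "thread on S2, config1=0x20ff (remote DRAM)": {
        "mbox": (27712533, 27713447),  # uncore_mbox_0, uncore_mbox_1 on S1
        "offcore": 55478954,           # cpu/config=0x5301bb,config1=0x20ff/ on S2
    },
}


def relative_difference(mbox_counts, offcore_count):
    """Return |sum(mbox) - offcore| / offcore."""
    return abs(sum(mbox_counts) - offcore_count) / offcore_count


results = {
    label: relative_difference(m["mbox"], m["offcore"])
    for label, m in measurements.items()
}

for label, diff in results.items():
    print(f"{label}: relative difference {diff:.4%}")
```

In both runs the sum of the two mbox counters and the offcore count differ by well under one percent.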


./check_events OFFCORE_RESPONSE_0:ANY_REQUEST:LOCAL_DRAM_AND_REMOTE_CACHE_HIT

Requested Event: OFFCORE_RESPONSE_0:ANY_REQUEST:LOCAL_DRAM_AND_REMOTE_CACHE_HIT
Actual    Event:
wsm_dp::OFFCORE_RESPONSE_0:DMND_DATA_RD:DMND_RFO:DMND_IFETCH:WB:PF_DATA_RD:PF_RFO:PF_IFETCH:OTHER:LOCAL_DRAM_AND_REMOTE_CACHE_HIT:k=1:u=1:e=0:i=0:c=0:t=0
PMU            : Intel Westmere DP
IDX            : 111149145
Codes          : 0x5301b7 0x10ff

./check_events OFFCORE_RESPONSE_0:ANY_REQUEST:REMOTE_DRAM

Requested Event: OFFCORE_RESPONSE_0:ANY_REQUEST:REMOTE_DRAM
Actual    Event:
wsm_dp::OFFCORE_RESPONSE_0:DMND_DATA_RD:DMND_RFO:DMND_IFETCH:WB:PF_DATA_RD:PF_RFO:PF_IFETCH:OTHER:REMOTE_DRAM:k=1:u=1:e=0:i=0:c=0:t=0
PMU            : Intel Westmere DP
IDX            : 111149145
Codes          : 0x5301b7 0x20ff
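
To make the codes easier to compare, here is a small sketch that splits them into fields. It assumes the usual IA32_PERFEVTSELx layout for config (event select in bits 0-7, unit mask in bits 8-15, flag bits above) and a request mask in bits 0-7 of config1 with the response bits in 8-15. The function names and the labels in `RESPONSE_LABELS` are illustrative only; the labels are simply taken from the annotations and the check_events output in this mail, not from an authoritative table.

```python
def decode_config(config):
    """Split a raw perf 'config' value using the IA32_PERFEVTSELx bit layout."""
    return {
        "event": config & 0xFF,
        "umask": (config >> 8) & 0xFF,
        "usr": (config >> 16) & 1,
        "os": (config >> 17) & 1,
        "edge": (config >> 18) & 1,
        "int": (config >> 20) & 1,
        "enable": (config >> 22) & 1,
        "inv": (config >> 23) & 1,
        "cmask": (config >> 24) & 0xFF,
    }


def decode_config1(config1):
    """Return (request mask, list of response bit numbers that are set)."""
    request = config1 & 0xFF
    response_bits = [bit for bit in range(8, 16) if config1 & (1 << bit)]
    return request, response_bits


# Labels as used in this mail (assumption, not taken from a reference table).
RESPONSE_LABELS = {
    12: "LOCAL_DRAM_AND_REMOTE_CACHE_HIT (libpfm4 wsm_dp)",
    13: "REMOTE_DRAM (libpfm4 wsm_dp)",
    14: "local DRAM (according to the measurement above)",
}

for config, config1 in [(0x5301B7, 0x40FF), (0x5301BB, 0x20FF), (0x5301B7, 0x10FF)]:
    fields = decode_config(config)
    request, bits = decode_config1(config1)
    labels = [RESPONSE_LABELS.get(bit, "unlabeled") for bit in bits]
    print(f"event=0x{fields['event']:02x} umask=0x{fields['umask']:02x} "
          f"request=0x{request:02x} response bits={bits} {labels}")
```

With this split, 0xb7 and 0xbb are the two offcore response event selects, and the three config1 values differ only in which single response bit (14, 13 or 12) is set.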


offcore_response_0 (config1=0x10ff, the libpfm4 encoding shown above) --> gives wrong results on Westmere-EX:

45.122619772 S0   1     47750  uncore_mbox_0/event=bbox_cmds_read/
45.122619772 S0   1     49549  uncore_mbox_1/event=bbox_cmds_read/
45.122619772 S0   1     47864  uncore_bbox_0/counter=0x1,event=0x1D/
45.122619772 S0   1     49504  uncore_bbox_1/counter=0x1,event=0x1D/
45.122619772 S0  20       212  cpu/config=0x5301b7,config1=0x10ff/
45.122619772 S0  20       290  cpu/config=0x5301bb,config1=0x20ff/
45.122619772 S1   1  37402338  uncore_mbox_0/event=bbox_cmds_read/
45.122619772 S1   1  37398016  uncore_mbox_1/event=bbox_cmds_read/
45.122619772 S1   1  37397916  uncore_bbox_0/counter=0x1,event=0x1D/
45.122619772 S1   1  37442759  uncore_bbox_1/counter=0x1,event=0x1D/
45.122619772 S1  20       574  cpu/config=0x5301b7,config1=0x10ff/
45.122619772 S1  20     85665  cpu/config=0x5301bb,config1=0x20ff/
45.122619772 S2   1      1382  uncore_mbox_0/event=bbox_cmds_read/
45.122619772 S2   1      1921  uncore_mbox_1/event=bbox_cmds_read/
45.122619772 S2   1      1385  uncore_bbox_0/counter=0x1,event=0x1D/
45.122619772 S2   1      1920  uncore_bbox_1/counter=0x1,event=0x1D/
45.122619772 S2  20       276  cpu/config=0x5301b7,config1=0x10ff/
45.122619772 S2  20      1108  cpu/config=0x5301bb,config1=0x20ff/
45.122619772 S3   1      1289  uncore_mbox_0/event=bbox_cmds_read/
45.122619772 S3   1      1367  uncore_mbox_1/event=bbox_cmds_read/
45.122619772 S3   1      1257  uncore_bbox_0/counter=0x1,event=0x1D/
45.122619772 S3   1      1335  uncore_bbox_1/counter=0x1,event=0x1D/
45.122619772 S3  20       258  cpu/config=0x5301b7,config1=0x10ff/
45.122619772 S3  20       665  cpu/config=0x5301bb,config1=0x20ff/
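
The comparison for this run can also be done mechanically. Below is a rough sketch that parses interval lines in the five-column form shown above and sums the counters per socket. `parse_interval_lines` is a made-up helper, the sample contains only the S1 lines from this run, and the third column is ignored.

```python
from collections import defaultdict

# S1 lines copied from the perf stat output above.
SAMPLE = """\
45.122619772 S1   1  37402338  uncore_mbox_0/event=bbox_cmds_read/
45.122619772 S1   1  37398016  uncore_mbox_1/event=bbox_cmds_read/
45.122619772 S1   1  37397916  uncore_bbox_0/counter=0x1,event=0x1D/
45.122619772 S1   1  37442759  uncore_bbox_1/counter=0x1,event=0x1D/
45.122619772 S1  20       574  cpu/config=0x5301b7,config1=0x10ff/
45.122619772 S1  20     85665  cpu/config=0x5301bb,config1=0x20ff/
"""


def parse_interval_lines(text):
    """Map socket -> {event string: count} for five-column interval lines."""
    counts = defaultdict(dict)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip anything that is not a counter line
        _timestamp, socket, _units, count, event = fields
        counts[socket][event] = int(count)
    return counts


counts = parse_interval_lines(SAMPLE)
s1 = counts["S1"]

mbox_reads = sum(v for k, v in s1.items() if k.startswith("uncore_mbox_"))
bbox_reads = sum(v for k, v in s1.items() if k.startswith("uncore_bbox_"))
offcore_0x10ff = s1["cpu/config=0x5301b7,config1=0x10ff/"]

print(f"mbox reads on S1:       {mbox_reads}")
print(f"bbox reads on S1:       {bbox_reads}")
print(f"offcore config1=0x10ff: {offcore_0x10ff}")
```

The mbox and bbox sums agree closely with each other, while the count for config1=0x10ff is several orders of magnitude smaller.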

Table 18-21, "MSR_OFFCORE_RSP_0 and MSR_OFFCORE_RSP_1 Bit Field
Definition", which describes Nehalem, seems to be valid for Westmere-EX
as well, so the libpfm4 encoding seems to be wrong?
(Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume
3 (3A, 3B & 3C): System Programming Guide - February 2014)

Best regards,
Andreas Hollmann

2014-02-24 14:31 GMT+01:00 Stephane Eranian <eran...@googlemail.com>:
> On Mon, Feb 24, 2014 at 2:03 PM, Manuel Selva <selva.man...@gmail.com> wrote:
>> Hi,
>>
>> Following my investigations, I found the following documentation for
>> Intel's VTune Amplifier tool:
>>
>> http://software.intel.com/sites/products/documentation/doclib/stdxe/2013/~amplifierxe/pmw_dp/events/offcore_response.html
>>
> Yes, that's another good source of information.
>
>> In this document, bit 14 is not assigned any meaning, and bit 12
>> is defined as remote cache forward AND local RAM accesses.
>>
>> Based on this document, the libpfm showevtinfo output and my
>> experiments, I conclude that the Intel documentation is wrong and
>> that offcore response events can only count local RAM accesses and
>> remote cache accesses together. It is impossible to count these
>> events separately. This idea is supported by the existence of a core
>> event named MEM_UNCORE_RETIRED.LOCAL_DRAM_AND_REMOTE_CACHE_HIT and the
>> absence of separate events.
>>
> I believe your conclusion is correct AFAIR. On Westmere, you cannot measure
> those events separately. You'd want to try on IvyBridge-EP (IvyTown), I think.
>
>> Nevertheless, I was not able to confirm this hypothesis from
>> official Intel documentation and was wondering where you (libpfm
>> author) got the information to write your library.
>>
> Waiting for an official answer from them as well.
>
> _______________________________________________
> perfmon2-devel mailing list
> perfmon2-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
