Hi Andre, you could take a look at the offcore response counters. These counters are per CPU core and offer the possibility of filtering for certain events.
I cannot tell you if they are really suited for your needs, but you could give them a try. The best documentation on offcore counters that I know of is this: http://software.intel.com/sites/products/collateral/hpc/vtune/performance_analysis_guide.pdf and the Intel SDM Volume 3: https://www-ssl.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-system-programming-manual-325384.pdf

I would also advise you to use libpfm4 to translate event names like OFFCORE_RESPONSE_0:ANY_REQUEST:REMOTE_DRAM into raw events for perf. Here is an overview of how to use libpfm4 with perf: http://www.bnikolic.co.uk/blog/hpc-prof-events.html It doesn't cover the use of offcore counters, but here is one example:

$ check_events OFFCORE_RESPONSE_0:ANY_DATA:REMOTE_DRAM
Supported PMU models:
	[7, netburst, "Pentium4"]
	[8, netburst_p, "Pentium4 (Prescott)"]
	[11, core, "Intel Core"]
	[14, atom, "Intel Atom"]
	[15, nhm, "Intel Nehalem"]
	[16, nhm_ex, "Intel Nehalem EX"]
	[17, nhm_unc, "Intel Nehalem uncore"]
	[18, ix86arch, "Intel X86 architectural PMU"]
	[51, perf, "perf_events generic PMU"]
	[52, wsm, "Intel Westmere (single-socket)"]
	[53, wsm_dp, "Intel Westmere DP"]
	[54, wsm_unc, "Intel Westmere uncore"]
	[55, amd64_k7, "AMD64 K7"]
	[56, amd64_k8_revb, "AMD64 K8 RevB"]
	[57, amd64_k8_revc, "AMD64 K8 RevC"]
	[58, amd64_k8_revd, "AMD64 K8 RevD"]
	[59, amd64_k8_reve, "AMD64 K8 RevE"]
	[60, amd64_k8_revf, "AMD64 K8 RevF"]
	[61, amd64_k8_revg, "AMD64 K8 RevG"]
	[62, amd64_fam10h_barcelona, "AMD64 Fam10h Barcelona"]
	[63, amd64_fam10h_shanghai, "AMD64 Fam10h Shanghai"]
	[64, amd64_fam10h_istanbul, "AMD64 Fam10h Istanbul"]
	[68, snb, "Intel Sandy Bridge"]
	[69, amd64_fam14h_bobcat, "AMD64 Fam14h Bobcat"]
	[70, amd64_fam15h_interlagos, "AMD64 Fam15h Interlagos"]
	[71, snb_ep, "Intel Sandy Bridge EP"]
	[72, amd64_fam12h_llano, "AMD64 Fam12h Llano"]
	[73, amd64_fam11h_turion, "AMD64 Fam11h Turion"]
	[74, ivb, "Intel Ivy Bridge"]
	[76, snb_unc_cbo0, "Intel Sandy Bridge C-box0 uncore"]
	[77, snb_unc_cbo1, "Intel Sandy Bridge C-box1 uncore"]
	[78, snb_unc_cbo2, "Intel Sandy Bridge C-box2 uncore"]
	[79, snb_unc_cbo3, "Intel Sandy Bridge C-box3 uncore"]
	[80, snbep_unc_cbo0, "Intel Sandy Bridge-EP C-Box 0 uncore"]
	[81, snbep_unc_cbo1, "Intel Sandy Bridge-EP C-Box 1 uncore"]
	[82, snbep_unc_cbo2, "Intel Sandy Bridge-EP C-Box 2 uncore"]
	[83, snbep_unc_cbo3, "Intel Sandy Bridge-EP C-Box 3 uncore"]
	[84, snbep_unc_cbo4, "Intel Sandy Bridge-EP C-Box 4 uncore"]
	[85, snbep_unc_cbo5, "Intel Sandy Bridge-EP C-Box 5 uncore"]
	[86, snbep_unc_cbo6, "Intel Sandy Bridge-EP C-Box 6 uncore"]
	[87, snbep_unc_cbo7, "Intel Sandy Bridge-EP C-Box 7 uncore"]
	[88, snbep_unc_ha, "Intel Sandy Bridge-EP HA uncore"]
	[89, snbep_unc_imc0, "Intel Sandy Bridge-EP IMC0 uncore"]
	[90, snbep_unc_imc1, "Intel Sandy Bridge-EP IMC1 uncore"]
	[91, snbep_unc_imc2, "Intel Sandy Bridge-EP IMC2 uncore"]
	[92, snbep_unc_imc3, "Intel Sandy Bridge-EP IMC3 uncore"]
	[93, snbep_unc_pcu, "Intel Sandy Bridge-EP PCU uncore"]
	[94, snbep_unc_qpi0, "Intel Sandy Bridge-EP QPI0 uncore"]
	[95, snbep_unc_qpi1, "Intel Sandy Bridge-EP QPI1 uncore"]
	[96, snbep_unc_ubo, "Intel Sandy Bridge-EP U-Box uncore"]
	[97, snbep_unc_r2pcie, "Intel Sandy Bridge-EP R2PCIe uncore"]
	[98, snbep_unc_r3qpi0, "Intel Sandy Bridge-EP R3QPI0 uncore"]
	[99, snbep_unc_r3qpi1, "Intel Sandy Bridge-EP R3QPI1 uncore"]
	[100, knc, "Intel Knights Corner"]
	[103, ivb_ep, "Intel Ivy Bridge EP"]
	[104, hsw, "Intel Haswell"]
	[105, ivb_unc_cbo0, "Intel Ivy Bridge C-box0 uncore"]
	[106, ivb_unc_cbo1, "Intel Ivy Bridge C-box1 uncore"]
	[107, ivb_unc_cbo2, "Intel Ivy Bridge C-box2 uncore"]
	[108, ivb_unc_cbo3, "Intel Ivy Bridge C-box3 uncore"]
Detected PMU models:
	[18, ix86arch, "Intel X86 architectural PMU"]
	[51, perf, "perf_events generic PMU"]
	[53, wsm_dp, "Intel Westmere DP"]
Total events: 3042 available, 177 supported
Requested Event: OFFCORE_RESPONSE_0:ANY_DATA:REMOTE_DRAM
Actual Event: wsm_dp::OFFCORE_RESPONSE_0:DMND_DATA_RD:DMND_RFO:PF_DATA_RD:PF_RFO:REMOTE_DRAM:k=1:u=1:e=0:i=0:c=0:t=0
PMU   : Intel Westmere DP
IDX   : 111149145
Codes : 0x5301b7 0x2033   <--- these codes are the configs for using the event with perf

Now you can use these offcore counters with perf stat; the first number goes into config, the second into config1:

[hollmann@inwest format]$ perf stat -e cpu/config=0x5301b7,config1=0x2033,name=Remote_DRAM_Accesses/ ls
any  cmask  edge  event  inv  ldlat  offcore_rsp  pc  umask

 Performance counter stats for 'ls':

                46 Remote_DRAM_Accesses

       0.001096052 seconds time elapsed

showevtinfo offcore returns the line

	Umask-20 : 0x8000 : PMU : [NON_DRAM] : None : Response: Non-DRAM requests that were serviced by IOH

which could be useful in your case.

Best regards,
Andreas

2014/1/15 Andre Richter <andre.o.rich...@gmail.com>:
> Hello everyone,
>
> I am currently fiddling around with performance monitoring on a Xeon
> machine with an E5-2600 series CPU.
>
> From what I understand from Intel's uncore performance monitoring guide and
> from Andi Kleen's READMEs for his super useful pmu-tools, it is not
> possible to differentiate PCIe traffic per core when using the uncore
> PMUs (the CBo boxes, to be precise). It is only possible per socket.
>
> I wonder, however, if it would be possible to at least make an
> educated guess of which core is producing how much PCIe or MMIO
> traffic.
> Maybe by cross-correlating with some PMU events from core-resident PMUs?
> The amount of PMU events and configuration options is quite
> overwhelming, that's why I would appreciate any hints I can get :)
>
> Cheers,
> Andre
> --
> To unsubscribe from this list: send the line "unsubscribe linux-perf-users" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html