This is great, Pierre!

I recommend the following documentation, http://www.gem5.org/contributing,
which will guide you through the contributing process. If you have any other
questions, don’t hesitate to ask.

About the issue you are encountering: I am not that familiar with perf_events,
but I think you can debug what is going on during PMU probing via the
PMUVerbose flag in gem5 (it will tell you which register is being
read/written) or via gdb.

Can I ask you to open a ticket in our bug tracker for this anyway? 
https://gem5.atlassian.net/secure/BrowseProjects.jspa

Many thanks

Giacomo

From: Pierre Ayoub <pierre.ay...@irisa.fr>
Sent: 30 September 2020 12:37
To: Giacomo Travaglini <giacomo.travagl...@arm.com>
Cc: gem5-users <gem5-users@gem5.org>
Subject: Re: Using perf_event with the ARM PMU inside gem5 on Linux

Hi Giacomo,

Many thanks. This time it works fine, and I feel that I really understand how
the DTB, the GIC and the gem5 code interact! After correctly declaring
the PMU in the DTB like you did, we get this confirmation at boot time that
the Linux kernel sees it:

[    0.239967] hw perfevents: enabled with armv8_pmuv3 PMU driver, 32 counters available

Just one thing. On my real ARM hardware, I used perf_event with the
PERF_TYPE_HARDWARE event type. This does not work on my gem5
simulated system: perf_event was not able to establish a correspondence between
gem5 events and architectural events, even though the event numbers are the
same and correspond to the ARMv8 specification. I don't know the reason. Thus,
the workaround is to use the PERF_TYPE_RAW event type and to put the event IDs
that are declared in the ArmPMU.py file directly in our C source code.
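
For readers who want to see what the raw-event workaround amounts to, here is a minimal Python sketch (not my actual C code) that packs the first 64 bytes of a perf_event_attr with type PERF_TYPE_RAW and an ARMv8 event number in the config field. The helper name raw_event_attr is hypothetical, the layout assumes the original 64-byte (VER0) structure on a little-endian target, and the two event numbers are the architectural ones; whether gem5 wires up a given event depends on ArmPMU.py.

```python
import struct

# Constants from <linux/perf_event.h>.
PERF_TYPE_HARDWARE = 0
PERF_TYPE_RAW = 4
PERF_ATTR_SIZE_VER0 = 64  # size of the first published perf_event_attr layout

# ARMv8 PMUv3 architectural event numbers (used as the raw 'config' value).
L1I_TLB_REFILL = 0x02
BR_MIS_PRED = 0x10


def raw_event_attr(event_id, disabled=True):
    """Pack a minimal VER0 perf_event_attr that selects a raw PMU event.

    Hypothetical helper for illustration only. Field order: type, size,
    config, sample_period, sample_type, read_format, flag bits,
    wakeup_events, bp_type, config1.
    """
    # Assumption: little-endian target, where 'disabled' is bit 0 of the
    # 64-bit flag bitfield.
    flags = 1 if disabled else 0
    return struct.pack(
        "=IIQQQQQIIQ",
        PERF_TYPE_RAW, PERF_ATTR_SIZE_VER0, event_id,
        0, 0, 0, flags, 0, 0, 0)


attr = raw_event_attr(BR_MIS_PRED)
print(len(attr), attr[:16].hex())
```

In C, the equivalent is setting attr.type = PERF_TYPE_RAW and attr.config to the event number before calling perf_event_open().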

I will see how to send patches and learn how to use gerrit. Thanks for your 
help.

Best,
Pierre

________________________________
From: "Giacomo Travaglini" <giacomo.travagl...@arm.com>
To: "gem5-users" <gem5-users@gem5.org>
Cc: "Pierre Ayoub" <pierre.ay...@irisa.fr>
Sent: Tuesday 29 September 2020 22:22:15
Subject: RE: Using perf_event with the ARM PMU inside gem5 on Linux

Hey Pierre,

You are actually very close to getting it right! The problem is that there
should be a single PMU instantiation.

What you need to do in the BaseCPU is:

        # Generate nodes from the BaseCPU children.
        # Please note: this is mainly needed for the ISA class
        for node in self.recurseDeviceTree(state):
            yield node

Please feel free to push these BaseCPU and ArmISA changes as separate patches to
gerrit if you want (I have implemented it in the same way). I will post the PMU
one (it is similar to what you are doing, but I have done some other refactoring).

Another thing: you are using PPIs for the PMU (good).
PPIs are per-CPU interrupts; since they are local to a PE, there is no need for
a different PPI number per core (and the GIC/PMU driver might actually complain).

So rather than doing:

ints = [20, 21, 22, 23]

You should do something like this (example):

ints = [22, 22, 22, 22]
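
As a small standalone sketch of the same idea (the names NUM_CPUS and PMU_PPI are hypothetical, and 22 is only the example value used above), the list can be built from a single PPI number so that all cores are guaranteed to share it:

```python
# Hypothetical values for illustration: 4 cores, PPI 22 as in the example.
NUM_CPUS = 4
PMU_PPI = 22

# One entry per core, all using the same per-CPU interrupt number.
ints = [PMU_PPI] * NUM_CPUS

# Sanity checks: a single shared number, inside the GIC PPI range (16..31).
assert len(set(ints)) == 1
assert all(16 <= n < 32 for n in ints)

print(ints)
```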

Kind Regards

Giacomo


From: Pierre Ayoub via gem5-users <gem5-users@gem5.org>
Sent: 29 September 2020 18:05
To: Giacomo Travaglini <giacomo.travagl...@arm.com>
Cc: gem5-users <gem5-users@gem5.org>; Pierre Ayoub <pierre.ay...@irisa.fr>
Subject: [gem5-users] Re: Using perf_event with the ARM PMU inside gem5 on Linux

Hi Giacomo,

Thank you for your reply. Your hint about the DTB gave me a great starting
point for researching it and its role between the Linux
kernel and the ARM PMU. I thought that I would be able to fix this myself by
studying how gem5 generates the DTB and how the PMU is declared in a DTB.
However, although I learned a lot, I was wrong.

In my system script, I declare and attach a PMU like this:
            ints = [20, 21, 22, 23]
            assert len(ints) == len(system.cpu_cluster.cpus)
            for cpu, pint in zip(system.cpu_cluster.cpus, ints):
                for isa in cpu.isa:
                    isa.pmu = ArmPMU(interrupt=ArmPPI(num=pint))
                    isa.pmu.addArchEvents(
                        cpu=cpu, dtb=cpu.dtb, itb=cpu.itb,
                        icache=getattr(cpu, "icache", None),
                        dcache=getattr(cpu, "dcache", None),
                        l2cache=getattr(system.cpu_cluster, "l2", None))

And I applied this patch to gem5:
    diff --git i/src/arch/arm/ArmISA.py w/src/arch/arm/ArmISA.py
    index 2641ec3fb..3d85c1b75 100644
    --- i/src/arch/arm/ArmISA.py
    +++ w/src/arch/arm/ArmISA.py
    @@ -36,6 +36,7 @@
     from m5.params import *
     from m5.proxy import *

    +from m5.SimObject import SimObject
     from m5.objects.ArmPMU import ArmPMU
     from m5.objects.ArmSystem import SveVectorLength
     from m5.objects.BaseISA import BaseISA
    @@ -49,6 +50,8 @@ class ArmISA(BaseISA):
         cxx_class = 'ArmISA::ISA'
         cxx_header = "arch/arm/isa.hh"

    +    generateDeviceTree = SimObject.recurseDeviceTree
    +
         system = Param.System(Parent.any, "System this ISA object belongs to")

         pmu = Param.ArmPMU(NULL, "Performance Monitoring Unit")
    diff --git i/src/arch/arm/ArmPMU.py w/src/arch/arm/ArmPMU.py
    index 047e908b3..58553fbf9 100644
    --- i/src/arch/arm/ArmPMU.py
    +++ w/src/arch/arm/ArmPMU.py
    @@ -40,6 +40,7 @@ from m5.params import *
     from m5.params import isNullPointer
     from m5.proxy import *
     from m5.objects.Gic import ArmInterruptPin
    +from m5.util.fdthelper import *

     class ProbeEvent(object):
         def __init__(self, pmu, _eventId, obj, *listOfNames):
    @@ -76,6 +77,17 @@ class ArmPMU(SimObject):

         _events = None

    +    def generateDeviceTree(self, state):
    +        node = FdtNode("pmu")
    +        node.appendCompatible("arm,armv8-pmuv3")
    +        # gem5 uses GIC controller interrupt notation, where PPI interrupts
    +        # start at 16. However, the Linux kernel starts from 0 and uses a
    +        # tag (set to 1) to indicate the PPI interrupt type.
    +        node.append(FdtPropertyWords("interrupts", [
    +            1, int(self.interrupt.num) - 16, 0xf04
    +        ]))
    +        yield node
    +
         def addEvent(self, newObject):
             if not (isinstance(newObject, ProbeEvent)
                 or isinstance(newObject, SoftwareIncrement)):
    diff --git i/src/cpu/BaseCPU.py w/src/cpu/BaseCPU.py
    index ab70d1d7f..e5d0ed3dd 100644
    --- i/src/cpu/BaseCPU.py
    +++ w/src/cpu/BaseCPU.py
    @@ -302,6 +302,9 @@ class BaseCPU(ClockedObject):
                 node.appendPhandle(phandle_key)
                 cpus_node.append(node)

    +        for subnode in self.recurseDeviceTree(state):
    +            node.append(subnode)
    +
             yield cpus_node

         def __init__(self, **kwargs):
I end up with a DTB with this:
                pmu {
                    compatible = "arm,armv8-pmuv3";
                    interrupts = <0x01 0x04 0xf04>;
                };
                pmu {
                    compatible = "arm,armv8-pmuv3";
                    interrupts = <0x01 0x05 0xf04>;
                };
                pmu {
                    compatible = "arm,armv8-pmuv3";
                    interrupts = <0x01 0x06 0xf04>;
                };
                pmu {
                    compatible = "arm,armv8-pmuv3";
                    interrupts = <0x01 0x07 0xf04>;
                };
That is one PMU declaration per core. However, it does not work. I don't even
know if this kind of declaration is correct; maybe we have to declare the PMU
once for all cores instead of once per core?
Note that the kernel configuration (in /proc/config.gz) is correct for
initializing perf_event.
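
To make the numbering in the DTB above explicit, here is a minimal Python sketch of the conversion done in my generateDeviceTree() patch. The function name ppi_to_fdt_cells is hypothetical; the three cells follow the usual GIC interrupt specifier (type, number relative to the start of the PPI range, flags), and 0xf04 is the flags value from the patch (CPU mask 0xf, active-high level trigger).

```python
PPI_BASE = 16      # first PPI interrupt ID in the GIC numbering used by gem5
GIC_PPI = 1        # first specifier cell: 1 selects a PPI (0 would be an SPI)
IRQ_FLAGS = 0xf04  # bits 15:8 = CPU mask 0xf, bits 3:0 = 4 (active-high level)


def ppi_to_fdt_cells(gic_intid, flags=IRQ_FLAGS):
    """Convert a GIC PPI interrupt ID (16..31) to device-tree 'interrupts' cells.

    Hypothetical helper for illustration; it mirrors the arithmetic in the
    patch above rather than any gem5 API.
    """
    if not 16 <= gic_intid < 32:
        raise ValueError("not a PPI interrupt ID: %d" % gic_intid)
    return [GIC_PPI, gic_intid - PPI_BASE, flags]


for intid in (20, 21, 22, 23):
    cells = ppi_to_fdt_cells(intid)
    print("interrupts = <%s>;" % " ".join(hex(c) for c in cells))
```

For PPIs 20 to 23 this gives the second cell values 4 to 7 seen in the four pmu nodes.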

Many thanks if you help me, and many thanks also if you post a patch in the 
future.

Best,
Pierre

________________________________
From: "Giacomo Travaglini" <giacomo.travagl...@arm.com>
To: "gem5-users" <gem5-users@gem5.org>
Cc: "Pierre Ayoub" <pierre.ay...@irisa.fr>
Sent: Thursday 24 September 2020 12:09:17
Subject: RE: Using perf_event with the ARM PMU inside gem5 on Linux

Hi Pierre,

First of all, many thanks for explaining your problem in detail. This is
very helpful.

The reason why you are not able to use perf_events is probably that the
kernel is not aware of the presence of PMUs. This is usually communicated to
Linux via the DTB. I can see that we are not enabling DTB autogen for the ArmPMU.

I will post a patch.

Kind Regards

Giacomo

From: Pierre Ayoub via gem5-users <gem5-users@gem5.org>
Sent: 23 September 2020 08:45
To: gem5-users@gem5.org
Cc: Pierre Ayoub <pierre.ay...@irisa.fr>
Subject: [gem5-users] Using perf_event with the ARM PMU inside gem5 on Linux


Hi gem5 users,

TL;DR:
------

I know that the ARM PMU is partially implemented, thanks to the gem5 source
code and some publications. I have a binary which uses perf_event to access the
PMU on a Linux-based OS, under an ARM processor, on real hardware. Could it use
perf_event inside a gem5 full-system simulation with a Linux kernel, under the
ARM ISA? So far, I haven't found the right way to do it. If someone knows, I
will be very grateful!

Detailed information:
---------------------

I have a binary (developed by myself) which uses perf_event on real ARM
hardware to get cache misses and mispredicted branches, and it works well. My
"perf_event_attr.type" is configured with "PERF_TYPE_HARDWARE" and the
".config" field with "PERF_COUNT_HW_CACHE_MISSES" and another with
"PERF_COUNT_HW_BRANCH_MISSES". However, when I put this binary in a gem5 FS
simulation, configured with the DerivO3CPU, ArmSystem, and RealView platform, I
got the following error:

"ENOENT (2): No such file or directory"

The perf_event file descriptor is not created by the kernel (it is equal to
-1). To be precise, this error is returned by the perf_event_open()
syscall. This error is documented in the perf_event_open.2 manpage,
and also discussed here. However, it didn't help me to understand the error
in the context of gem5.

I don't know if we can access the PMU through perf_event in gem5. If so,
maybe we have to use RAW events? (i.e., do you know if perf_event is supposed
to be initialized with PERF_TYPE_HARDWARE or PERF_TYPE_RAW, to be used with
gem5?) In the gem5 example code under configs, I have found a snippet in
devices.py which "Instantiates 1 ArmPMU per PE" (addPMUs()). However, after a
few tries, I don't understand how to use this correctly and how it is related
to perf_event.

I used code similar to addPMUs() in devices.py, with PPI interrupt numbers
20, 21, 22, and 23 (one per core) according to the RealView interrupt mapping,
with the ArmPPI class. However, perf_event_open() still returns the same
error. Note also that I got this message during boot:

src/arch/arm/pmu.cc:293: warn: Not doing anything for write to miscreg pmuserenr_el0.

This register is documented in the ARMv8-A architecture manual. I have checked
the pmu.cc file, and saw that writing to this register is not implemented (TODO
state). Normally, it should not be a problem since this register allows (when
set to 1) userland access to the PMU, which we don't want because I want to
access it through the Linux kernel perf_event interface.


With --debug-flags=PMUVerbose, I get the following:

0: system.cpu_cluster.cpus0.isa.pmu: Initializing the PMU.
[...]
0: system.cpu_cluster.cpus0.isa.pmu: PMU: Adding Probe Driven event with id '0x2'as probe system.cpu_cluster.cpus0.itb:Refills
[...]
8687351673751: system.cpu_cluster.cpus0.isa.pmu: Assigning PMU to ContextID 0.
[...]
8687351673751: system.cpu_cluster.cpus0.isa.pmu: updateCounter(31): Disabling counter
[...]
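
In case it helps anyone going through a long trace like this, here is a minimal Python sketch that pulls the registered event IDs out of such output. It assumes each trace record sits on a single line in the "tick: object: message" form shown above; the function name probe_events and both regular expressions are my own, not part of gem5.

```python
import re

# "tick: sim.object.path: message", as in the trace excerpt above.
TRACE_RE = re.compile(r"^(?P<tick>\d+): (?P<obj>[\w.]+): (?P<msg>.*)$")
# Matches the "Adding Probe Driven event" message; the space before "as"
# is optional because the trace prints "'0x2'as probe".
EVENT_RE = re.compile(r"event with id '(0x[0-9a-fA-F]+)'\s*as probe (\S+)")


def probe_events(lines):
    """Return (event_id, probe_name) pairs for every probe-driven PMU event."""
    events = []
    for line in lines:
        record = TRACE_RE.match(line.strip())
        if record is None:
            continue  # skips "[...]" and any other non-trace line
        found = EVENT_RE.search(record.group("msg"))
        if found:
            events.append((int(found.group(1), 16), found.group(2)))
    return events


sample = [
    "0: system.cpu_cluster.cpus0.isa.pmu: Initializing the PMU.",
    "[...]",
    "0: system.cpu_cluster.cpus0.isa.pmu: PMU: Adding Probe Driven event "
    "with id '0x2'as probe system.cpu_cluster.cpus0.itb:Refills",
]
print(probe_events(sample))
```

The resulting IDs are the values that would go in the config field of a raw perf event.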

Now, you know all I know about this issue!
IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.
_______________________________________________
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org