Hi All,
This proposal defines the qemu-kvm interface for reserving the Performance
Monitor for one or more guests on Embedded Power Architecture. The plan is to
reserve the Performance Monitor for guests at guest boot-up and to release it
only on guest reset/exit. The reservation is mutually exclusive: if the
Performance Monitor is reserved by the host, it cannot be reserved for guests,
and while it is reserved for guests, it must not be used by the host.
Note: This proposal is not about partitioning the performance monitor counters
between host and guest. That would require substantial work in the existing
performance monitor code in Linux, so it may be planned as future work.
1) Add a QEMU command line parameter (for example, --reserve-perfmon).
2) If the above command line parameter is present, QEMU will make
an IOCTL call to reserve the PM for the guest.
Define a new IOCTL (KVM_RESERVE_PERFMON) for this.
3) KVM will reserve the PerfMon.
There is no per-core reservation on the powerpc platform, and I feel that
it would not make much sense either. The reservation call is:
int reserve_pmc_hardware(perf_irq_t new_perf_irq)
This reservation just attaches the new interrupt handler passed in the call.
So passing NULL or an empty function will work, because the interrupt is not
going to be taken by the host in any case. The only effect is to make a host
perfmon reservation fail, so that the host does not use it.
To support multiple guests, KVM will reserve the Performance Monitor only
once and keep a reference count. On each subsequent reserve request it will
only increment the reference count.
On successful reservation of the Performance Monitor: emulate the
Performance Monitor registers and interrupts for the guest and return SUCCESS.
If the Performance Monitor reservation fails: writes to Performance
Monitor registers are ignored and reads are boundedly undefined.
4) On SUCCESS: QEMU will add the "power-isa-e.pm" property to all cpu
nodes of the guest device tree.
On failure: nothing.
5) On guest exit: decrement the reference count and release the
reservation when the reference count reaches zero (no more guests using PerfMon).
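To make steps 3 and 5 concrete, here is a rough userspace sketch of the
reference counting. It is only a model under assumptions:
kvm_reserve_perfmon(), kvm_release_perfmon() and kvm_dummy_perf_irq() are
hypothetical names, the reserve_pmc_hardware()/release_pmc_hardware() bodies
are simplified mocks rather than the real kernel code, and locking is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct pt_regs;                                  /* opaque in this sketch */
typedef void (*perf_irq_t)(struct pt_regs *);

/* Simplified mock of host-side PMC ownership (not the real pmc.c code). */
static int pmc_owned;

int reserve_pmc_hardware(perf_irq_t new_perf_irq)
{
    (void)new_perf_irq;                          /* handler is ignored by the mock */
    if (pmc_owned)
        return -EBUSY;                           /* somebody already holds the PMC */
    pmc_owned = 1;
    return 0;
}

void release_pmc_hardware(void)
{
    pmc_owned = 0;
}

/* Hypothetical KVM side: one host reservation shared by all guests. */
static int kvm_perfmon_refcount;

/* Empty handler: the host is never expected to take this interrupt. */
static void kvm_dummy_perf_irq(struct pt_regs *regs)
{
    (void)regs;
}

/* Called once per guest that asks for the PerfMon (hypothetical name). */
int kvm_reserve_perfmon(void)
{
    if (kvm_perfmon_refcount == 0) {
        int ret = reserve_pmc_hardware(kvm_dummy_perf_irq);
        if (ret)
            return ret;                          /* host owns it: reservation fails */
    }
    kvm_perfmon_refcount++;
    return 0;
}

/* Called on guest reset/exit; a real implementation would need a lock. */
void kvm_release_perfmon(void)
{
    assert(kvm_perfmon_refcount > 0);
    if (--kvm_perfmon_refcount == 0)
        release_pmc_hardware();                  /* last guest gone */
}
```

With this model, a host reservation attempt fails while at least one guest
holds a reference and succeeds again once the last guest has exited.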
Kernel/Guest:
Ideally the kernel should try to reserve the Performance Monitor only if the
"power-isa-e.pm" property is present in the device tree:
reserve_pmc_hardware() in arch/powerpc/kernel/pmc.c should fail if this
property does not exist. But adding this property would require changing all
existing device trees.
To solve this problem we can change u-boot to always add this property
to the cpus node, so that the Performance Monitor is always available to the
host kernel (if not reserved by a guest), while QEMU patches the guest device
tree as explained above to allow the guest to use it.
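A minimal sketch of the property check on the kernel side could look like the
following. It is a userspace model only: struct dt_node, dt_node_has_prop()
and reserve_pm_if_advertised() are hypothetical names, the node is reduced to
a list of property names, and -ENODEV is just one possible error code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PM_PROP "power-isa-e.pm"

/* Hypothetical, simplified device tree node: only its property names. */
struct dt_node {
    const char *const *props;
    size_t nprops;
};

/* Return true if the node carries a property with the given name. */
bool dt_node_has_prop(const struct dt_node *node, const char *name)
{
    assert(node != NULL && name != NULL);
    for (size_t i = 0; i < node->nprops; i++)
        if (strcmp(node->props[i], name) == 0)
            return true;
    return false;
}

/*
 * Refuse the reservation when the property is missing. The error code is
 * an assumption; the proposal only says the call "should fail".
 */
int reserve_pm_if_advertised(const struct dt_node *cpus)
{
    if (!dt_node_has_prop(cpus, PM_PROP))
        return -ENODEV;
    /* The actual PMC reservation would follow here. */
    return 0;
}
```

A guest whose device tree was not patched by QEMU would take the -ENODEV path
and never touch the Performance Monitor registers.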
Thanks
-Bharat