pmc_reprogram_counter() always sets a sample period based on the value of
pmc->counter. However, for in_tx_cp events hsw_hw_config() rejects nonzero
sample periods less than 2^31 - 1. For example, a KVM guest running
  perf stat -e r2005101c4 sleep 0
counts some conditional branch events, deschedules the task, reschedules
it, and tries to restore the guest PMU state for the task. The host then
reaches pmc_reprogram_counter() with a nonzero pmc->counter, hits
EOPNOTSUPP in hsw_hw_config(), prints "kvm_pmu: event creation failed" in
pmc_reprogram_counter(), and silently (from the guest's point of view)
stops counting events.

Fix event counting by forcing attr.sample_period to always be zero for
in_tx_cp counters. Sampling still does not work for these counters, but it
did not work before this change either, and it can't be fixed without
major changes to the approach in hsw_hw_config().

Signed-off-by: Robert O'Callahan <[email protected]>
---
 arch/x86/kvm/pmu.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)
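
For reviewers, the arithmetic behind the failure can be sketched in
standalone C. This is only an illustration, not kernel code: the 48-bit
counter width and every sketch_* name are assumptions made for the
example, and sketch_period_rejected() is a simplified stand-in for the
check in hsw_hw_config() as described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed counter width; the kernel takes the real mask from pmc_bitmask(pmc). */
#define SKETCH_COUNTER_BITS 48
#define SKETCH_COUNTER_MASK ((UINT64_C(1) << SKETCH_COUNTER_BITS) - 1)

/* Same arithmetic as the patch: events remaining until the counter wraps. */
static uint64_t sketch_sample_period(uint64_t counter)
{
	return (-counter) & SKETCH_COUNTER_MASK;
}

/*
 * Hypothetical stand-in for the host-side check: a checkpointed event
 * with a nonzero period below 2^31 - 1 is refused.
 */
static bool sketch_period_rejected(uint64_t period, bool in_tx_cp)
{
	return in_tx_cp && period > 0 && period < UINT64_C(0x7fffffff);
}

/* Period selection after this patch: in_tx_cp counters always get zero. */
static uint64_t sketch_patched_period(uint64_t counter, bool in_tx_cp)
{
	return in_tx_cp ? 0 : sketch_sample_period(counter);
}
```

Assuming the guest first programs the counter 2^31 - 1 events short of
overflow, sketch_sample_period() yields exactly 0x7fffffff, which the
check accepts. Once the guest has counted five events, the restored
counter yields 0x7ffffffa, which the check rejects. With
sketch_patched_period() the period for an in_tx_cp counter is 0 and the
check passes.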

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 06ce377..af993d7 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -113,12 +113,18 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
                .config = config,
        };
 
+       attr.sample_period = (-pmc->counter) & pmc_bitmask(pmc);
+
        if (in_tx)
                attr.config |= HSW_IN_TX;
-       if (in_tx_cp)
+       if (in_tx_cp) {
+               /* HSW_IN_TX_CHECKPOINTED is not supported with nonzero
+                * period. Just clear the sample period so at least
+                * allocating the counter doesn't fail.
+                */
+               attr.sample_period = 0;
                attr.config |= HSW_IN_TX_CHECKPOINTED;
-
-       attr.sample_period = (-pmc->counter) & pmc_bitmask(pmc);
+       }
 
        event = perf_event_create_kernel_counter(&attr, -1, current,
                                                 intr ? kvm_perf_overflow_intr :
-- 
2.9.3
