Re: [PATCH 3/3] KVM: x86/vPMU: Add lazy mechanism to release perf_event per vPMC

2019-10-09 Thread Peter Zijlstra
On Wed, Oct 09, 2019 at 11:21:30AM +0200, Paolo Bonzini wrote:
> On 09/10/19 10:16, Peter Zijlstra wrote:

> >> bool bitfields preserve the magic behavior where something like this:
> >>
> >>   foo->x = y;
> >>
> >> (x is a bool bitfield) would be compiled as
> >>
> >>   foo->x = (y != 0);
> > 
> > This is confusion; if y is a single bit bitfield, then there is
> > absolutely _NO_ difference between these two expressions.
> 
> y is not in a struct so it cannot be a single bit bitfield. :) If y is
> an int and foo->x is a bool bitfield, you get the following:
> 
>   foo->x = 6; /* foo->x is 1, it would be 0 for int:1 */
>   foo->x = 7; /* foo->x is 1, it would be 1 for int:1 */
> 

Urgh, reading is hard. You're right!


Re: [PATCH 3/3] KVM: x86/vPMU: Add lazy mechanism to release perf_event per vPMC

2019-10-09 Thread Paolo Bonzini
On 09/10/19 10:16, Peter Zijlstra wrote:
> On Wed, Oct 09, 2019 at 09:15:03AM +0200, Paolo Bonzini wrote:
>> For stuff like hardware registers, bitfields are probably a bad idea
>> anyway, so let's only consider the case of space optimization.
> 
> Except for hardware registers? I actually like bitfields to describe
> hardware registers.

In theory yes; in practice, for MMIO it's a problem that you cannot see
the exact reads and writes the compiler generates.  Of course you can do:

	union {
		struct {
			/* some bitfields here */
		} u;
		u32 val;
	};

and only use the bitfields after reading/writing from the register.
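For instance (a minimal sketch with a made-up register layout and field
names, assuming the usual <linux/io.h> accessors):

	union ctrl_reg {
		struct {
			u32 enable   : 1;
			u32 irq_mask : 1;
			u32 reserved : 30;
		} u;
		u32 val;
	};

	static void ctrl_set_enable(void __iomem *reg)
	{
		union ctrl_reg c;

		c.val = readl(reg);	/* one explicit 32-bit MMIO read */
		c.u.enable = 1;		/* decode and modify via bitfields */
		writel(c.val, reg);	/* one explicit 32-bit MMIO write */
	}

That keeps every bus access a single, visible readl()/writel(), while
the bitfields only do the decoding.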

> But worse, as used in the parent thread:
> 
> 	u8   count:7;
> 	bool flag:1;
> 
> Who says the @flag thing will even be the msb of the initial u8 and not
> a whole new variable due to change in base type?

Good point.

>> bool bitfields preserve the magic behavior where something like this:
>>
>>   foo->x = y;
>>
>> (x is a bool bitfield) would be compiled as
>>
>>   foo->x = (y != 0);
> 
> This is confusion; if y is a single bit bitfield, then there is
> absolutely _NO_ difference between these two expressions.

y is not in a struct so it cannot be a single bit bitfield. :) If y is
an int and foo->x is a bool bitfield, you get the following:

foo->x = 6; /* foo->x is 1, it would be 0 for int:1 */
foo->x = 7; /* foo->x is 1, it would be 1 for int:1 */
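A standalone demo of that difference (it compares against unsigned:1
rather than int:1, since the signedness of a plain int bitfield is
implementation-defined):

	#include <stdbool.h>
	#include <stdio.h>

	struct s {
		bool b : 1;		/* value is converted to 0/1 first */
		unsigned int i : 1;	/* value is truncated to the low bit */
	};

	int main(void)
	{
		struct s foo;

		foo.b = 6; foo.i = 6;
		printf("%d %d\n", foo.b, foo.i);	/* prints: 1 0 */
		foo.b = 7; foo.i = 7;
		printf("%d %d\n", foo.b, foo.i);	/* prints: 1 1 */
		return 0;
	}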

Anyway it's good that we agree on the important thing about the patch!

Paolo

> The _only_ thing about _Bool is that it magically casts values to 0,1.
> Single bit bitfield variables have no choice but to already be in that
> range.



Re: [PATCH 3/3] KVM: x86/vPMU: Add lazy mechanism to release perf_event per vPMC

2019-10-09 Thread Peter Zijlstra
On Wed, Oct 09, 2019 at 09:15:03AM +0200, Paolo Bonzini wrote:
> For stuff like hardware registers, bitfields are probably a bad idea
> anyway, so let's only consider the case of space optimization.

Except for hardware registers? I actually like bitfields to describe
hardware registers.

> bool:2 would definitely cause an eyebrow raise, but I don't see why
> bool:1 bitfields are a problem.  An integer type large enough to store
> the values 0 and 1 can be of any size bigger than one bit.

Consider:

	bool foo:1;
	bool bar:1;

Will bar use the second bit of _Bool? Does it have one? (yes it does,
but it's still weird).

But worse, as used in the parent thread:

	u8   count:7;
	bool flag:1;

Who says the @flag thing will even be the msb of the initial u8 and not
a whole new variable due to change in base type?
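Nothing portable answers that; the ABI decides. A quick userspace check
of what a given compiler and target actually do (a sketch; the results
noted are what x86-64 GCC happens to produce):

	#include <stdbool.h>
	#include <stdio.h>

	struct mixed {
		unsigned char count : 7;
		bool          flag  : 1;	/* same byte? ABI-dependent */
	};

	struct same_base {
		unsigned char count : 7;
		unsigned char flag  : 1;	/* packs into one byte */
	};

	int main(void)
	{
		/* x86-64 GCC prints "1 1", but the mixed layout is
		 * exactly the case being questioned above. */
		printf("%zu %zu\n", sizeof(struct mixed),
		       sizeof(struct same_base));
		return 0;
	}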

> bool bitfields preserve the magic behavior where something like this:
> 
>   foo->x = y;
> 
> (x is a bool bitfield) would be compiled as
> 
>   foo->x = (y != 0);

This is confusion; if y is a single bit bitfield, then there is
absolutely _NO_ difference between these two expressions.

The _only_ thing about _Bool is that it magically casts values to 0,1.
Single bit bitfield variables have no choice but to already be in that
range.

So expressions where it matters are:

x = (7&2)   // x == 2
vs
x = !!(7&2) // x == 1

But it is impossible for int:1 and _Bool to behave differently.

> However, in this patch bitfields are unnecessary and they result in
> worse code from the compiler.

Fully agreed :-)


Re: [PATCH 3/3] KVM: x86/vPMU: Add lazy mechanism to release perf_event per vPMC

2019-10-09 Thread Like Xu

Hi Paolo,

On 2019/10/9 15:15, Paolo Bonzini wrote:
> On 09/10/19 05:14, Like Xu wrote:
>>>> I'm not sure whether this is your personal preference or there is a
>>>> technical reason, such as this usage being incompatible with union
>>>> syntax?
>>>
>>> Apparently it 'works', so there is no hard technical reason, but
>>> consider that _Bool is specified as an integer type large enough to
>>> store the values 0 and 1, then consider it as a base type for a
>>> bitfield. That's just disgusting.
>>
>> It's reasonable. Thanks.
>
> /me chimes in since this is KVM code after all...
>
> For stuff like hardware registers, bitfields are probably a bad idea
> anyway, so let's only consider the case of space optimization.
>
> bool:2 would definitely cause an eyebrow raise, but I don't see why
> bool:1 bitfields are a problem.  An integer type large enough to store
> the values 0 and 1 can be of any size bigger than one bit.
>
> bool bitfields preserve the magic behavior where something like this:
>
>    foo->x = y;
>
> (x is a bool bitfield) would be compiled as
>
>    foo->x = (y != 0);
>
> which can be a plus or a minus depending on the point of view. :)
> Either way, bool bitfields are useful if you are using bitfields for
> space optimization, especially if you have existing code using bool and
> it might rely on the idiom above.
>
> However, in this patch bitfields are unnecessary and they result in
> worse code from the compiler.  There is plenty of padding in struct
> kvm_pmu, with or without bitfields, so I'd go with "u8 event_count;
> bool enable_cleanup;" (or better "need_cleanup").

Thanks. The "u8 event_count; bool need_cleanup;" suggestion looks good
to me.

Does the lazy release mechanism look reasonable to you as well?
If so, I'll send the next version based on the current feedback.

> Thanks,
>
> Paolo





Re: [PATCH 3/3] KVM: x86/vPMU: Add lazy mechanism to release perf_event per vPMC

2019-10-09 Thread Paolo Bonzini
On 09/10/19 05:14, Like Xu wrote:
>>
>>
>>> I'm not sure whether this is your personal preference or there is a
>>> technical reason, such as this usage being incompatible with union
>>> syntax?
>>
>> Apparently it 'works', so there is no hard technical reason, but
>> consider that _Bool is specified as an integer type large enough to
>> store the values 0 and 1, then consider it as a base type for a
>> bitfield. That's just disgusting.
> 
> It's reasonable. Thanks.

/me chimes in since this is KVM code after all...

For stuff like hardware registers, bitfields are probably a bad idea
anyway, so let's only consider the case of space optimization.

bool:2 would definitely cause an eyebrow raise, but I don't see why
bool:1 bitfields are a problem.  An integer type large enough to store
the values 0 and 1 can be of any size bigger than one bit.

bool bitfields preserve the magic behavior where something like this:

  foo->x = y;

(x is a bool bitfield) would be compiled as

  foo->x = (y != 0);

which can be a plus or a minus depending on the point of view. :)
Either way, bool bitfields are useful if you are using bitfields for
space optimization, especially if you have existing code using bool and
it might rely on the idiom above.

However, in this patch bitfields are unnecessary and they result in
worse code from the compiler.  There is plenty of padding in struct
kvm_pmu, with or without bitfields, so I'd go with "u8 event_count; bool
enable_cleanup;" (or better "need_cleanup").
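The padding point is easy to see with a toy stand-in for the struct
(hypothetical members; on x86-64 both variants come out the same size):

	#include <stdbool.h>
	#include <stdio.h>

	struct with_bitfields {
		unsigned long reprogram_pmi;	/* 8-byte member sets alignment */
		struct {
			unsigned char event_count : 7;
			unsigned char enable_cleanup : 1;
		} state;
	};

	struct with_plain_fields {
		unsigned long reprogram_pmi;
		unsigned char event_count;
		bool need_cleanup;
	};

	int main(void)
	{
		/* both print 16 on x86-64: the tail padding swallows
		 * whatever the bitfields "saved" */
		printf("%zu %zu\n", sizeof(struct with_bitfields),
		       sizeof(struct with_plain_fields));
		return 0;
	}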

Thanks,

Paolo


Re: [PATCH 3/3] KVM: x86/vPMU: Add lazy mechanism to release perf_event per vPMC

2019-10-08 Thread Like Xu

On 2019/10/8 20:11, Peter Zijlstra wrote:
> On Tue, Oct 01, 2019 at 08:33:45PM +0800, Like Xu wrote:
>> Hi Peter,
>>
>> On 2019/10/1 16:23, Peter Zijlstra wrote:
>>> On Mon, Sep 30, 2019 at 03:22:57PM +0800, Like Xu wrote:
>>>> +	union {
>>>> +		u8 event_count :7; /* the total number of created perf_events */
>>>> +		bool enable_cleanup :1;
>>>
>>> That's atrocious, don't ever create a bitfield with base _Bool.
>>
>> I saw this kind of usage in the tree, e.g. in "struct
>> arm_smmu_master/tipc_mon_state/regmap_irq_chip".
>
> Just because other people do tasteless things doesn't make it right.
>
>> I'm not sure whether this is your personal preference or there is a
>> technical reason, such as this usage being incompatible with union
>> syntax?
>
> Apparently it 'works', so there is no hard technical reason, but
> consider that _Bool is specified as an integer type large enough to
> store the values 0 and 1, then consider it as a base type for a
> bitfield. That's just disgusting.

It's reasonable. Thanks.

> Now, I suppose it 'works', but there is no actual benefit over just
> using a single bit of any other base type.
>
>> My design point is to save a little bit of space without introducing
>> two variables such as "int event_count & bool enable_cleanup".
>
> Your design is questionable, the structure is _huge_, and your union has
> event_count:0 and enable_cleanup:0 as the same bit, which I don't think
> was intentional.
>
> Did you perhaps want to write:
>
> 	struct {
> 		u8 event_count : 7;
> 		u8 event_cleanup : 1;
> 	};
>
> which has a total size of 1 byte and uses the low 7 bits as count and the
> msb as cleanup.

Yes, my union here is wrong; let me fix it in the next version.

> Also, the structure has plenty of holes to stick proper variables in
> without further growing it.

Yes, we could benefit from it.

>> By the way, does the lazy release mechanism look reasonable to you?
>
> I've no idea how it works.. I don't know much about virt.





Re: [PATCH 3/3] KVM: x86/vPMU: Add lazy mechanism to release perf_event per vPMC

2019-10-08 Thread Peter Zijlstra
On Tue, Oct 01, 2019 at 08:33:45PM +0800, Like Xu wrote:
> Hi Peter,
> 
> On 2019/10/1 16:23, Peter Zijlstra wrote:
> > On Mon, Sep 30, 2019 at 03:22:57PM +0800, Like Xu wrote:
> > > + union {
> > > + u8 event_count :7; /* the total number of created perf_events */
> > > + bool enable_cleanup :1;
> > 
> > That's atrocious, don't ever create a bitfield with base _Bool.
> 
> I saw this kind of usage in the tree, e.g. in "struct
> arm_smmu_master/tipc_mon_state/regmap_irq_chip".

Just because other people do tasteless things doesn't make it right.

> I'm not sure whether this is your personal preference or there is a
> technical reason, such as this usage being incompatible with union syntax?

Apparently it 'works', so there is no hard technical reason, but
consider that _Bool is specified as an integer type large enough to
store the values 0 and 1, then consider it as a base type for a
bitfield. That's just disgusting.

Now, I suppose it 'works', but there is no actual benefit over just
using a single bit of any other base type.

> My design point is to save a little bit of space without introducing
> two variables such as "int event_count & bool enable_cleanup".

Your design is questionable, the structure is _huge_, and your union has
event_count:0 and enable_cleanup:0 as the same bit, which I don't think
was intentional.

Did you perhaps want to write:

	struct {
		u8 event_count : 7;
		u8 event_cleanup : 1;
	};

which has a total size of 1 byte and uses the low 7 bits as count and the
msb as cleanup.
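The overlap is easy to demonstrate in userspace (which bit the flag
lands on is implementation-defined, but on x86 GCC both fields start at
bit 0 of the same byte):

	#include <stdbool.h>
	#include <stdio.h>

	union bad {
		unsigned char event_count : 7;
		bool enable_cleanup : 1;	/* aliases event_count bit 0 */
	};

	int main(void)
	{
		union bad s = { .event_count = 3 };

		/* prints 1: the "flag" is just the count's low bit */
		printf("%d\n", s.enable_cleanup);
		return 0;
	}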

Also, the structure has plenty of holes to stick proper variables in
without further growing it.

> By the way, does the lazy release mechanism look reasonable to you?

I've no idea how it works.. I don't know much about virt.


Re: [PATCH 3/3] KVM: x86/vPMU: Add lazy mechanism to release perf_event per vPMC

2019-10-01 Thread Like Xu

Hi Peter,

On 2019/10/1 16:23, Peter Zijlstra wrote:
> On Mon, Sep 30, 2019 at 03:22:57PM +0800, Like Xu wrote:
>> +	union {
>> +		u8 event_count :7; /* the total number of created perf_events */
>> +		bool enable_cleanup :1;
>
> That's atrocious, don't ever create a bitfield with base _Bool.

I saw this kind of usage in the tree, e.g. in "struct
arm_smmu_master/tipc_mon_state/regmap_irq_chip".

I'm not sure whether this is your personal preference or there is a
technical reason, such as this usage being incompatible with union
syntax?

My design point is to save a little bit of space without introducing
two variables such as "int event_count & bool enable_cleanup".

One of the alternatives is to introduce "u8 pmu_state", where the low
seven bits are event_count for X86_PMC_IDX_MAX and the highest bit is
the enable_cleanup bit. Are you OK with this?
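Something like this kernel-style sketch (the macro and helper names are
made up for illustration):

	#define PMU_EVENT_COUNT_MASK	0x7f	/* low 7 bits: event count */
	#define PMU_NEED_CLEANUP	0x80	/* msb: lazy-release flag */

	static inline u8 pmu_event_count(u8 pmu_state)
	{
		return pmu_state & PMU_EVENT_COUNT_MASK;
	}

	static inline bool pmu_need_cleanup(u8 pmu_state)
	{
		return pmu_state & PMU_NEED_CLEANUP;
	}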


By the way, does the lazy release mechanism look reasonable to you?




>> +	} state;






Re: [PATCH 3/3] KVM: x86/vPMU: Add lazy mechanism to release perf_event per vPMC

2019-10-01 Thread Peter Zijlstra
On Mon, Sep 30, 2019 at 03:22:57PM +0800, Like Xu wrote:
> + union {
> + u8 event_count :7; /* the total number of created perf_events */
> + bool enable_cleanup :1;

That's atrocious, don't ever create a bitfield with base _Bool.

> + } state;


[PATCH 3/3] KVM: x86/vPMU: Add lazy mechanism to release perf_event per vPMC

2019-09-30 Thread Like Xu
Currently, a host perf_event is created to emulate vPMC functionality.
It is hard to predict whether a disabled perf_event will be reused.
If disabled perf_events are not reused for a considerable period of
time, these obsolete perf_events increase host context-switch overhead
that could have been avoided.

If the guest doesn't access (set_msr/get_msr/rdpmc) any of the vPMC's
MSRs during an entire vcpu sched time slice, and the vPMC's independent
enable bit isn't set, we can predict that the guest has finished using
this vPMC, and it's time to release the non-reused perf_event on the
first call of vcpu_enter_guest() after the vcpu is next scheduled in.

This lazy mechanism delays the event release to the beginning of the
next scheduled time slice if the vPMC's MSRs aren't accessed during the
current one. If the guest comes back to use this vPMC in the next time
slice, a new perf_event is re-created via
perf_event_create_kernel_counter() as usual.

Suggested-by: Wei W Wang 
Signed-off-by: Like Xu 
---
 arch/x86/include/asm/kvm_host.h |  8 ++
 arch/x86/kvm/pmu.c  | 43 +
 arch/x86/kvm/pmu.h  |  3 +++
 arch/x86/kvm/pmu_amd.c  | 13 ++
 arch/x86/kvm/vmx/pmu_intel.c| 25 +++
 arch/x86/kvm/x86.c  |  6 +
 6 files changed, 98 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 15f2ebad94f9..6723c04c8dc6 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -479,6 +479,14 @@ struct kvm_pmu {
struct kvm_pmc fixed_counters[INTEL_PMC_MAX_FIXED];
struct irq_work irq_work;
u64 reprogram_pmi;
+
+   /* for a marked PMC, do not release its perf_event (if any) */
+   u64 lazy_release_ctrl;
+
+   union {
+   u8 event_count :7; /* the total number of created perf_events */
+   bool enable_cleanup :1;
+   } state;
 };
 
 struct kvm_pmu_ops;
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 74bc5c42b8b5..1b3cec38b1a1 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -137,6 +137,7 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
}
 
pmc->perf_event = event;
+   pmc_to_pmu(pmc)->state.event_count++;
clear_bit(pmc->idx, (unsigned long*)&pmc_to_pmu(pmc)->reprogram_pmi);
 }
 
@@ -368,6 +369,7 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
if (!pmc)
return 1;
 
+   __set_bit(pmc->idx, (unsigned long *)&pmu->lazy_release_ctrl);
*data = pmc_read_counter(pmc) & mask;
return 0;
 }
@@ -385,11 +387,13 @@ bool kvm_pmu_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 
 int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *data)
 {
+   kvm_x86_ops->pmu_ops->update_lazy_release_ctrl(vcpu, msr);
return kvm_x86_ops->pmu_ops->get_msr(vcpu, msr, data);
 }
 
 int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
+   kvm_x86_ops->pmu_ops->update_lazy_release_ctrl(vcpu, msr_info->index);
return kvm_x86_ops->pmu_ops->set_msr(vcpu, msr_info);
 }
 
@@ -417,9 +421,48 @@ void kvm_pmu_init(struct kvm_vcpu *vcpu)
memset(pmu, 0, sizeof(*pmu));
kvm_x86_ops->pmu_ops->init(vcpu);
init_irq_work(&pmu->irq_work, kvm_pmi_trigger_fn);
+   pmu->lazy_release_ctrl = 0;
+   pmu->state.event_count = 0;
+   pmu->state.enable_cleanup = false;
kvm_pmu_refresh(vcpu);
 }
 
+static inline bool pmc_speculative_in_use(struct kvm_pmc *pmc)
+{
+   struct kvm_pmu *pmu = pmc_to_pmu(pmc);
+
+   if (pmc_is_fixed(pmc))
+   return fixed_ctrl_field(pmu->fixed_ctr_ctrl,
+   pmc->idx - INTEL_PMC_IDX_FIXED) & 0x3;
+
+   return pmc->eventsel & ARCH_PERFMON_EVENTSEL_ENABLE;
+}
+
+void kvm_pmu_cleanup(struct kvm_vcpu *vcpu)
+{
+   struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+   struct kvm_pmc *pmc = NULL;
+   u64 bitmask = ~pmu->lazy_release_ctrl;
+   int i;
+
+   if (!unlikely(pmu->state.enable_cleanup))
+   return;
+
+   /* do cleanup before the first time of running vcpu after sched_in */
+   pmu->state.enable_cleanup = false;
+
+   /* cleanup unmarked vPMC in the last sched time slice */
+   for_each_set_bit(i, (unsigned long *)&bitmask, X86_PMC_IDX_MAX) {
+   pmc = kvm_x86_ops->pmu_ops->pmc_idx_to_pmc(pmu, i);
+
+   if (pmc && pmc->perf_event && !pmc_speculative_in_use(pmc))
+   pmc_stop_counter(pmc);
+   }
+
+   /* reset vPMC lazy-release states for this sched time slice */
+   pmu->lazy_release_ctrl = 0;
+}
+
 void kvm_pmu_destroy(struct kvm_vcpu *vcpu)
 {
kvm_pmu_reset(vcpu);
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 3a95952702d2..c681738ba59c 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -34,6 +34,7 @@ st