Re: [PATCH] KVM: arm/arm64: Close VMID generation race

2018-04-16 Thread Marc Zyngier
On 16/04/18 11:05, Shannon Zhao wrote:
> 
> 
> On 2018/4/11 9:30, Shannon Zhao wrote:
>>
>> On 2018/4/10 23:37, Marc Zyngier wrote:

[...]

>>> I don't mind either way. If you can be bothered to write a proper commit
>>> log for this, I'll take it. What I'd really want is Shannon to indicate
>>> whether or not this solves the issue he was seeing.

>> I'll test Marc's patch. This will take about 3 days since it's not 100%
>> reproducible.
> Hi Marc,
> 
> I've run the test for about 4 days. The issue doesn't appear.
> So Tested-by: Shannon Zhao 

Thanks Shannon, much appreciated. I'll send the fix upstream towards the
end of the week.

Cheers,

M.
-- 
Jazz is not dead. It just smells funny...


Re: [PATCH] KVM: arm/arm64: Close VMID generation race

2018-04-16 Thread Shannon Zhao


On 2018/4/11 9:30, Shannon Zhao wrote:
> 
> On 2018/4/10 23:37, Marc Zyngier wrote:
>> On 10/04/18 16:24, Mark Rutland wrote:
>>> On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
>>>> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
>>>>> I think we also need to update kvm->arch.vttbr before updating
>>>>> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
>>>>> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
>>>>> old VMID).
>>>>>
>>>>> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
>>>>> the critical section, I think that works, modulo using READ_ONCE() and
>>>>> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
>>>>> locklessly.
>>>>
>>>> Indeed, you're right.  I would look something like this, then:
>>>>
>>>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>>>> index 2e43f9d42bd5..6cb08995e7ff 100644
>>>> --- a/virt/kvm/arm/arm.c
>>>> +++ b/virt/kvm/arm/arm.c
>>>> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
>>>>   */
>>>>  static bool need_new_vmid_gen(struct kvm *kvm)
>>>>  {
>>>> -  return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
>>>> +  u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
>>>> +  smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
>>>> +  return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
>>>>  }
>>>>
>>>>  /**
>>>> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
>>>>    kvm_call_hyp(__kvm_flush_vm_context);
>>>>    }
>>>>
>>>> -  kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>>>    kvm->arch.vmid = kvm_next_vmid;
>>>>    kvm_next_vmid++;
>>>>    kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
>>>> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
>>>>    pgd_phys = virt_to_phys(kvm->arch.pgd);
>>>>    BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
>>>>    vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
>>>> -  kvm->arch.vttbr = pgd_phys | vmid;
>>>> +  WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
>>>> +
>>>> +  smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
>>>> +  kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>>>
>>>>    spin_unlock(&kvm_vmid_lock);
>>>>  }
>>>
>>> I think that's right, yes.
>>>
>>> We could replace the smp_{r,w}mb() barriers with an acquire of the
>>> kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
>>> trying to optimize things there are larger algorithmic changes necessary
>>> anyhow.
>>>
>>>> It's probably easier to convince ourselves about the correctness of
>>>> Marc's code using a rwlock instead, though.  Thoughts?
>>>
>>> I believe that Marc's preference was the rwlock; I have no preference
>>> either way.
>>
>> I don't mind either way. If you can be bothered to write a proper commit
>> log for this, I'll take it. What I'd really want is Shannon to indicate
>> whether or not this solves the issue he was seeing.
>>
> I'll test Marc's patch. This will take about 3 days since it's not 100%
> reproducible.
Hi Marc,

I've run the test for about 4 days. The issue doesn't appear.
So Tested-by: Shannon Zhao 

Thanks,
-- 
Shannon



Re: [PATCH] KVM: arm/arm64: Close VMID generation race

2018-04-10 Thread Shannon Zhao


On 2018/4/10 23:37, Marc Zyngier wrote:
> On 10/04/18 16:24, Mark Rutland wrote:
>> On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
>>> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
>>>> I think we also need to update kvm->arch.vttbr before updating
>>>> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
>>>> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
>>>> old VMID).
>>>>
>>>> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
>>>> the critical section, I think that works, modulo using READ_ONCE() and
>>>> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
>>>> locklessly.
>>>
>>> Indeed, you're right.  I would look something like this, then:
>>>
>>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>>> index 2e43f9d42bd5..6cb08995e7ff 100644
>>> --- a/virt/kvm/arm/arm.c
>>> +++ b/virt/kvm/arm/arm.c
>>> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
>>>   */
>>>  static bool need_new_vmid_gen(struct kvm *kvm)
>>>  {
>>> -   return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
>>> +   u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
>>> +   smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
>>> +   return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
>>>  }
>>>  
>>>  /**
>>> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
>>> kvm_call_hyp(__kvm_flush_vm_context);
>>> }
>>>  
>>> -   kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>> kvm->arch.vmid = kvm_next_vmid;
>>> kvm_next_vmid++;
>>> kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
>>> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
>>> pgd_phys = virt_to_phys(kvm->arch.pgd);
>>> BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
>>> vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
>>> -   kvm->arch.vttbr = pgd_phys | vmid;
>>> +   WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
>>> +
>>> +   smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
>>> +   kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>>  
>>> spin_unlock(&kvm_vmid_lock);
>>>  }
>>
>> I think that's right, yes.
>>
>> We could replace the smp_{r,w}mb() barriers with an acquire of the
>> kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
>> trying to optimize things there are larger algorithmic changes necessary
>> anyhow.
>>
>>> It's probably easier to convince ourselves about the correctness of
>>> Marc's code using a rwlock instead, though.  Thoughts?
>>
>> I believe that Marc's preference was the rwlock; I have no preference
>> either way.
> 
> I don't mind either way. If you can be bothered to write a proper commit
> log for this, I'll take it. What I'd really want is Shannon to indicate
> whether or not this solves the issue he was seeing.
> 
I'll test Marc's patch. This will take about 3 days since it's not 100%
reproducible.

Thanks,
-- 
Shannon



Re: [PATCH] KVM: arm/arm64: Close VMID generation race

2018-04-10 Thread Christoffer Dall
On Tue, Apr 10, 2018 at 04:37:12PM +0100, Marc Zyngier wrote:
> On 10/04/18 16:24, Mark Rutland wrote:
> > On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
> >> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
> >>> I think we also need to update kvm->arch.vttbr before updating
> >>> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
> >>> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
> >>> old VMID).
> >>>
> >>> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
> >>> the critical section, I think that works, modulo using READ_ONCE() and
> >>> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
> >>> locklessly.
> >>
> >> Indeed, you're right.  I would look something like this, then:
> >>
> >> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> >> index 2e43f9d42bd5..6cb08995e7ff 100644
> >> --- a/virt/kvm/arm/arm.c
> >> +++ b/virt/kvm/arm/arm.c
> >> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
> >>   */
> >>  static bool need_new_vmid_gen(struct kvm *kvm)
> >>  {
> >> -  return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
> >> +  u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
> >> +  smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
> >> +  return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
> >>  }
> >>  
> >>  /**
> >> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
> >>kvm_call_hyp(__kvm_flush_vm_context);
> >>}
> >>  
> >> -  kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> >>kvm->arch.vmid = kvm_next_vmid;
> >>kvm_next_vmid++;
> >>kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
> >> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
> >>pgd_phys = virt_to_phys(kvm->arch.pgd);
> >>BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
> >> vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> >> -  kvm->arch.vttbr = pgd_phys | vmid;
> >> +  WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
> >> +
> >> +  smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
> >> +  kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> >>  
> >> spin_unlock(&kvm_vmid_lock);
> >>  }
> > 
> > I think that's right, yes.
> > 
> > We could replace the smp_{r,w}mb() barriers with an acquire of the
> > kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
> > trying to optimize things there are larger algorithmic changes necessary
> > anyhow.
> > 
> >> It's probably easier to convince ourselves about the correctness of
> >> Marc's code using a rwlock instead, though.  Thoughts?
> > 
> > I believe that Marc's preference was the rwlock; I have no preference
> > either way.
> 
> I don't mind either way. If you can be bothered to write a proper commit
> log for this, I'll take it. 

You've already done the work, and your patch is easier to read, so let's
just go ahead with that.

I was just curious to which degree my original implementation was
broken; was I trying to achieve something impossible or was I just
writing buggy code.  Seems the latter.  Oh well.

> What I'd really want is Shannon to indicate
> whether or not this solves the issue he was seeing.
> 

Agreed, would like to see that too.

Thanks (and sorry for being noisy),
-Christoffer


Re: [PATCH] KVM: arm/arm64: Close VMID generation race

2018-04-10 Thread Marc Zyngier
On 10/04/18 16:24, Mark Rutland wrote:
> On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
>> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
>>> I think we also need to update kvm->arch.vttbr before updating
>>> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
>>> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
>>> old VMID).
>>>
>>> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
>>> the critical section, I think that works, modulo using READ_ONCE() and
>>> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
>>> locklessly.
>>
>> Indeed, you're right.  I would look something like this, then:
>>
>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>> index 2e43f9d42bd5..6cb08995e7ff 100644
>> --- a/virt/kvm/arm/arm.c
>> +++ b/virt/kvm/arm/arm.c
>> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
>>   */
>>  static bool need_new_vmid_gen(struct kvm *kvm)
>>  {
>> -return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
>> +u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
>> +smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
>> +return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
>>  }
>>  
>>  /**
>> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
>>  kvm_call_hyp(__kvm_flush_vm_context);
>>  }
>>  
>> -kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>  kvm->arch.vmid = kvm_next_vmid;
>>  kvm_next_vmid++;
>>  kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
>> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
>>  pgd_phys = virt_to_phys(kvm->arch.pgd);
>>  BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
>> vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
>> -kvm->arch.vttbr = pgd_phys | vmid;
>> +WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
>> +
>> +smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
>> +kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>  
>>  spin_unlock(&kvm_vmid_lock);
>>  }
> 
> I think that's right, yes.
> 
> We could replace the smp_{r,w}mb() barriers with an acquire of the
> kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
> trying to optimize things there are larger algorithmic changes necessary
> anyhow.
> 
>> It's probably easier to convince ourselves about the correctness of
>> Marc's code using a rwlock instead, though.  Thoughts?
> 
> I believe that Marc's preference was the rwlock; I have no preference
> either way.

I don't mind either way. If you can be bothered to write a proper commit
log for this, I'll take it. What I'd really want is Shannon to indicate
whether or not this solves the issue he was seeing.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...


Re: [PATCH] KVM: arm/arm64: Close VMID generation race

2018-04-10 Thread Christoffer Dall
On Tue, Apr 10, 2018 at 04:24:20PM +0100, Mark Rutland wrote:
> On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
> > On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
> > > I think we also need to update kvm->arch.vttbr before updating
> > > kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
> > > vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
> > > old VMID).
> > > 
> > > With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
> > > the critical section, I think that works, modulo using READ_ONCE() and
> > > WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
> > > locklessly.
> > 
> > Indeed, you're right.  I would look something like this, then:
> > 
> > diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> > index 2e43f9d42bd5..6cb08995e7ff 100644
> > --- a/virt/kvm/arm/arm.c
> > +++ b/virt/kvm/arm/arm.c
> > @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
> >   */
> >  static bool need_new_vmid_gen(struct kvm *kvm)
> >  {
> > -   return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
> > +   u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
> > +   smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
> > +   return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
> >  }
> >  
> >  /**
> > @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
> > kvm_call_hyp(__kvm_flush_vm_context);
> > }
> >  
> > -   kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> > kvm->arch.vmid = kvm_next_vmid;
> > kvm_next_vmid++;
> > kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
> > @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
> > pgd_phys = virt_to_phys(kvm->arch.pgd);
> > BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
> > vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> > -   kvm->arch.vttbr = pgd_phys | vmid;
> > +   WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
> > +
> > +   smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
> > +   kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> >  
> > spin_unlock(&kvm_vmid_lock);
> >  }
> 
> I think that's right, yes.
> 
> We could replace the smp_{r,w}mb() barriers with an acquire of the
> kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
> trying to optimize things there are larger algorithmic changes necessary
> anyhow.
> 
> > It's probably easier to convince ourselves about the correctness of
> > Marc's code using a rwlock instead, though.  Thoughts?
> 
> I believe that Marc's preference was the rwlock; I have no preference
> either way.
> 

I'm fine with both approaches as well, but it was educational for me to
see if this could be done in the lockless way as well.  Thanks for
having a look at that!

-Christoffer


Re: [PATCH] KVM: arm/arm64: Close VMID generation race

2018-04-10 Thread Mark Rutland
On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
> > I think we also need to update kvm->arch.vttbr before updating
> > kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
> > vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
> > old VMID).
> > 
> > With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
> > the critical section, I think that works, modulo using READ_ONCE() and
> > WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
> > locklessly.
> 
> Indeed, you're right.  I would look something like this, then:
> 
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index 2e43f9d42bd5..6cb08995e7ff 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
>   */
>  static bool need_new_vmid_gen(struct kvm *kvm)
>  {
> - return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
> + u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
> + smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
> + return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
>  }
>  
>  /**
> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
>   kvm_call_hyp(__kvm_flush_vm_context);
>   }
>  
> - kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>   kvm->arch.vmid = kvm_next_vmid;
>   kvm_next_vmid++;
>   kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
>   pgd_phys = virt_to_phys(kvm->arch.pgd);
>   BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
>   vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> - kvm->arch.vttbr = pgd_phys | vmid;
> + WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
> +
> + smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
> + kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>  
>   spin_unlock(&kvm_vmid_lock);
>  }

I think that's right, yes.

We could replace the smp_{r,w}mb() barriers with an acquire of the
kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
trying to optimize things there are larger algorithmic changes necessary
anyhow.
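
As a rough illustration only (not a patch from this thread), the
acquire/release variant described above might look something like the
sketch below. kvm->arch.vmid_gen is a plain u64, so on 32-bit arm this is
no more single-copy atomic than the smp_*mb() version; the sketch only
shows the intended ordering.

 static bool need_new_vmid_gen(struct kvm *kvm)
 {
        /* Acquire; pairs with the release of kvm->arch.vmid_gen below */
        u64 current_vmid_gen = atomic64_read_acquire(&kvm_vmid_gen);

        return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
 }

 /* ... tail of update_vttbr(), still under kvm_vmid_lock ... */
        WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
        /* Release; orders the vttbr update before the generation looks current */
        smp_store_release(&kvm->arch.vmid_gen, atomic64_read(&kvm_vmid_gen));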

> It's probably easier to convince ourselves about the correctness of
> Marc's code using a rwlock instead, though.  Thoughts?

I believe that Marc's preference was the rwlock; I have no preference
either way.

Thanks,
Mark.


Re: [PATCH] KVM: arm/arm64: Close VMID generation race

2018-04-10 Thread Christoffer Dall
On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
> On Mon, Apr 09, 2018 at 10:51:39PM +0200, Christoffer Dall wrote:
> > On Mon, Apr 09, 2018 at 06:07:06PM +0100, Marc Zyngier wrote:
> > > Before entering the guest, we check whether our VMID is still
> > > part of the current generation. In order to avoid taking a lock,
> > > we start with checking that the generation is still current, and
> > > only if not current do we take the lock, recheck, and update the
> > > generation and VMID.
> > > 
> > > This leaves open a small race: A vcpu can bump up the global
> > > generation number as well as the VM's, but has not updated
> > > the VMID itself yet.
> > > 
> > > At that point another vcpu from the same VM comes in, checks
> > > the generation (and finds it not needing anything), and jumps
> > > into the guest. At this point, we end-up with two vcpus belonging
> > > to the same VM running with two different VMIDs. Eventually, the
> > > VMID used by the second vcpu will get reassigned, and things will
> > > really go wrong...
> > > 
> > > A simple solution would be to drop this initial check, and always take
> > > the lock. This is likely to cause performance issues. A middle ground
> > > is to convert the spinlock to a rwlock, and only take the read lock
> > > on the fast path. If the check fails at that point, drop it and
> > > acquire the write lock, rechecking the condition.
> > > 
> > > This ensures that the above scenario doesn't occur.
> > > 
> > > Reported-by: Mark Rutland 
> > > Signed-off-by: Marc Zyngier 
> > > ---
> > > I haven't seen any reply from Shannon, so reposting this to
> > > a slightly wider audience for feedback.
> > > 
> > >  virt/kvm/arm/arm.c | 15 ++-
> > >  1 file changed, 10 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> > > index dba629c5f8ac..a4c1b76240df 100644
> > > --- a/virt/kvm/arm/arm.c
> > > +++ b/virt/kvm/arm/arm.c
> > > @@ -63,7 +63,7 @@ static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_arm_running_vcpu);
> > >  static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
> > >  static u32 kvm_next_vmid;
> > >  static unsigned int kvm_vmid_bits __read_mostly;
> > > -static DEFINE_SPINLOCK(kvm_vmid_lock);
> > > +static DEFINE_RWLOCK(kvm_vmid_lock);
> > >  
> > >  static bool vgic_present;
> > >  
> > > @@ -473,11 +473,16 @@ static void update_vttbr(struct kvm *kvm)
> > >  {
> > >   phys_addr_t pgd_phys;
> > >   u64 vmid;
> > > + bool new_gen;
> > >  
> > > - if (!need_new_vmid_gen(kvm))
> > > + read_lock(&kvm_vmid_lock);
> > > + new_gen = need_new_vmid_gen(kvm);
> > > + read_unlock(&kvm_vmid_lock);
> > > +
> > > + if (!new_gen)
> > >   return;
> > >  
> > > - spin_lock(&kvm_vmid_lock);
> > > + write_lock(&kvm_vmid_lock);
> > >  
> > >   /*
> > >* We need to re-check the vmid_gen here to ensure that if another vcpu
> > > @@ -485,7 +490,7 @@ static void update_vttbr(struct kvm *kvm)
> > >* use the same vmid.
> > >*/
> > >   if (!need_new_vmid_gen(kvm)) {
> > > - spin_unlock(&kvm_vmid_lock);
> > > + write_unlock(&kvm_vmid_lock);
> > >   return;
> > >   }
> > >  
> > > @@ -519,7 +524,7 @@ static void update_vttbr(struct kvm *kvm)
> > >   vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> > >   kvm->arch.vttbr = kvm_phys_to_vttbr(pgd_phys) | vmid;
> > >  
> > > - spin_unlock(&kvm_vmid_lock);
> > > + write_unlock(&kvm_vmid_lock);
> > >  }
> > >  
> > >  static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
> > > -- 
> > > 2.14.2
> > > 
> > 
> > The above looks correct to me.  I am wondering if something like the
> > following would also work, which may be slightly more efficient,
> > although I doubt the difference can be measured:
> > 

[...]

> 
> I think we also need to update kvm->arch.vttbr before updating
> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
> old VMID).
> 
> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
> the critical section, I think that works, modulo using READ_ONCE() and
> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
> locklessly.

Indeed, you're right.  I would look something like this, then:

diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 2e43f9d42bd5..6cb08995e7ff 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
  */
 static bool need_new_vmid_gen(struct kvm *kvm)
 {
-   return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
+   u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
+   smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
+   return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
 }
 
 /**
@@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)

Re: [PATCH] KVM: arm/arm64: Close VMID generation race

2018-04-10 Thread Mark Rutland
On Mon, Apr 09, 2018 at 10:51:39PM +0200, Christoffer Dall wrote:
> On Mon, Apr 09, 2018 at 06:07:06PM +0100, Marc Zyngier wrote:
> > Before entering the guest, we check whether our VMID is still
> > part of the current generation. In order to avoid taking a lock,
> > we start with checking that the generation is still current, and
> > only if not current do we take the lock, recheck, and update the
> > generation and VMID.
> > 
> > This leaves open a small race: A vcpu can bump up the global
> > generation number as well as the VM's, but has not updated
> > the VMID itself yet.
> > 
> > At that point another vcpu from the same VM comes in, checks
> > the generation (and finds it not needing anything), and jumps
> > into the guest. At this point, we end-up with two vcpus belonging
> > to the same VM running with two different VMIDs. Eventually, the
> > VMID used by the second vcpu will get reassigned, and things will
> > really go wrong...
> > 
> > A simple solution would be to drop this initial check, and always take
> > the lock. This is likely to cause performance issues. A middle ground
> > is to convert the spinlock to a rwlock, and only take the read lock
> > on the fast path. If the check fails at that point, drop it and
> > acquire the write lock, rechecking the condition.
> > 
> > This ensures that the above scenario doesn't occur.
> > 
> > Reported-by: Mark Rutland 
> > Signed-off-by: Marc Zyngier 
> > ---
> > I haven't seen any reply from Shannon, so reposting this to
> > a slightly wider audience for feedback.
> > 
> >  virt/kvm/arm/arm.c | 15 ++-
> >  1 file changed, 10 insertions(+), 5 deletions(-)
> > 
> > diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> > index dba629c5f8ac..a4c1b76240df 100644
> > --- a/virt/kvm/arm/arm.c
> > +++ b/virt/kvm/arm/arm.c
> > @@ -63,7 +63,7 @@ static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_arm_running_vcpu);
> >  static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
> >  static u32 kvm_next_vmid;
> >  static unsigned int kvm_vmid_bits __read_mostly;
> > -static DEFINE_SPINLOCK(kvm_vmid_lock);
> > +static DEFINE_RWLOCK(kvm_vmid_lock);
> >  
> >  static bool vgic_present;
> >  
> > @@ -473,11 +473,16 @@ static void update_vttbr(struct kvm *kvm)
> >  {
> > phys_addr_t pgd_phys;
> > u64 vmid;
> > +   bool new_gen;
> >  
> > -   if (!need_new_vmid_gen(kvm))
> > +   read_lock(&kvm_vmid_lock);
> > +   new_gen = need_new_vmid_gen(kvm);
> > +   read_unlock(&kvm_vmid_lock);
> > +
> > +   if (!new_gen)
> > return;
> >  
> > -   spin_lock(&kvm_vmid_lock);
> > +   write_lock(&kvm_vmid_lock);
> >  
> > /*
> >  * We need to re-check the vmid_gen here to ensure that if another vcpu
> > @@ -485,7 +490,7 @@ static void update_vttbr(struct kvm *kvm)
> >  * use the same vmid.
> >  */
> > if (!need_new_vmid_gen(kvm)) {
> > -   spin_unlock(&kvm_vmid_lock);
> > +   write_unlock(&kvm_vmid_lock);
> > return;
> > }
> >  
> > @@ -519,7 +524,7 @@ static void update_vttbr(struct kvm *kvm)
> > vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> > kvm->arch.vttbr = kvm_phys_to_vttbr(pgd_phys) | vmid;
> >  
> > -   spin_unlock(&kvm_vmid_lock);
> > +   write_unlock(&kvm_vmid_lock);
> >  }
> >  
> >  static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
> > -- 
> > 2.14.2
> > 
> 
> The above looks correct to me.  I am wondering if something like the
> following would also work, which may be slightly more efficient,
> although I doubt the difference can be measured:
> 
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index dba629c5f8ac..7ac869bcad21 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -458,7 +458,9 @@ void force_vm_exit(const cpumask_t *mask)
>   */
>  static bool need_new_vmid_gen(struct kvm *kvm)
>  {
> - return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
> + u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
> + smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
> + return unlikely(kvm->arch.vmid_gen != current_vmid_gen);
>  }
>  
>  /**
> @@ -508,10 +510,11 @@ static void update_vttbr(struct kvm *kvm)
>   kvm_call_hyp(__kvm_flush_vm_context);
>   }
>  
> - kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>   kvm->arch.vmid = kvm_next_vmid;
>   kvm_next_vmid++;
>   kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
> + smp_wmb();
> + kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>  
>   /* update vttbr to be used with the new vmid */
>   pgd_phys = virt_to_phys(kvm->arch.pgd);
> 

I think we also need to update kvm->arch.vttbr before updating
kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
old VMID).
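
(Purely to illustrate the interleaving described above; this is not code
from the thread.)

  CPU0: update_vttbr()                  CPU1: vcpu entry path
  ------------------------------        ------------------------------
  kvm->arch.vmid_gen = current gen
                                        need_new_vmid_gen() -> false
                                        enter hyp
                                        program old kvm->arch.vttbr
                                        (i.e. the stale VMID)
  kvm->arch.vttbr = pgd_phys | vmid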

With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
the critical section, I think that works, modulo using READ_ONCE() and
WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
locklessly.

Re: [PATCH] KVM: arm/arm64: Close VMID generation race

2018-04-09 Thread Christoffer Dall
On Mon, Apr 09, 2018 at 06:07:06PM +0100, Marc Zyngier wrote:
> Before entering the guest, we check whether our VMID is still
> part of the current generation. In order to avoid taking a lock,
> we start with checking that the generation is still current, and
> only if not current do we take the lock, recheck, and update the
> generation and VMID.
> 
> This leaves open a small race: A vcpu can bump up the global
> generation number as well as the VM's, but has not updated
> the VMID itself yet.
> 
> At that point another vcpu from the same VM comes in, checks
> the generation (and finds it not needing anything), and jumps
> into the guest. At this point, we end-up with two vcpus belonging
> to the same VM running with two different VMIDs. Eventually, the
> VMID used by the second vcpu will get reassigned, and things will
> really go wrong...
> 
> A simple solution would be to drop this initial check, and always take
> the lock. This is likely to cause performance issues. A middle ground
> is to convert the spinlock to a rwlock, and only take the read lock
> on the fast path. If the check fails at that point, drop it and
> acquire the write lock, rechecking the condition.
> 
> This ensures that the above scenario doesn't occur.
> 
> Reported-by: Mark Rutland 
> Signed-off-by: Marc Zyngier 
> ---
> I haven't seen any reply from Shannon, so reposting this to
> a slightly wider audience for feedback.
> 
>  virt/kvm/arm/arm.c | 15 ++-
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index dba629c5f8ac..a4c1b76240df 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -63,7 +63,7 @@ static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_arm_running_vcpu);
>  static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
>  static u32 kvm_next_vmid;
>  static unsigned int kvm_vmid_bits __read_mostly;
> -static DEFINE_SPINLOCK(kvm_vmid_lock);
> +static DEFINE_RWLOCK(kvm_vmid_lock);
>  
>  static bool vgic_present;
>  
> @@ -473,11 +473,16 @@ static void update_vttbr(struct kvm *kvm)
>  {
>   phys_addr_t pgd_phys;
>   u64 vmid;
> + bool new_gen;
>  
> - if (!need_new_vmid_gen(kvm))
> + read_lock(&kvm_vmid_lock);
> + new_gen = need_new_vmid_gen(kvm);
> + read_unlock(&kvm_vmid_lock);
> +
> + if (!new_gen)
>   return;
>  
> - spin_lock(&kvm_vmid_lock);
> + write_lock(&kvm_vmid_lock);
>  
>   /*
>* We need to re-check the vmid_gen here to ensure that if another vcpu
> @@ -485,7 +490,7 @@ static void update_vttbr(struct kvm *kvm)
>* use the same vmid.
>*/
>   if (!need_new_vmid_gen(kvm)) {
> - spin_unlock(&kvm_vmid_lock);
> + write_unlock(&kvm_vmid_lock);
>   return;
>   }
>  
> @@ -519,7 +524,7 @@ static void update_vttbr(struct kvm *kvm)
>   vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
>   kvm->arch.vttbr = kvm_phys_to_vttbr(pgd_phys) | vmid;
>  
> - spin_unlock(&kvm_vmid_lock);
> + write_unlock(&kvm_vmid_lock);
>  }
>  
>  static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
> -- 
> 2.14.2
> 

The above looks correct to me.  I am wondering if something like the
following would also work, which may be slightly more efficient,
although I doubt the difference can be measured:

diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index dba629c5f8ac..7ac869bcad21 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -458,7 +458,9 @@ void force_vm_exit(const cpumask_t *mask)
  */
 static bool need_new_vmid_gen(struct kvm *kvm)
 {
-   return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
+   u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
+   smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
+   return unlikely(kvm->arch.vmid_gen != current_vmid_gen);
 }
 
 /**
@@ -508,10 +510,11 @@ static void update_vttbr(struct kvm *kvm)
kvm_call_hyp(__kvm_flush_vm_context);
}
 
-   kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
kvm->arch.vmid = kvm_next_vmid;
kvm_next_vmid++;
kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
+   smp_wmb();
+   kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
 
/* update vttbr to be used with the new vmid */
pgd_phys = virt_to_phys(kvm->arch.pgd);
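
(Illustrative only, not part of the proposal: the smp_wmb() added here is
intended to pair with the smp_rmb() in need_new_vmid_gen() above, roughly
as follows.)

  writer: update_vttbr()                reader: need_new_vmid_gen()
  ------------------------------        ------------------------------
  kvm->arch.vmid = kvm_next_vmid;       gen = atomic64_read(&kvm_vmid_gen);
  smp_wmb();                            smp_rmb();
  kvm->arch.vmid_gen =                  return kvm->arch.vmid_gen != gen;
          atomic64_read(&kvm_vmid_gen);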


Thanks,
-Christoffer