Re: [PATCH] KVM: arm/arm64: Close VMID generation race
On 16/04/18 11:05, Shannon Zhao wrote:
> On 2018/4/11 9:30, Shannon Zhao wrote:
>> On 2018/4/10 23:37, Marc Zyngier wrote:
[...]
>>> I don't mind either way. If you can be bothered to write a proper commit
>>> log for this, I'll take it. What I'd really want is Shannon to indicate
>>> whether or not this solves the issue he was seeing.
>>
>> I'll test Marc's patch. This will take about 3 days since it's not 100%
>> reproducible.
>
> Hi Marc,
>
> I've run the test for about 4 days. The issue doesn't appear.
> So Tested-by: Shannon Zhao

Thanks Shannon, much appreciated. I'll send the fix upstream towards the
end of the week.

Cheers,

	M.
-- 
Jazz is not dead. It just smells funny...
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH] KVM: arm/arm64: Close VMID generation race
On 2018/4/11 9:30, Shannon Zhao wrote:
> On 2018/4/10 23:37, Marc Zyngier wrote:
>> On 10/04/18 16:24, Mark Rutland wrote:
>>> On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
>>>> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
>>>>> I think we also need to update kvm->arch.vttbr before updating
>>>>> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
>>>>> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
>>>>> old VMID).
>>>>>
>>>>> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
>>>>> the critical section, I think that works, modulo using READ_ONCE() and
>>>>> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
>>>>> locklessly.
>>>>
>>>> Indeed, you're right. It would look something like this, then:
>>>>
>>>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>>>> index 2e43f9d42bd5..6cb08995e7ff 100644
>>>> --- a/virt/kvm/arm/arm.c
>>>> +++ b/virt/kvm/arm/arm.c
>>>> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
>>>>   */
>>>>  static bool need_new_vmid_gen(struct kvm *kvm)
>>>>  {
>>>> -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
>>>> +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
>>>> +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
>>>> +	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
>>>>  }
>>>>
>>>>  /**
>>>> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
>>>>  		kvm_call_hyp(__kvm_flush_vm_context);
>>>>  	}
>>>>
>>>> -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>>>  	kvm->arch.vmid = kvm_next_vmid;
>>>>  	kvm_next_vmid++;
>>>>  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
>>>> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
>>>>  	pgd_phys = virt_to_phys(kvm->arch.pgd);
>>>>  	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
>>>>  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
>>>> -	kvm->arch.vttbr = pgd_phys | vmid;
>>>> +	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
>>>> +
>>>> +	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
>>>> +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>>>
>>>>  	spin_unlock(&kvm_vmid_lock);
>>>>  }
>>>
>>> I think that's right, yes.
>>>
>>> We could replace the smp_{r,w}mb() barriers with an acquire of the
>>> kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
>>> trying to optimize things there are larger algorithmic changes necessary
>>> anyhow.
>>>
>>>> It's probably easier to convince ourselves about the correctness of
>>>> Marc's code using a rwlock instead, though. Thoughts?
>>>
>>> I believe that Marc's preference was the rwlock; I have no preference
>>> either way.
>>
>> I don't mind either way. If you can be bothered to write a proper commit
>> log for this, I'll take it. What I'd really want is Shannon to indicate
>> whether or not this solves the issue he was seeing.
>>
> I'll test Marc's patch. This will take about 3 days since it's not 100%
> reproducible.

Hi Marc,

I've run the test for about 4 days. The issue doesn't appear.
So Tested-by: Shannon Zhao

Thanks,
-- 
Shannon
Re: [PATCH] KVM: arm/arm64: Close VMID generation race
On 2018/4/10 23:37, Marc Zyngier wrote:
> On 10/04/18 16:24, Mark Rutland wrote:
>> On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
>>> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
>>>> I think we also need to update kvm->arch.vttbr before updating
>>>> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
>>>> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
>>>> old VMID).
>>>>
>>>> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
>>>> the critical section, I think that works, modulo using READ_ONCE() and
>>>> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
>>>> locklessly.
>>>
>>> Indeed, you're right. It would look something like this, then:
>>>
>>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>>> index 2e43f9d42bd5..6cb08995e7ff 100644
>>> --- a/virt/kvm/arm/arm.c
>>> +++ b/virt/kvm/arm/arm.c
>>> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
>>>   */
>>>  static bool need_new_vmid_gen(struct kvm *kvm)
>>>  {
>>> -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
>>> +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
>>> +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
>>> +	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
>>>  }
>>>
>>>  /**
>>> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
>>>  		kvm_call_hyp(__kvm_flush_vm_context);
>>>  	}
>>>
>>> -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>>  	kvm->arch.vmid = kvm_next_vmid;
>>>  	kvm_next_vmid++;
>>>  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
>>> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
>>>  	pgd_phys = virt_to_phys(kvm->arch.pgd);
>>>  	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
>>>  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
>>> -	kvm->arch.vttbr = pgd_phys | vmid;
>>> +	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
>>> +
>>> +	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
>>> +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>>
>>>  	spin_unlock(&kvm_vmid_lock);
>>>  }
>>
>> I think that's right, yes.
>>
>> We could replace the smp_{r,w}mb() barriers with an acquire of the
>> kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
>> trying to optimize things there are larger algorithmic changes necessary
>> anyhow.
>>
>>> It's probably easier to convince ourselves about the correctness of
>>> Marc's code using a rwlock instead, though. Thoughts?
>>
>> I believe that Marc's preference was the rwlock; I have no preference
>> either way.
>
> I don't mind either way. If you can be bothered to write a proper commit
> log for this, I'll take it. What I'd really want is Shannon to indicate
> whether or not this solves the issue he was seeing.
>

I'll test Marc's patch. This will take about 3 days since it's not 100%
reproducible.

Thanks,
-- 
Shannon
Re: [PATCH] KVM: arm/arm64: Close VMID generation race
On Tue, Apr 10, 2018 at 04:37:12PM +0100, Marc Zyngier wrote:
> On 10/04/18 16:24, Mark Rutland wrote:
> > On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
> >> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
> >>> I think we also need to update kvm->arch.vttbr before updating
> >>> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
> >>> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
> >>> old VMID).
> >>>
> >>> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
> >>> the critical section, I think that works, modulo using READ_ONCE() and
> >>> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
> >>> locklessly.
> >>
> >> Indeed, you're right. It would look something like this, then:
> >>
> >> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> >> index 2e43f9d42bd5..6cb08995e7ff 100644
> >> --- a/virt/kvm/arm/arm.c
> >> +++ b/virt/kvm/arm/arm.c
> >> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
> >>   */
> >>  static bool need_new_vmid_gen(struct kvm *kvm)
> >>  {
> >> -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
> >> +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
> >> +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
> >> +	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
> >>  }
> >>
> >>  /**
> >> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
> >>  		kvm_call_hyp(__kvm_flush_vm_context);
> >>  	}
> >>
> >> -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> >>  	kvm->arch.vmid = kvm_next_vmid;
> >>  	kvm_next_vmid++;
> >>  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
> >> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
> >>  	pgd_phys = virt_to_phys(kvm->arch.pgd);
> >>  	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
> >>  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> >> -	kvm->arch.vttbr = pgd_phys | vmid;
> >> +	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
> >> +
> >> +	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
> >> +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> >>
> >>  	spin_unlock(&kvm_vmid_lock);
> >>  }
> >
> > I think that's right, yes.
> >
> > We could replace the smp_{r,w}mb() barriers with an acquire of the
> > kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
> > trying to optimize things there are larger algorithmic changes necessary
> > anyhow.
> >
> >> It's probably easier to convince ourselves about the correctness of
> >> Marc's code using a rwlock instead, though. Thoughts?
> >
> > I believe that Marc's preference was the rwlock; I have no preference
> > either way.
>
> I don't mind either way. If you can be bothered to write a proper commit
> log for this, I'll take it.

You've already done the work, and your patch is easier to read, so let's
just go ahead with that.

I was just curious to which degree my original implementation was
broken; was I trying to achieve something impossible or was I just
writing buggy code. Seems the latter. Oh well.

> What I'd really want is Shannon to indicate
> whether or not this solves the issue he was seeing.
>

Agreed, would like to see that too.

Thanks (and sorry for being noisy),
-Christoffer
Re: [PATCH] KVM: arm/arm64: Close VMID generation race
On 10/04/18 16:24, Mark Rutland wrote:
> On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
>> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
>>> I think we also need to update kvm->arch.vttbr before updating
>>> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
>>> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
>>> old VMID).
>>>
>>> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
>>> the critical section, I think that works, modulo using READ_ONCE() and
>>> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
>>> locklessly.
>>
>> Indeed, you're right. It would look something like this, then:
>>
>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>> index 2e43f9d42bd5..6cb08995e7ff 100644
>> --- a/virt/kvm/arm/arm.c
>> +++ b/virt/kvm/arm/arm.c
>> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
>>   */
>>  static bool need_new_vmid_gen(struct kvm *kvm)
>>  {
>> -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
>> +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
>> +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
>> +	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
>>  }
>>
>>  /**
>> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
>>  		kvm_call_hyp(__kvm_flush_vm_context);
>>  	}
>>
>> -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>  	kvm->arch.vmid = kvm_next_vmid;
>>  	kvm_next_vmid++;
>>  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
>> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
>>  	pgd_phys = virt_to_phys(kvm->arch.pgd);
>>  	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
>>  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
>> -	kvm->arch.vttbr = pgd_phys | vmid;
>> +	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
>> +
>> +	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
>> +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>
>>  	spin_unlock(&kvm_vmid_lock);
>>  }
>
> I think that's right, yes.
>
> We could replace the smp_{r,w}mb() barriers with an acquire of the
> kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
> trying to optimize things there are larger algorithmic changes necessary
> anyhow.
>
>> It's probably easier to convince ourselves about the correctness of
>> Marc's code using a rwlock instead, though. Thoughts?
>
> I believe that Marc's preference was the rwlock; I have no preference
> either way.

I don't mind either way. If you can be bothered to write a proper commit
log for this, I'll take it. What I'd really want is Shannon to indicate
whether or not this solves the issue he was seeing.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
Re: [PATCH] KVM: arm/arm64: Close VMID generation race
On Tue, Apr 10, 2018 at 04:24:20PM +0100, Mark Rutland wrote:
> On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
> > On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
> > > I think we also need to update kvm->arch.vttbr before updating
> > > kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
> > > vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
> > > old VMID).
> > >
> > > With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
> > > the critical section, I think that works, modulo using READ_ONCE() and
> > > WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
> > > locklessly.
> >
> > Indeed, you're right. It would look something like this, then:
> >
> > diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> > index 2e43f9d42bd5..6cb08995e7ff 100644
> > --- a/virt/kvm/arm/arm.c
> > +++ b/virt/kvm/arm/arm.c
> > @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
> >   */
> >  static bool need_new_vmid_gen(struct kvm *kvm)
> >  {
> > -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
> > +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
> > +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
> > +	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
> >  }
> >
> >  /**
> > @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
> >  		kvm_call_hyp(__kvm_flush_vm_context);
> >  	}
> >
> > -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> >  	kvm->arch.vmid = kvm_next_vmid;
> >  	kvm_next_vmid++;
> >  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
> > @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
> >  	pgd_phys = virt_to_phys(kvm->arch.pgd);
> >  	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
> >  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> > -	kvm->arch.vttbr = pgd_phys | vmid;
> > +	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
> > +
> > +	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
> > +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> >
> >  	spin_unlock(&kvm_vmid_lock);
> >  }
>
> I think that's right, yes.
>
> We could replace the smp_{r,w}mb() barriers with an acquire of the
> kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
> trying to optimize things there are larger algorithmic changes necessary
> anyhow.
>
> > It's probably easier to convince ourselves about the correctness of
> > Marc's code using a rwlock instead, though. Thoughts?
>
> I believe that Marc's preference was the rwlock; I have no preference
> either way.
>

I'm fine with both approaches as well, but it was educational for me to
see if this could be done in the lockless way as well. Thanks for having
a look at that!

-Christoffer
Re: [PATCH] KVM: arm/arm64: Close VMID generation race
On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
> > I think we also need to update kvm->arch.vttbr before updating
> > kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
> > vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
> > old VMID).
> >
> > With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
> > the critical section, I think that works, modulo using READ_ONCE() and
> > WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
> > locklessly.
>
> Indeed, you're right. It would look something like this, then:
>
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index 2e43f9d42bd5..6cb08995e7ff 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
>   */
>  static bool need_new_vmid_gen(struct kvm *kvm)
>  {
> -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
> +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
> +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
> +	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
>  }
>
>  /**
> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
>  		kvm_call_hyp(__kvm_flush_vm_context);
>  	}
>
> -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>  	kvm->arch.vmid = kvm_next_vmid;
>  	kvm_next_vmid++;
>  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
>  	pgd_phys = virt_to_phys(kvm->arch.pgd);
>  	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
>  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> -	kvm->arch.vttbr = pgd_phys | vmid;
> +	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
> +
> +	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
> +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>
>  	spin_unlock(&kvm_vmid_lock);
>  }

I think that's right, yes.

We could replace the smp_{r,w}mb() barriers with an acquire of the
kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
trying to optimize things there are larger algorithmic changes necessary
anyhow.

> It's probably easier to convince ourselves about the correctness of
> Marc's code using a rwlock instead, though. Thoughts?

I believe that Marc's preference was the rwlock; I have no preference
either way.

Thanks,
Mark.
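Editorial sketch, not part of the thread: the acquire/release alternative mentioned above can be illustrated in user space with C11 atomics. This is only an analogy under stated assumptions, not kernel code. The names `global_vmid_gen`, `struct vm_state`, `publish_vttbr()` and `read_current_vttbr()` are hypothetical, and the acquire is placed on the per-VM generation load because that is the load that pairs with the release store in the C11 model, whereas the thread's wording puts it on kvm_vmid_gen.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for kvm_vmid_gen; starts at 1 as in the patch. */
static _Atomic uint64_t global_vmid_gen = 1;

/* Hypothetical stand-in for the kvm->arch fields that are read locklessly. */
struct vm_state {
	_Atomic uint64_t vttbr;    /* payload, always written before vmid_gen */
	_Atomic uint64_t vmid_gen; /* 0 means "never assigned" */
};

/* Writer side; the caller is assumed to hold the VMID lock. */
static void publish_vttbr(struct vm_state *vm, uint64_t vttbr)
{
	uint64_t gen = atomic_load_explicit(&global_vmid_gen,
					    memory_order_relaxed);

	assert(gen != 0); /* generation 0 is reserved for "never assigned" */
	atomic_store_explicit(&vm->vttbr, vttbr, memory_order_relaxed);
	/* Release: whoever observes this generation also observes vttbr. */
	atomic_store_explicit(&vm->vmid_gen, gen, memory_order_release);
}

/* Lockless reader; fills *vttbr and returns true only if the VM is current. */
static bool read_current_vttbr(struct vm_state *vm, uint64_t *vttbr)
{
	uint64_t current = atomic_load_explicit(&global_vmid_gen,
						memory_order_acquire);
	uint64_t mine = atomic_load_explicit(&vm->vmid_gen,
					     memory_order_acquire);

	if (mine != current)
		return false; /* stale: the caller must take the slow path */
	*vttbr = atomic_load_explicit(&vm->vttbr, memory_order_relaxed);
	return true;
}
```

The point of the ordering is the same as in the quoted diff: the generation is published last, so a reader that sees an up-to-date generation cannot pick up a VTTBR from before the update.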
Re: [PATCH] KVM: arm/arm64: Close VMID generation race
On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
> On Mon, Apr 09, 2018 at 10:51:39PM +0200, Christoffer Dall wrote:
> > On Mon, Apr 09, 2018 at 06:07:06PM +0100, Marc Zyngier wrote:
> > > Before entering the guest, we check whether our VMID is still
> > > part of the current generation. In order to avoid taking a lock,
> > > we start with checking that the generation is still current, and
> > > only if not current do we take the lock, recheck, and update the
> > > generation and VMID.
> > >
> > > This leaves open a small race: A vcpu can bump up the global
> > > generation number as well as the VM's, but has not updated
> > > the VMID itself yet.
> > >
> > > At that point another vcpu from the same VM comes in, checks
> > > the generation (and finds it not needing anything), and jumps
> > > into the guest. At this point, we end up with two vcpus belonging
> > > to the same VM running with two different VMIDs. Eventually, the
> > > VMID used by the second vcpu will get reassigned, and things will
> > > really go wrong...
> > >
> > > A simple solution would be to drop this initial check, and always take
> > > the lock. This is likely to cause performance issues. A middle ground
> > > is to convert the spinlock to a rwlock, and only take the read lock
> > > on the fast path. If the check fails at that point, drop it and
> > > acquire the write lock, rechecking the condition.
> > >
> > > This ensures that the above scenario doesn't occur.
> > >
> > > Reported-by: Mark Rutland
> > > Signed-off-by: Marc Zyngier
> > > ---
> > > I haven't seen any reply from Shannon, so reposting this to
> > > a slightly wider audience for feedback.
> > >
> > >  virt/kvm/arm/arm.c | 15 ++++++++++-----
> > >  1 file changed, 10 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> > > index dba629c5f8ac..a4c1b76240df 100644
> > > --- a/virt/kvm/arm/arm.c
> > > +++ b/virt/kvm/arm/arm.c
> > > @@ -63,7 +63,7 @@ static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_arm_running_vcpu);
> > >  static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
> > >  static u32 kvm_next_vmid;
> > >  static unsigned int kvm_vmid_bits __read_mostly;
> > > -static DEFINE_SPINLOCK(kvm_vmid_lock);
> > > +static DEFINE_RWLOCK(kvm_vmid_lock);
> > >
> > >  static bool vgic_present;
> > >
> > > @@ -473,11 +473,16 @@ static void update_vttbr(struct kvm *kvm)
> > >  {
> > >  	phys_addr_t pgd_phys;
> > >  	u64 vmid;
> > > +	bool new_gen;
> > >
> > > -	if (!need_new_vmid_gen(kvm))
> > > +	read_lock(&kvm_vmid_lock);
> > > +	new_gen = need_new_vmid_gen(kvm);
> > > +	read_unlock(&kvm_vmid_lock);
> > > +
> > > +	if (!new_gen)
> > >  		return;
> > >
> > > -	spin_lock(&kvm_vmid_lock);
> > > +	write_lock(&kvm_vmid_lock);
> > >
> > >  	/*
> > >  	 * We need to re-check the vmid_gen here to ensure that if another vcpu
> > > @@ -485,7 +490,7 @@ static void update_vttbr(struct kvm *kvm)
> > >  	 * use the same vmid.
> > >  	 */
> > >  	if (!need_new_vmid_gen(kvm)) {
> > > -		spin_unlock(&kvm_vmid_lock);
> > > +		write_unlock(&kvm_vmid_lock);
> > >  		return;
> > >  	}
> > >
> > > @@ -519,7 +524,7 @@ static void update_vttbr(struct kvm *kvm)
> > >  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> > >  	kvm->arch.vttbr = kvm_phys_to_vttbr(pgd_phys) | vmid;
> > >
> > > -	spin_unlock(&kvm_vmid_lock);
> > > +	write_unlock(&kvm_vmid_lock);
> > >  }
> > >
> > >  static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
> > > --
> > > 2.14.2
> > >
> >
> > The above looks correct to me. I am wondering if something like the
> > following would also work, which may be slightly more efficient,
> > although I doubt the difference can be measured:
> >
> > [...]
>
> I think we also need to update kvm->arch.vttbr before updating
> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
> old VMID).
>
> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
> the critical section, I think that works, modulo using READ_ONCE() and
> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
> locklessly.

Indeed, you're right. It would look something like this, then:

diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 2e43f9d42bd5..6cb08995e7ff 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
  */
 static bool need_new_vmid_gen(struct kvm *kvm)
 {
-	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
+	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
+	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
+	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
 }

 /**
@@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
Re: [PATCH] KVM: arm/arm64: Close VMID generation race
On Mon, Apr 09, 2018 at 10:51:39PM +0200, Christoffer Dall wrote:
> On Mon, Apr 09, 2018 at 06:07:06PM +0100, Marc Zyngier wrote:
> > Before entering the guest, we check whether our VMID is still
> > part of the current generation. In order to avoid taking a lock,
> > we start with checking that the generation is still current, and
> > only if not current do we take the lock, recheck, and update the
> > generation and VMID.
> >
> > This leaves open a small race: A vcpu can bump up the global
> > generation number as well as the VM's, but has not updated
> > the VMID itself yet.
> >
> > At that point another vcpu from the same VM comes in, checks
> > the generation (and finds it not needing anything), and jumps
> > into the guest. At this point, we end up with two vcpus belonging
> > to the same VM running with two different VMIDs. Eventually, the
> > VMID used by the second vcpu will get reassigned, and things will
> > really go wrong...
> >
> > A simple solution would be to drop this initial check, and always take
> > the lock. This is likely to cause performance issues. A middle ground
> > is to convert the spinlock to a rwlock, and only take the read lock
> > on the fast path. If the check fails at that point, drop it and
> > acquire the write lock, rechecking the condition.
> >
> > This ensures that the above scenario doesn't occur.
> >
> > Reported-by: Mark Rutland
> > Signed-off-by: Marc Zyngier
> > ---
> > I haven't seen any reply from Shannon, so reposting this to
> > a slightly wider audience for feedback.
> >
> >  virt/kvm/arm/arm.c | 15 ++++++++++-----
> >  1 file changed, 10 insertions(+), 5 deletions(-)
> >
> > diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> > index dba629c5f8ac..a4c1b76240df 100644
> > --- a/virt/kvm/arm/arm.c
> > +++ b/virt/kvm/arm/arm.c
> > @@ -63,7 +63,7 @@ static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_arm_running_vcpu);
> >  static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
> >  static u32 kvm_next_vmid;
> >  static unsigned int kvm_vmid_bits __read_mostly;
> > -static DEFINE_SPINLOCK(kvm_vmid_lock);
> > +static DEFINE_RWLOCK(kvm_vmid_lock);
> >
> >  static bool vgic_present;
> >
> > @@ -473,11 +473,16 @@ static void update_vttbr(struct kvm *kvm)
> >  {
> >  	phys_addr_t pgd_phys;
> >  	u64 vmid;
> > +	bool new_gen;
> >
> > -	if (!need_new_vmid_gen(kvm))
> > +	read_lock(&kvm_vmid_lock);
> > +	new_gen = need_new_vmid_gen(kvm);
> > +	read_unlock(&kvm_vmid_lock);
> > +
> > +	if (!new_gen)
> >  		return;
> >
> > -	spin_lock(&kvm_vmid_lock);
> > +	write_lock(&kvm_vmid_lock);
> >
> >  	/*
> >  	 * We need to re-check the vmid_gen here to ensure that if another vcpu
> > @@ -485,7 +490,7 @@ static void update_vttbr(struct kvm *kvm)
> >  	 * use the same vmid.
> >  	 */
> >  	if (!need_new_vmid_gen(kvm)) {
> > -		spin_unlock(&kvm_vmid_lock);
> > +		write_unlock(&kvm_vmid_lock);
> >  		return;
> >  	}
> >
> > @@ -519,7 +524,7 @@ static void update_vttbr(struct kvm *kvm)
> >  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> >  	kvm->arch.vttbr = kvm_phys_to_vttbr(pgd_phys) | vmid;
> >
> > -	spin_unlock(&kvm_vmid_lock);
> > +	write_unlock(&kvm_vmid_lock);
> >  }
> >
> >  static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
> > --
> > 2.14.2
> >
>
> The above looks correct to me. I am wondering if something like the
> following would also work, which may be slightly more efficient,
> although I doubt the difference can be measured:
>
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index dba629c5f8ac..7ac869bcad21 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -458,7 +458,9 @@ void force_vm_exit(const cpumask_t *mask)
>   */
>  static bool need_new_vmid_gen(struct kvm *kvm)
>  {
> -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
> +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
> +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
> +	return unlikely(kvm->arch.vmid_gen != current_vmid_gen);
>  }
>
>  /**
> @@ -508,10 +510,11 @@ static void update_vttbr(struct kvm *kvm)
>  		kvm_call_hyp(__kvm_flush_vm_context);
>  	}
>
> -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>  	kvm->arch.vmid = kvm_next_vmid;
>  	kvm_next_vmid++;
>  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
> +	smp_wmb();
> +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>
>  	/* update vttbr to be used with the new vmid */
>  	pgd_phys = virt_to_phys(kvm->arch.pgd);
>

I think we also need to update kvm->arch.vttbr before updating
kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
old VMID).

With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
the critical section, I think that works, modulo using
Re: [PATCH] KVM: arm/arm64: Close VMID generation race
On Mon, Apr 09, 2018 at 06:07:06PM +0100, Marc Zyngier wrote:
> Before entering the guest, we check whether our VMID is still
> part of the current generation. In order to avoid taking a lock,
> we start with checking that the generation is still current, and
> only if not current do we take the lock, recheck, and update the
> generation and VMID.
>
> This leaves open a small race: A vcpu can bump up the global
> generation number as well as the VM's, but has not updated
> the VMID itself yet.
>
> At that point another vcpu from the same VM comes in, checks
> the generation (and finds it not needing anything), and jumps
> into the guest. At this point, we end up with two vcpus belonging
> to the same VM running with two different VMIDs. Eventually, the
> VMID used by the second vcpu will get reassigned, and things will
> really go wrong...
>
> A simple solution would be to drop this initial check, and always take
> the lock. This is likely to cause performance issues. A middle ground
> is to convert the spinlock to a rwlock, and only take the read lock
> on the fast path. If the check fails at that point, drop it and
> acquire the write lock, rechecking the condition.
>
> This ensures that the above scenario doesn't occur.
>
> Reported-by: Mark Rutland
> Signed-off-by: Marc Zyngier
> ---
> I haven't seen any reply from Shannon, so reposting this to
> a slightly wider audience for feedback.
>
>  virt/kvm/arm/arm.c | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index dba629c5f8ac..a4c1b76240df 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -63,7 +63,7 @@ static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_arm_running_vcpu);
>  static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
>  static u32 kvm_next_vmid;
>  static unsigned int kvm_vmid_bits __read_mostly;
> -static DEFINE_SPINLOCK(kvm_vmid_lock);
> +static DEFINE_RWLOCK(kvm_vmid_lock);
>
>  static bool vgic_present;
>
> @@ -473,11 +473,16 @@ static void update_vttbr(struct kvm *kvm)
>  {
>  	phys_addr_t pgd_phys;
>  	u64 vmid;
> +	bool new_gen;
>
> -	if (!need_new_vmid_gen(kvm))
> +	read_lock(&kvm_vmid_lock);
> +	new_gen = need_new_vmid_gen(kvm);
> +	read_unlock(&kvm_vmid_lock);
> +
> +	if (!new_gen)
>  		return;
>
> -	spin_lock(&kvm_vmid_lock);
> +	write_lock(&kvm_vmid_lock);
>
>  	/*
>  	 * We need to re-check the vmid_gen here to ensure that if another vcpu
> @@ -485,7 +490,7 @@ static void update_vttbr(struct kvm *kvm)
>  	 * use the same vmid.
>  	 */
>  	if (!need_new_vmid_gen(kvm)) {
> -		spin_unlock(&kvm_vmid_lock);
> +		write_unlock(&kvm_vmid_lock);
>  		return;
>  	}
>
> @@ -519,7 +524,7 @@ static void update_vttbr(struct kvm *kvm)
>  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
>  	kvm->arch.vttbr = kvm_phys_to_vttbr(pgd_phys) | vmid;
>
> -	spin_unlock(&kvm_vmid_lock);
> +	write_unlock(&kvm_vmid_lock);
>  }
>
>  static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
> --
> 2.14.2
>

The above looks correct to me. I am wondering if something like the
following would also work, which may be slightly more efficient,
although I doubt the difference can be measured:

diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index dba629c5f8ac..7ac869bcad21 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -458,7 +458,9 @@ void force_vm_exit(const cpumask_t *mask)
  */
 static bool need_new_vmid_gen(struct kvm *kvm)
 {
-	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
+	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
+	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
+	return unlikely(kvm->arch.vmid_gen != current_vmid_gen);
 }

 /**
@@ -508,10 +510,11 @@ static void update_vttbr(struct kvm *kvm)
 		kvm_call_hyp(__kvm_flush_vm_context);
 	}

-	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
 	kvm->arch.vmid = kvm_next_vmid;
 	kvm_next_vmid++;
 	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
+	smp_wmb();
+	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);

 	/* update vttbr to be used with the new vmid */
 	pgd_phys = virt_to_phys(kvm->arch.pgd);

Thanks,
-Christoffer
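Editorial sketch, not part of the thread: the VMID arithmetic that appears as context in these diffs (the `kvm_next_vmid` wraparound and the packing of the VMID into VTTBR) can be worked through in plain C. Everything here is an assumption made for illustration: the helper names are hypothetical, the shift of 48 is assumed to match the arm64 definition of VTTBR_VMID_SHIFT, and `vmid_bits` is taken to be 8 or 16.

```c
#include <assert.h>
#include <stdint.h>

#define VTTBR_VMID_SHIFT 48 /* assumed value, see lead-in */

/* Mask covering the VMID field for a given VMID width (hypothetical helper). */
static uint64_t vttbr_vmid_mask(unsigned int vmid_bits)
{
	return ((UINT64_C(1) << vmid_bits) - 1) << VTTBR_VMID_SHIFT;
}

/* Mirrors "kvm_next_vmid++; kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;". */
static uint32_t advance_vmid(uint32_t next_vmid, unsigned int vmid_bits)
{
	assert(vmid_bits == 8 || vmid_bits == 16);
	next_vmid++;
	next_vmid &= (UINT32_C(1) << vmid_bits) - 1;
	return next_vmid; /* 0 signals that the current generation is exhausted */
}

/* Mirrors the "vmid = ...; vttbr = pgd_phys | vmid;" lines of update_vttbr(). */
static uint64_t make_vttbr(uint64_t pgd_phys, uint32_t vmid,
			   unsigned int vmid_bits)
{
	uint64_t field = ((uint64_t)vmid << VTTBR_VMID_SHIFT) &
			 vttbr_vmid_mask(vmid_bits);

	/* The base address must not overlap the VMID field. */
	assert((pgd_phys & vttbr_vmid_mask(vmid_bits)) == 0);
	return pgd_phys | field;
}
```

With 8-bit VMIDs, advancing from 255 wraps to 0, which is the value the slow path treats as "start a new generation". Packing VMID 5 with a page-table base of 0x80000000 yields 0x0005000080000000.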
[PATCH] KVM: arm/arm64: Close VMID generation race
Before entering the guest, we check whether our VMID is still
part of the current generation. In order to avoid taking a lock,
we start with checking that the generation is still current, and
only if not current do we take the lock, recheck, and update the
generation and VMID.

This leaves open a small race: A vcpu can bump up the global
generation number as well as the VM's, but has not updated
the VMID itself yet.

At that point another vcpu from the same VM comes in, checks
the generation (and finds it not needing anything), and jumps
into the guest. At this point, we end up with two vcpus belonging
to the same VM running with two different VMIDs. Eventually, the
VMID used by the second vcpu will get reassigned, and things will
really go wrong...

A simple solution would be to drop this initial check, and always take
the lock. This is likely to cause performance issues. A middle ground
is to convert the spinlock to a rwlock, and only take the read lock
on the fast path. If the check fails at that point, drop it and
acquire the write lock, rechecking the condition.

This ensures that the above scenario doesn't occur.

Reported-by: Mark Rutland
Signed-off-by: Marc Zyngier
---
I haven't seen any reply from Shannon, so reposting this to
a slightly wider audience for feedback.

 virt/kvm/arm/arm.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index dba629c5f8ac..a4c1b76240df 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -63,7 +63,7 @@ static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_arm_running_vcpu);
 static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
 static u32 kvm_next_vmid;
 static unsigned int kvm_vmid_bits __read_mostly;
-static DEFINE_SPINLOCK(kvm_vmid_lock);
+static DEFINE_RWLOCK(kvm_vmid_lock);

 static bool vgic_present;

@@ -473,11 +473,16 @@ static void update_vttbr(struct kvm *kvm)
 {
 	phys_addr_t pgd_phys;
 	u64 vmid;
+	bool new_gen;

-	if (!need_new_vmid_gen(kvm))
+	read_lock(&kvm_vmid_lock);
+	new_gen = need_new_vmid_gen(kvm);
+	read_unlock(&kvm_vmid_lock);
+
+	if (!new_gen)
 		return;

-	spin_lock(&kvm_vmid_lock);
+	write_lock(&kvm_vmid_lock);

 	/*
 	 * We need to re-check the vmid_gen here to ensure that if another vcpu
@@ -485,7 +490,7 @@ static void update_vttbr(struct kvm *kvm)
 	 * use the same vmid.
 	 */
 	if (!need_new_vmid_gen(kvm)) {
-		spin_unlock(&kvm_vmid_lock);
+		write_unlock(&kvm_vmid_lock);
 		return;
 	}

@@ -519,7 +524,7 @@ static void update_vttbr(struct kvm *kvm)
 	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
 	kvm->arch.vttbr = kvm_phys_to_vttbr(pgd_phys) | vmid;

-	spin_unlock(&kvm_vmid_lock);
+	write_unlock(&kvm_vmid_lock);
 }

 static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
-- 
2.14.2
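Editorial sketch, not part of the patch: the race described in the commit message can be replayed deterministically in a single thread by executing the two vcpus' steps in the problematic order. This is a simplified user-space model under stated assumptions. The struct names, the helper split into two slow-path steps, and the concrete numbers (generation 2, VMIDs 7 and 3) are all hypothetical and chosen only to make the interleaving visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of the state named in the patch. */
struct allocator { uint64_t gen; uint32_t next_vmid; };
struct vm { uint64_t vmid_gen; uint32_t vmid; };
struct outcome { uint32_t vcpu_a_vmid; uint32_t vcpu_b_vmid; };

/* The lockless fast-path check. */
static bool need_new_vmid_gen(const struct allocator *a, const struct vm *vm)
{
	return vm->vmid_gen != a->gen;
}

/* Slow path, first half: the VM's generation is updated first (the bug). */
static void slow_path_update_gen(const struct allocator *a, struct vm *vm)
{
	vm->vmid_gen = a->gen;
}

/* Slow path, second half: only now is the new VMID handed out. */
static void slow_path_assign_vmid(struct allocator *a, struct vm *vm)
{
	assert(vm->vmid_gen == a->gen);
	vm->vmid = a->next_vmid++;
}

/* Replays: vcpu A half-way through the slow path, vcpu B on the fast path. */
static struct outcome replay_race(void)
{
	struct allocator a = { .gen = 2, .next_vmid = 3 };
	struct vm vm = { .vmid_gen = 1, .vmid = 7 }; /* stale generation */
	struct outcome o = { 0, 0 };

	assert(need_new_vmid_gen(&a, &vm)); /* vcpu A enters the slow path */
	slow_path_update_gen(&a, &vm);      /* vcpu A: generation updated */

	if (!need_new_vmid_gen(&a, &vm))    /* vcpu B: check wrongly passes */
		o.vcpu_b_vmid = vm.vmid;    /* vcpu B runs with the old VMID */

	slow_path_assign_vmid(&a, &vm);     /* vcpu A: VMID updated too late */
	o.vcpu_a_vmid = vm.vmid;
	return o;
}
```

In this model vcpu B ends up running with VMID 7 while vcpu A runs with VMID 3, which is the "two vcpus of one VM with two different VMIDs" situation the commit message describes. Holding the write lock across both halves of the slow path, with the fast-path check under the read lock, rules out the middle step.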