Re: [PATCH] KVM: arm/arm64: Close VMID generation race
On 16/04/18 11:05, Shannon Zhao wrote:
>
>
> On 2018/4/11 9:30, Shannon Zhao wrote:
>>
>> On 2018/4/10 23:37, Marc Zyngier wrote:

[...]

>>> I don't mind either way. If you can be bothered to write a proper commit
>>> log for this, I'll take it. What I'd really want is Shannon to indicate
>>> whether or not this solves the issue he was seeing.
>> I'll test Marc's patch. This will take about 3 days since it's not 100%
>> reproducible.
> Hi Marc,
>
> I've run the test for about 4 days. The issue doesn't appear.
> So Tested-by: Shannon Zhao

Thanks Shannon, much appreciated. I'll send the fix upstream towards the
end of the week.

Cheers,

	M.
-- 
Jazz is not dead. It just smells funny...
Re: [PATCH] KVM: arm/arm64: Close VMID generation race
On 2018/4/11 9:30, Shannon Zhao wrote:
>
> On 2018/4/10 23:37, Marc Zyngier wrote:
>> On 10/04/18 16:24, Mark Rutland wrote:
>>> On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
>>>> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
>>>>> I think we also need to update kvm->arch.vttbr before updating
>>>>> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
>>>>> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
>>>>> old VMID).
>>>>>
>>>>> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
>>>>> the critical section, I think that works, modulo using READ_ONCE() and
>>>>> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
>>>>> locklessly.
>>>>
>>>> Indeed, you're right. It would look something like this, then:
>>>>
>>>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>>>> index 2e43f9d42bd5..6cb08995e7ff 100644
>>>> --- a/virt/kvm/arm/arm.c
>>>> +++ b/virt/kvm/arm/arm.c
>>>> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
>>>>   */
>>>>  static bool need_new_vmid_gen(struct kvm *kvm)
>>>>  {
>>>> -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
>>>> +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
>>>> +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
>>>> +	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
>>>>  }
>>>>
>>>>  /**
>>>> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
>>>>  		kvm_call_hyp(__kvm_flush_vm_context);
>>>>  	}
>>>>
>>>> -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>>>  	kvm->arch.vmid = kvm_next_vmid;
>>>>  	kvm_next_vmid++;
>>>>  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
>>>> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
>>>>  	pgd_phys = virt_to_phys(kvm->arch.pgd);
>>>>  	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
>>>>  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
>>>> -	kvm->arch.vttbr = pgd_phys | vmid;
>>>> +	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
>>>> +
>>>> +	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
>>>> +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>>>
>>>>  	spin_unlock(&kvm_vmid_lock);
>>>>  }
>>>
>>> I think that's right, yes.
>>>
>>> We could replace the smp_{r,w}mb() barriers with an acquire of the
>>> kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
>>> trying to optimize things there are larger algorithmic changes necessary
>>> anyhow.
>>>
>>>> It's probably easier to convince ourselves about the correctness of
>>>> Marc's code using a rwlock instead, though. Thoughts?
>>>
>>> I believe that Marc's preference was the rwlock; I have no preference
>>> either way.
>>
>> I don't mind either way. If you can be bothered to write a proper commit
>> log for this, I'll take it. What I'd really want is Shannon to indicate
>> whether or not this solves the issue he was seeing.
>>
> I'll test Marc's patch. This will take about 3 days since it's not 100%
> reproducible.

Hi Marc,

I've run the test for about 4 days. The issue doesn't appear.
So Tested-by: Shannon Zhao

Thanks,
-- 
Shannon
Re: [PATCH] KVM: arm/arm64: Close VMID generation race
On 2018/4/10 23:37, Marc Zyngier wrote:
> On 10/04/18 16:24, Mark Rutland wrote:
>> On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
>>> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
>>>> I think we also need to update kvm->arch.vttbr before updating
>>>> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
>>>> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
>>>> old VMID).
>>>>
>>>> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
>>>> the critical section, I think that works, modulo using READ_ONCE() and
>>>> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
>>>> locklessly.
>>>
>>> Indeed, you're right. It would look something like this, then:
>>>
>>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>>> index 2e43f9d42bd5..6cb08995e7ff 100644
>>> --- a/virt/kvm/arm/arm.c
>>> +++ b/virt/kvm/arm/arm.c
>>> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
>>>   */
>>>  static bool need_new_vmid_gen(struct kvm *kvm)
>>>  {
>>> -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
>>> +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
>>> +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
>>> +	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
>>>  }
>>>
>>>  /**
>>> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
>>>  		kvm_call_hyp(__kvm_flush_vm_context);
>>>  	}
>>>
>>> -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>>  	kvm->arch.vmid = kvm_next_vmid;
>>>  	kvm_next_vmid++;
>>>  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
>>> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
>>>  	pgd_phys = virt_to_phys(kvm->arch.pgd);
>>>  	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
>>>  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
>>> -	kvm->arch.vttbr = pgd_phys | vmid;
>>> +	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
>>> +
>>> +	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
>>> +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>>
>>>  	spin_unlock(&kvm_vmid_lock);
>>>  }
>>
>> I think that's right, yes.
>>
>> We could replace the smp_{r,w}mb() barriers with an acquire of the
>> kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
>> trying to optimize things there are larger algorithmic changes necessary
>> anyhow.
>>
>>> It's probably easier to convince ourselves about the correctness of
>>> Marc's code using a rwlock instead, though. Thoughts?
>>
>> I believe that Marc's preference was the rwlock; I have no preference
>> either way.
>
> I don't mind either way. If you can be bothered to write a proper commit
> log for this, I'll take it. What I'd really want is Shannon to indicate
> whether or not this solves the issue he was seeing.
>
I'll test Marc's patch. This will take about 3 days since it's not 100%
reproducible.

Thanks,
-- 
Shannon
Re: [PATCH] KVM: arm/arm64: Close VMID generation race
On Tue, Apr 10, 2018 at 04:37:12PM +0100, Marc Zyngier wrote:
> On 10/04/18 16:24, Mark Rutland wrote:
> > On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
> >> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
> >>> I think we also need to update kvm->arch.vttbr before updating
> >>> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
> >>> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
> >>> old VMID).
> >>>
> >>> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
> >>> the critical section, I think that works, modulo using READ_ONCE() and
> >>> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
> >>> locklessly.
> >>
> >> Indeed, you're right. It would look something like this, then:
> >>
> >> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> >> index 2e43f9d42bd5..6cb08995e7ff 100644
> >> --- a/virt/kvm/arm/arm.c
> >> +++ b/virt/kvm/arm/arm.c
> >> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
> >>   */
> >>  static bool need_new_vmid_gen(struct kvm *kvm)
> >>  {
> >> -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
> >> +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
> >> +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
> >> +	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
> >>  }
> >>
> >>  /**
> >> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
> >>  		kvm_call_hyp(__kvm_flush_vm_context);
> >>  	}
> >>
> >> -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> >>  	kvm->arch.vmid = kvm_next_vmid;
> >>  	kvm_next_vmid++;
> >>  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
> >> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
> >>  	pgd_phys = virt_to_phys(kvm->arch.pgd);
> >>  	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
> >>  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> >> -	kvm->arch.vttbr = pgd_phys | vmid;
> >> +	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
> >> +
> >> +	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
> >> +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> >>
> >>  	spin_unlock(&kvm_vmid_lock);
> >>  }
> >
> > I think that's right, yes.
> >
> > We could replace the smp_{r,w}mb() barriers with an acquire of the
> > kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
> > trying to optimize things there are larger algorithmic changes necessary
> > anyhow.
> >
> >> It's probably easier to convince ourselves about the correctness of
> >> Marc's code using a rwlock instead, though. Thoughts?
> >
> > I believe that Marc's preference was the rwlock; I have no preference
> > either way.
>
> I don't mind either way. If you can be bothered to write a proper commit
> log for this, I'll take it.

You've already done the work, and your patch is easier to read, so let's
just go ahead with that.

I was just curious about the degree to which my original implementation
was broken: was I trying to achieve something impossible, or was I just
writing buggy code? Seems the latter. Oh well.

> What I'd really want is Shannon to indicate
> whether or not this solves the issue he was seeing.
>

Agreed, would like to see that too.

Thanks (and sorry for being noisy),
-Christoffer
Re: [PATCH] KVM: arm/arm64: Close VMID generation race
On 10/04/18 16:24, Mark Rutland wrote:
> On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
>> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
>>> I think we also need to update kvm->arch.vttbr before updating
>>> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
>>> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
>>> old VMID).
>>>
>>> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
>>> the critical section, I think that works, modulo using READ_ONCE() and
>>> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
>>> locklessly.
>>
>> Indeed, you're right. It would look something like this, then:
>>
>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>> index 2e43f9d42bd5..6cb08995e7ff 100644
>> --- a/virt/kvm/arm/arm.c
>> +++ b/virt/kvm/arm/arm.c
>> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
>>   */
>>  static bool need_new_vmid_gen(struct kvm *kvm)
>>  {
>> -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
>> +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
>> +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
>> +	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
>>  }
>>
>>  /**
>> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
>>  		kvm_call_hyp(__kvm_flush_vm_context);
>>  	}
>>
>> -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>  	kvm->arch.vmid = kvm_next_vmid;
>>  	kvm_next_vmid++;
>>  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
>> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
>>  	pgd_phys = virt_to_phys(kvm->arch.pgd);
>>  	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
>>  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
>> -	kvm->arch.vttbr = pgd_phys | vmid;
>> +	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
>> +
>> +	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
>> +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>
>>  	spin_unlock(&kvm_vmid_lock);
>>  }
>
> I think that's right, yes.
>
> We could replace the smp_{r,w}mb() barriers with an acquire of the
> kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
> trying to optimize things there are larger algorithmic changes necessary
> anyhow.
>
>> It's probably easier to convince ourselves about the correctness of
>> Marc's code using a rwlock instead, though. Thoughts?
>
> I believe that Marc's preference was the rwlock; I have no preference
> either way.

I don't mind either way. If you can be bothered to write a proper commit
log for this, I'll take it. What I'd really want is Shannon to indicate
whether or not this solves the issue he was seeing.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
Re: [PATCH] KVM: arm/arm64: Close VMID generation race
On Tue, Apr 10, 2018 at 04:24:20PM +0100, Mark Rutland wrote:
> On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
> > On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
> > > I think we also need to update kvm->arch.vttbr before updating
> > > kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
> > > vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
> > > old VMID).
> > >
> > > With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
> > > the critical section, I think that works, modulo using READ_ONCE() and
> > > WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
> > > locklessly.
> >
> > Indeed, you're right. It would look something like this, then:
> >
> > diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> > index 2e43f9d42bd5..6cb08995e7ff 100644
> > --- a/virt/kvm/arm/arm.c
> > +++ b/virt/kvm/arm/arm.c
> > @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
> >   */
> >  static bool need_new_vmid_gen(struct kvm *kvm)
> >  {
> > -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
> > +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
> > +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
> > +	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
> >  }
> >
> >  /**
> > @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
> >  		kvm_call_hyp(__kvm_flush_vm_context);
> >  	}
> >
> > -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> >  	kvm->arch.vmid = kvm_next_vmid;
> >  	kvm_next_vmid++;
> >  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
> > @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
> >  	pgd_phys = virt_to_phys(kvm->arch.pgd);
> >  	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
> >  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> > -	kvm->arch.vttbr = pgd_phys | vmid;
> > +	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
> > +
> > +	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
> > +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> >
> >  	spin_unlock(&kvm_vmid_lock);
> >  }
>
> I think that's right, yes.
>
> We could replace the smp_{r,w}mb() barriers with an acquire of the
> kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
> trying to optimize things there are larger algorithmic changes necessary
> anyhow.
>
> > It's probably easier to convince ourselves about the correctness of
> > Marc's code using a rwlock instead, though. Thoughts?
>
> I believe that Marc's preference was the rwlock; I have no preference
> either way.
>

I'm fine with both approaches as well, but it was educational for me to
see whether this could be done in a lockless way. Thanks for having a
look at that!

-Christoffer
Re: [PATCH] KVM: arm/arm64: Close VMID generation race
On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
> > I think we also need to update kvm->arch.vttbr before updating
> > kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
> > vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
> > old VMID).
> >
> > With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
> > the critical section, I think that works, modulo using READ_ONCE() and
> > WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
> > locklessly.
>
> Indeed, you're right. It would look something like this, then:
>
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index 2e43f9d42bd5..6cb08995e7ff 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
>   */
>  static bool need_new_vmid_gen(struct kvm *kvm)
>  {
> -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
> +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
> +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
> +	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
>  }
>
>  /**
> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
>  		kvm_call_hyp(__kvm_flush_vm_context);
>  	}
>
> -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>  	kvm->arch.vmid = kvm_next_vmid;
>  	kvm_next_vmid++;
>  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
>  	pgd_phys = virt_to_phys(kvm->arch.pgd);
>  	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
>  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> -	kvm->arch.vttbr = pgd_phys | vmid;
> +	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
> +
> +	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
> +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>
>  	spin_unlock(&kvm_vmid_lock);
>  }

I think that's right, yes.

We could replace the smp_{r,w}mb() barriers with an acquire of the
kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
trying to optimize things there are larger algorithmic changes necessary
anyhow.

> It's probably easier to convince ourselves about the correctness of
> Marc's code using a rwlock instead, though. Thoughts?

I believe that Marc's preference was the rwlock; I have no preference
either way.

Thanks,
Mark.
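For illustration, the acquire/release variant Mark describes would look
roughly like the sketch below. This is an editorial sketch against
Christoffer's diff, not a patch anyone posted to the thread; it assumes
the kernel's atomic64_read_acquire()/smp_store_release() helpers, and a
release store to a u64 field would need care on 32-bit arm.

static bool need_new_vmid_gen(struct kvm *kvm)
{
	/*
	 * The acquire orders this load before the subsequent load of
	 * kvm->arch.vmid_gen, replacing the explicit smp_rmb().
	 */
	u64 current_vmid_gen = atomic64_read_acquire(&kvm_vmid_gen);

	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
}

static void update_vttbr(struct kvm *kvm)
{
	phys_addr_t pgd_phys;
	u64 vmid;

	/* ... locking, generation roll-over and VMID allocation as above ... */

	pgd_phys = virt_to_phys(kvm->arch.pgd);
	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) &
	       VTTBR_VMID_MASK(kvm_vmid_bits);
	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);

	/*
	 * Publish vmid_gen last: the release guarantees the vttbr store
	 * above is visible before it, replacing the explicit smp_wmb().
	 */
	smp_store_release(&kvm->arch.vmid_gen, atomic64_read(&kvm_vmid_gen));

	spin_unlock(&kvm_vmid_lock);
}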
Re: [PATCH] KVM: arm/arm64: Close VMID generation race
On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
> On Mon, Apr 09, 2018 at 10:51:39PM +0200, Christoffer Dall wrote:
> > On Mon, Apr 09, 2018 at 06:07:06PM +0100, Marc Zyngier wrote:
> > > Before entering the guest, we check whether our VMID is still
> > > part of the current generation. In order to avoid taking a lock,
> > > we start with checking that the generation is still current, and
> > > only if not current do we take the lock, recheck, and update the
> > > generation and VMID.
> > >
> > > This leaves open a small race: A vcpu can bump up the global
> > > generation number as well as the VM's, but has not updated
> > > the VMID itself yet.
> > >
> > > At that point another vcpu from the same VM comes in, checks
> > > the generation (and finds it not needing anything), and jumps
> > > into the guest. At this point, we end-up with two vcpus belonging
> > > to the same VM running with two different VMIDs. Eventually, the
> > > VMID used by the second vcpu will get reassigned, and things will
> > > really go wrong...
> > >
> > > A simple solution would be to drop this initial check, and always take
> > > the lock. This is likely to cause performance issues. A middle ground
> > > is to convert the spinlock to a rwlock, and only take the read lock
> > > on the fast path. If the check fails at that point, drop it and
> > > acquire the write lock, rechecking the condition.
> > >
> > > This ensures that the above scenario doesn't occur.
> > >
> > > Reported-by: Mark Rutland
> > > Signed-off-by: Marc Zyngier
> > > ---
> > > I haven't seen any reply from Shannon, so reposting this to
> > > a slightly wider audience for feedback.
> > >
> > >  virt/kvm/arm/arm.c | 15 ++++++++++-----
> > >  1 file changed, 10 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> > > index dba629c5f8ac..a4c1b76240df 100644
> > > --- a/virt/kvm/arm/arm.c
> > > +++ b/virt/kvm/arm/arm.c
> > > @@ -63,7 +63,7 @@ static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_arm_running_vcpu);
> > >  static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
> > >  static u32 kvm_next_vmid;
> > >  static unsigned int kvm_vmid_bits __read_mostly;
> > > -static DEFINE_SPINLOCK(kvm_vmid_lock);
> > > +static DEFINE_RWLOCK(kvm_vmid_lock);
> > >
> > >  static bool vgic_present;
> > >
> > > @@ -473,11 +473,16 @@ static void update_vttbr(struct kvm *kvm)
> > >  {
> > >  	phys_addr_t pgd_phys;
> > >  	u64 vmid;
> > > +	bool new_gen;
> > >
> > > -	if (!need_new_vmid_gen(kvm))
> > > +	read_lock(&kvm_vmid_lock);
> > > +	new_gen = need_new_vmid_gen(kvm);
> > > +	read_unlock(&kvm_vmid_lock);
> > > +
> > > +	if (!new_gen)
> > >  		return;
> > >
> > > -	spin_lock(&kvm_vmid_lock);
> > > +	write_lock(&kvm_vmid_lock);
> > >
> > >  	/*
> > >  	 * We need to re-check the vmid_gen here to ensure that if another vcpu
> > > @@ -485,7 +490,7 @@ static void update_vttbr(struct kvm *kvm)
> > >  	 * use the same vmid.
> > >  	 */
> > >  	if (!need_new_vmid_gen(kvm)) {
> > > -		spin_unlock(&kvm_vmid_lock);
> > > +		write_unlock(&kvm_vmid_lock);
> > >  		return;
> > >  	}
> > >
> > > @@ -519,7 +524,7 @@ static void update_vttbr(struct kvm *kvm)
> > >  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> > >  	kvm->arch.vttbr = kvm_phys_to_vttbr(pgd_phys) | vmid;
> > >
> > > -	spin_unlock(&kvm_vmid_lock);
> > > +	write_unlock(&kvm_vmid_lock);
> > >  }
> > >
> > >  static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
> > > -- 
> > > 2.14.2
> > >
> >
> > The above looks correct to me. I am wondering if something like the
> > following would also work, which may be slightly more efficient,
> > although I doubt the difference can be measured:
> >
> > [...]
>
> I think we also need to update kvm->arch.vttbr before updating
> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
> old VMID).
>
> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
> the critical section, I think that works, modulo using READ_ONCE() and
> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
> locklessly.

Indeed, you're right. It would look something like this, then:

diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 2e43f9d42bd5..6cb08995e7ff 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
  */
 static bool need_new_vmid_gen(struct kvm *kvm)
 {
-	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
+	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
+	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
+	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
 }
 
 /**
@@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
 		kvm_call_hyp(__kvm_flush_vm_context);
 	}
 
-	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
 	kvm->arch.vmid = kvm_next_vmid;
 	kvm_next_vmid++;
 	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
@@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
 	pgd_phys = virt_to_phys(kvm->arch.pgd);
 	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
 	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
-	kvm->arch.vttbr = pgd_phys | vmid;
+	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
+
+	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
+	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
 
 	spin_unlock(&kvm_vmid_lock);
 }

It's probably easier to convince ourselves about the correctness of
Marc's code using a rwlock instead, though. Thoughts?
Re: [PATCH] KVM: arm/arm64: Close VMID generation race
On Mon, Apr 09, 2018 at 10:51:39PM +0200, Christoffer Dall wrote:
> On Mon, Apr 09, 2018 at 06:07:06PM +0100, Marc Zyngier wrote:
> > Before entering the guest, we check whether our VMID is still
> > part of the current generation. In order to avoid taking a lock,
> > we start with checking that the generation is still current, and
> > only if not current do we take the lock, recheck, and update the
> > generation and VMID.
> >
> > This leaves open a small race: A vcpu can bump up the global
> > generation number as well as the VM's, but has not updated
> > the VMID itself yet.
> >
> > At that point another vcpu from the same VM comes in, checks
> > the generation (and finds it not needing anything), and jumps
> > into the guest. At this point, we end-up with two vcpus belonging
> > to the same VM running with two different VMIDs. Eventually, the
> > VMID used by the second vcpu will get reassigned, and things will
> > really go wrong...
> >
> > A simple solution would be to drop this initial check, and always take
> > the lock. This is likely to cause performance issues. A middle ground
> > is to convert the spinlock to a rwlock, and only take the read lock
> > on the fast path. If the check fails at that point, drop it and
> > acquire the write lock, rechecking the condition.
> >
> > This ensures that the above scenario doesn't occur.
> >
> > Reported-by: Mark Rutland
> > Signed-off-by: Marc Zyngier
> > ---
> > I haven't seen any reply from Shannon, so reposting this to
> > a slightly wider audience for feedback.
> >
> >  virt/kvm/arm/arm.c | 15 ++++++++++-----
> >  1 file changed, 10 insertions(+), 5 deletions(-)
> >
> > diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> > index dba629c5f8ac..a4c1b76240df 100644
> > --- a/virt/kvm/arm/arm.c
> > +++ b/virt/kvm/arm/arm.c
> > @@ -63,7 +63,7 @@ static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_arm_running_vcpu);
> >  static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
> >  static u32 kvm_next_vmid;
> >  static unsigned int kvm_vmid_bits __read_mostly;
> > -static DEFINE_SPINLOCK(kvm_vmid_lock);
> > +static DEFINE_RWLOCK(kvm_vmid_lock);
> >
> >  static bool vgic_present;
> >
> > @@ -473,11 +473,16 @@ static void update_vttbr(struct kvm *kvm)
> >  {
> >  	phys_addr_t pgd_phys;
> >  	u64 vmid;
> > +	bool new_gen;
> >
> > -	if (!need_new_vmid_gen(kvm))
> > +	read_lock(&kvm_vmid_lock);
> > +	new_gen = need_new_vmid_gen(kvm);
> > +	read_unlock(&kvm_vmid_lock);
> > +
> > +	if (!new_gen)
> >  		return;
> >
> > -	spin_lock(&kvm_vmid_lock);
> > +	write_lock(&kvm_vmid_lock);
> >
> >  	/*
> >  	 * We need to re-check the vmid_gen here to ensure that if another vcpu
> > @@ -485,7 +490,7 @@ static void update_vttbr(struct kvm *kvm)
> >  	 * use the same vmid.
> >  	 */
> >  	if (!need_new_vmid_gen(kvm)) {
> > -		spin_unlock(&kvm_vmid_lock);
> > +		write_unlock(&kvm_vmid_lock);
> >  		return;
> >  	}
> >
> > @@ -519,7 +524,7 @@ static void update_vttbr(struct kvm *kvm)
> >  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> >  	kvm->arch.vttbr = kvm_phys_to_vttbr(pgd_phys) | vmid;
> >
> > -	spin_unlock(&kvm_vmid_lock);
> > +	write_unlock(&kvm_vmid_lock);
> >  }
> >
> >  static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
> > -- 
> > 2.14.2
> >
>
> The above looks correct to me. I am wondering if something like the
> following would also work, which may be slightly more efficient,
> although I doubt the difference can be measured:
>
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index dba629c5f8ac..7ac869bcad21 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -458,7 +458,9 @@ void force_vm_exit(const cpumask_t *mask)
>   */
>  static bool need_new_vmid_gen(struct kvm *kvm)
>  {
> -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
> +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
> +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
> +	return unlikely(kvm->arch.vmid_gen != current_vmid_gen);
>  }
>
>  /**
> @@ -508,10 +510,11 @@ static void update_vttbr(struct kvm *kvm)
>  		kvm_call_hyp(__kvm_flush_vm_context);
>  	}
>
> -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>  	kvm->arch.vmid = kvm_next_vmid;
>  	kvm_next_vmid++;
>  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
> +	smp_wmb();
> +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>
>  	/* update vttbr to be used with the new vmid */
>  	pgd_phys = virt_to_phys(kvm->arch.pgd);
>

I think we also need to update kvm->arch.vttbr before updating
kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
old VMID).

With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
the critical section, I think that works, modulo using READ_ONCE() and
WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
locklessly.
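To make the window Mark describes concrete: in the suggestion above,
vmid_gen is published before vttbr is written, so a second CPU can slip
in between the two stores. An illustrative interleaving (added here for
clarity, not part of the original posting):

    CPU0: update_vttbr()                 CPU1: vcpu entry
    ------------------------------       ------------------------------
    kvm->arch.vmid = kvm_next_vmid;
    smp_wmb();
    kvm->arch.vmid_gen = cur_gen;
                                         need_new_vmid_gen() -> false
                                         jumps to hyp and programs the
                                         old kvm->arch.vttbr, i.e. a
                                         stale VMID
    kvm->arch.vttbr = pgd_phys | vmid;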
Re: [PATCH] KVM: arm/arm64: Close VMID generation race
On Mon, Apr 09, 2018 at 06:07:06PM +0100, Marc Zyngier wrote:
> Before entering the guest, we check whether our VMID is still
> part of the current generation. In order to avoid taking a lock,
> we start with checking that the generation is still current, and
> only if not current do we take the lock, recheck, and update the
> generation and VMID.
>
> This leaves open a small race: A vcpu can bump up the global
> generation number as well as the VM's, but has not updated
> the VMID itself yet.
>
> At that point another vcpu from the same VM comes in, checks
> the generation (and finds it not needing anything), and jumps
> into the guest. At this point, we end-up with two vcpus belonging
> to the same VM running with two different VMIDs. Eventually, the
> VMID used by the second vcpu will get reassigned, and things will
> really go wrong...
>
> A simple solution would be to drop this initial check, and always take
> the lock. This is likely to cause performance issues. A middle ground
> is to convert the spinlock to a rwlock, and only take the read lock
> on the fast path. If the check fails at that point, drop it and
> acquire the write lock, rechecking the condition.
>
> This ensures that the above scenario doesn't occur.
>
> Reported-by: Mark Rutland
> Signed-off-by: Marc Zyngier
> ---
> I haven't seen any reply from Shannon, so reposting this to
> a slightly wider audience for feedback.
>
>  virt/kvm/arm/arm.c | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index dba629c5f8ac..a4c1b76240df 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -63,7 +63,7 @@ static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_arm_running_vcpu);
>  static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
>  static u32 kvm_next_vmid;
>  static unsigned int kvm_vmid_bits __read_mostly;
> -static DEFINE_SPINLOCK(kvm_vmid_lock);
> +static DEFINE_RWLOCK(kvm_vmid_lock);
>
>  static bool vgic_present;
>
> @@ -473,11 +473,16 @@ static void update_vttbr(struct kvm *kvm)
>  {
>  	phys_addr_t pgd_phys;
>  	u64 vmid;
> +	bool new_gen;
>
> -	if (!need_new_vmid_gen(kvm))
> +	read_lock(&kvm_vmid_lock);
> +	new_gen = need_new_vmid_gen(kvm);
> +	read_unlock(&kvm_vmid_lock);
> +
> +	if (!new_gen)
>  		return;
>
> -	spin_lock(&kvm_vmid_lock);
> +	write_lock(&kvm_vmid_lock);
>
>  	/*
>  	 * We need to re-check the vmid_gen here to ensure that if another vcpu
> @@ -485,7 +490,7 @@ static void update_vttbr(struct kvm *kvm)
>  	 * use the same vmid.
>  	 */
>  	if (!need_new_vmid_gen(kvm)) {
> -		spin_unlock(&kvm_vmid_lock);
> +		write_unlock(&kvm_vmid_lock);
>  		return;
>  	}
>
> @@ -519,7 +524,7 @@ static void update_vttbr(struct kvm *kvm)
>  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
>  	kvm->arch.vttbr = kvm_phys_to_vttbr(pgd_phys) | vmid;
>
> -	spin_unlock(&kvm_vmid_lock);
> +	write_unlock(&kvm_vmid_lock);
>  }
>
>  static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
> -- 
> 2.14.2
>

The above looks correct to me. I am wondering if something like the
following would also work, which may be slightly more efficient,
although I doubt the difference can be measured:

diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index dba629c5f8ac..7ac869bcad21 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -458,7 +458,9 @@ void force_vm_exit(const cpumask_t *mask)
  */
 static bool need_new_vmid_gen(struct kvm *kvm)
 {
-	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
+	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
+	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
+	return unlikely(kvm->arch.vmid_gen != current_vmid_gen);
 }
 
 /**
@@ -508,10 +510,11 @@ static void update_vttbr(struct kvm *kvm)
 		kvm_call_hyp(__kvm_flush_vm_context);
 	}
 
-	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
 	kvm->arch.vmid = kvm_next_vmid;
 	kvm_next_vmid++;
 	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
+	smp_wmb();
+	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
 
 	/* update vttbr to be used with the new vmid */
 	pgd_phys = virt_to_phys(kvm->arch.pgd);

Thanks,
-Christoffer
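As a worked illustration of the race described in the commit message
above (timeline added editorially; both vcpus belong to the same VM, and
vcpu0 is the first user of a rolled-over generation):

    vcpu0: update_vttbr()                vcpu1: update_vttbr()
    ------------------------------       ------------------------------
    need_new_vmid_gen() -> true
    spin_lock(&kvm_vmid_lock);
    atomic64_inc(&kvm_vmid_gen);
    kvm->arch.vmid_gen =
        atomic64_read(&kvm_vmid_gen);
                                         need_new_vmid_gen() -> false
                                         enters the guest with the old
                                         kvm->arch.vmid / vttbr
    kvm->arch.vmid = kvm_next_vmid++;
    kvm->arch.vttbr = ...;
    spin_unlock(&kvm_vmid_lock);
    enters the guest with the new VMID

Result: two vcpus of the same VM run with different VMIDs, and once the
old VMID is handed out to another VM, two different VMs share a live
VMID, which is what the rwlock (or the barrier-based variant discussed
above) closes.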