On Thu, Apr 04, 2013 at 12:39:47PM -0700, Andrew Honig wrote:
> If userspace creates and destroys multiple VMs within the same process
> we leak 20k of memory in the userspace process context per VM. This
> patch frees the memory in kvm_arch_destroy_vm. If the process exits
> without closing the VM file descriptor or the file descriptor has been
> shared with another process then we don't need to free the memory.
>
> Messing with user space memory from an fd is not ideal, but other changes
> would require user space changes and this is consistent with how the
> memory is currently allocated.
>
> Tested: Test ran several VMs and ran against test program meant to
> demonstrate the leak (www.spinics.net/lists/kvm/msg83734.html).
>
> Signed-off-by: Andrew Honig <[email protected]>
>
> ---
> arch/x86/include/asm/kvm_host.h | 3 +++
> arch/x86/kvm/vmx.c | 3 +++
> arch/x86/kvm/x86.c | 11 +++++++++++
> 3 files changed, 17 insertions(+)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 4979778..975a74d 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -553,6 +553,9 @@ struct kvm_arch {
> struct page *ept_identity_pagetable;
> bool ept_identity_pagetable_done;
> gpa_t ept_identity_map_addr;
> + unsigned long ept_ptr;
> + unsigned long apic_ptr;
> + unsigned long tss_ptr;
>
It would be better to use __kvm_set_memory_region() with memory_size = 0
to delete the slot, and to fix kvm_arch_prepare_memory_region() to unmap
when change == KVM_MR_DELETE.
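Roughly along these lines (untested sketch; the exact argument list of
__kvm_set_memory_region() and the prepare/commit hook split may differ
on your tree, and the helper name is made up):

```c
/* vmx.c: instead of remembering userspace_addr, delete the slot */
static void delete_private_memslot(struct kvm *kvm, int slot)
{
	struct kvm_userspace_memory_region mem = {
		.slot = slot,
		.memory_size = 0,	/* size 0 means: delete this slot */
	};

	__kvm_set_memory_region(kvm, &mem, false);
}

/* x86.c: teach kvm_arch_prepare_memory_region() to drop the mapping */
	if (change == KVM_MR_DELETE && memslot->userspace_addr &&
	    current->mm == kvm->mm)
		vm_munmap(memslot->userspace_addr,
			  memslot->npages << PAGE_SHIFT);
```

That way the unmap goes through the normal memslot lifecycle instead of
shadowing the address in three extra kvm_arch fields.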
> unsigned long irq_sources_bitmap;
> s64 kvmclock_offset;
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 6667042..8aa5d81 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -3703,6 +3703,7 @@ static int alloc_apic_access_page(struct kvm *kvm)
> }
>
> kvm->arch.apic_access_page = page;
> + kvm->arch.apic_ptr = kvm_userspace_mem.userspace_addr;
> out:
> mutex_unlock(&kvm->slots_lock);
> return r;
> @@ -3733,6 +3734,7 @@ static int alloc_identity_pagetable(struct kvm *kvm)
> }
>
> kvm->arch.ept_identity_pagetable = page;
> + kvm->arch.ept_ptr = kvm_userspace_mem.userspace_addr;
> out:
> mutex_unlock(&kvm->slots_lock);
> return r;
> @@ -4366,6 +4368,7 @@ static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr)
> if (ret)
> return ret;
> kvm->arch.tss_addr = addr;
> + kvm->arch.tss_ptr = tss_mem.userspace_addr;
> if (!init_rmode_tss(kvm))
> return -ENOMEM;
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index f19ac0a..411ff2a 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6812,6 +6812,16 @@ void kvm_arch_sync_events(struct kvm *kvm)
>
> void kvm_arch_destroy_vm(struct kvm *kvm)
> {
> + if (current->mm == kvm->mm) {
> + /*
> + * Free pages allocated on behalf of userspace, unless the
> + * the memory map has changed due to process exit or fd
> + * copying.
> + */
Why would mm change during process exit? And what do you mean by fd
copying? One process creates the kvm fd and passes it to another? In
that case I think the leak will still be there, since all of the
addresses below are mapped after the kvm fd is created:
apic_access_page and identity_pagetable during first vcpu creation,
and the TSS when the KVM_SET_TSS_ADDR ioctl is called. Vcpu creation
and the ioctl call can be done by a different process from the one
that created the kvm fd.
> + vm_munmap(kvm->arch.apic_ptr, PAGE_SIZE);
> + vm_munmap(kvm->arch.ept_ptr, PAGE_SIZE);
> + vm_munmap(kvm->arch.tss_ptr, PAGE_SIZE * 3);
> + }
> kvm_iommu_unmap_guest(kvm);
> kfree(kvm->arch.vpic);
> kfree(kvm->arch.vioapic);
> @@ -6929,6 +6939,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
> return PTR_ERR((void *)userspace_addr);
>
> memslot->userspace_addr = userspace_addr;
> + mem->userspace_addr = userspace_addr;
> }
>
> return 0;
> --
> 1.7.10.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Gleb.