At the moment the VFIO IOMMU SPAPR v2 driver pins all guest RAM pages when userspace starts using VFIO. When the userspace process finishes, all the pinned pages need to be put; this is done as part of the userspace memory context (MM) destruction, which happens on the very last mmdrop().

This approach has a problem: the MM of a userspace process may live longer than the process itself, because kernel threads usually execute on the MM of the userspace process which was running on the CPU where the kernel thread was scheduled. If this happens, the MM remains referenced until that exact kernel thread wakes up again and releases the very last reference to the MM; on an idle system this can take hours.

This fixes the issue by moving mm_iommu_cleanup() (the helper which puts pages) from destroy_context() (called on the last mmdrop()) to the arch-specific arch_exit_mmap() hook (called on the last mmput()).

mmdrop() decrements mm->mm_count, which is the total reference count; mmput() decrements mm->mm_users, which is the number of users of the address space, and this is the counter we actually want to watch here.

Cc: David Gibson <da...@gibson.dropbear.id.au>
Cc: Benjamin Herrenschmidt <b...@kernel.crashing.org>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Balbir Singh <bsinghar...@gmail.com>
Cc: Nick Piggin <npig...@kernel.dk>
Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
---
 arch/powerpc/include/asm/mmu_context.h | 3 +++
 arch/powerpc/mm/mmu_context_book3s64.c | 4 ----
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 9d2cd0c..24b590d 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -138,6 +138,9 @@ static inline void arch_dup_mmap(struct mm_struct *oldmm,
 
 static inline void arch_exit_mmap(struct mm_struct *mm)
 {
+#ifdef CONFIG_SPAPR_TCE_IOMMU
+	mm_iommu_cleanup(&mm->context);
+#endif
 }
 
 static inline void arch_unmap(struct mm_struct *mm,
diff --git a/arch/powerpc/mm/mmu_context_book3s64.c b/arch/powerpc/mm/mmu_context_book3s64.c
index 19622222..aaeba74 100644
--- a/arch/powerpc/mm/mmu_context_book3s64.c
+++ b/arch/powerpc/mm/mmu_context_book3s64.c
@@ -159,10 +159,6 @@ static inline void destroy_pagetable_page(struct mm_struct *mm)
 
 void destroy_context(struct mm_struct *mm)
 {
-#ifdef CONFIG_SPAPR_TCE_IOMMU
-	mm_iommu_cleanup(&mm->context);
-#endif
-
 #ifdef CONFIG_PPC_ICSWX
 	drop_cop(mm->context.acop, mm);
 	kfree(mm->context.cop_lockp);
-- 
2.5.0.rc3
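
For illustration only, the difference between the two counters described in the commit message can be modelled in plain userspace C. This is a rough sketch, not kernel code: struct toy_mm and every toy_* function are hypothetical stand-ins, and the model assumes (as the message describes) that a kernel thread holds an extra mm_count reference while the last user of the address space has already gone away.

```c
/*
 * Toy userspace model of the two mm reference counters.
 * NOT kernel code: every toy_* name is hypothetical and only mirrors
 * the behaviour described in the commit message.
 */
#include <assert.h>
#include <stdbool.h>

struct toy_mm {
	int mm_users;		/* users of the address space */
	int mm_count;		/* references to the struct; all users share one */
	bool pages_pinned;	/* stands in for guest RAM pinned via VFIO */
	bool patched;		/* true: clean up on the last mmput() */
};

/* Stands in for mm_iommu_cleanup(): put all pinned pages. */
static void toy_iommu_cleanup(struct toy_mm *mm)
{
	mm->pages_pinned = false;
}

/* Models mmdrop(): the last reference destroys the context. */
static void toy_mmdrop(struct toy_mm *mm)
{
	assert(mm->mm_count > 0);
	if (--mm->mm_count == 0 && !mm->patched)
		toy_iommu_cleanup(mm);	/* old place: destroy_context() */
}

/* Models mmput(): the last user tears down the address space. */
static void toy_mmput(struct toy_mm *mm)
{
	assert(mm->mm_users > 0);
	if (--mm->mm_users == 0) {
		if (mm->patched)
			toy_iommu_cleanup(mm);	/* new place: arch_exit_mmap() */
		toy_mmdrop(mm);	/* drop the reference held for all users */
	}
}

/*
 * Scenario: a kernel thread borrows the mm (extra mm_count reference),
 * then the only userspace process exits. Returns whether the pages are
 * still pinned at that point.
 */
static bool toy_pinned_after_exit(bool patched)
{
	struct toy_mm mm = {
		.mm_users = 1,
		.mm_count = 1,
		.pages_pinned = true,
		.patched = patched,
	};

	mm.mm_count++;		/* kernel thread lazily borrows the mm */
	toy_mmput(&mm);		/* userspace process exits */
	return mm.pages_pinned;
}
```

In this model, toy_pinned_after_exit(false) reports the pages as still pinned after the process has exited, because the borrowed reference keeps mm_count above zero; toy_pinned_after_exit(true) reports them as released, since the cleanup is tied to mm_users.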