Re: [PATCH] powerpc/kvm/book3s64/vio: fix some RCU-list locks
On Sun, May 10, 2020 at 01:18:34AM -0400, Qian Cai wrote:
> It is unsafe to traverse kvm->arch.spapr_tce_tables and
> stt->iommu_tables without the RCU read lock held. Also, add
> cond_resched_rcu() in places with the RCU read lock held that could take
> a while to finish.
>
> arch/powerpc/kvm/book3s_64_vio.c:76 RCU-list traversed in non-reader
> section!!
>
> other info that might help us debug this:
>
> rcu_scheduler_active = 2, debug_locks = 1
> no locks held by qemu-kvm/4265.
>
> stack backtrace:
> CPU: 96 PID: 4265 Comm: qemu-kvm Not tainted 5.7.0-rc4-next-20200508+ #2
> Call Trace:
> [c000201a8690f720] [c0715948] dump_stack+0xfc/0x174 (unreliable)
> [c000201a8690f770] [c01d9470] lockdep_rcu_suspicious+0x140/0x164
> [c000201a8690f7f0] [c00810b9fb48]
> kvm_spapr_tce_release_iommu_group+0x1f0/0x220 [kvm]
> [c000201a8690f870] [c00810b8462c]
> kvm_spapr_tce_release_vfio_group+0x54/0xb0 [kvm]
> [c000201a8690f8a0] [c00810b84710] kvm_vfio_destroy+0x88/0x140 [kvm]
> [c000201a8690f8f0] [c00810b7d488] kvm_put_kvm+0x370/0x600 [kvm]
> [c000201a8690f990] [c00810b7e3c0] kvm_vm_release+0x38/0x60 [kvm]
> [c000201a8690f9c0] [c05223f4] __fput+0x124/0x330
> [c000201a8690fa20] [c0151cd8] task_work_run+0xb8/0x130
> [c000201a8690fa70] [c01197e8] do_exit+0x4e8/0xfa0
> [c000201a8690fb70] [c011a374] do_group_exit+0x64/0xd0
> [c000201a8690fbb0] [c0132c90] get_signal+0x1f0/0x1200
> [c000201a8690fcc0] [c0020690] do_notify_resume+0x130/0x3c0
> [c000201a8690fda0] [c0038d64] syscall_exit_prepare+0x1a4/0x280
> [c000201a8690fe20] [c000c8f8] system_call_common+0xf8/0x278
>
>
> arch/powerpc/kvm/book3s_64_vio.c:368 RCU-list traversed in non-reader
> section!!
>
> other info that might help us debug this:
>
> rcu_scheduler_active = 2, debug_locks = 1
> 2 locks held by qemu-kvm/4264:
> #0: c000201ae2d000d8 (&vcpu->mutex){+.+.}-{3:3}, at:
> kvm_vcpu_ioctl+0xdc/0x950 [kvm]
> #1: c000200c9ed0c468 (&kvm->srcu){}-{0:0}, at:
> kvmppc_h_put_tce+0x88/0x340 [kvm]
>
>
> arch/powerpc/kvm/book3s_64_vio.c:108 RCU-list traversed in non-reader
> section!!
>
> other info that might help us debug this:
>
> rcu_scheduler_active = 2, debug_locks = 1
> 1 lock held by qemu-kvm/4257:
> #0: c000200b1b363a40 (&kv->lock){+.+.}-{3:3}, at:
> kvm_vfio_set_attr+0x598/0x6c0 [kvm]
>
>
> arch/powerpc/kvm/book3s_64_vio.c:146 RCU-list traversed in non-reader
> section!!
>
> other info that might help us debug this:
>
> rcu_scheduler_active = 2, debug_locks = 1
> 1 lock held by qemu-kvm/4257:
> #0: c000200b1b363a40 (&kv->lock){+.+.}-{3:3}, at:
> kvm_vfio_set_attr+0x598/0x6c0 [kvm]
>
> Signed-off-by: Qian Cai

Thanks, applied to my kvm-ppc-next branch, with the cond_resched_rcu()
in kvmppc_tce_validate removed.

Paul.
Re: [PATCH] powerpc/kvm/book3s64/vio: fix some RCU-list locks
On Wed, May 27, 2020 at 11:13:23AM +1000, Paul Mackerras wrote:
> On Sun, May 10, 2020 at 01:18:34AM -0400, Qian Cai wrote:
> > It is unsafe to traverse kvm->arch.spapr_tce_tables and
> > stt->iommu_tables without the RCU read lock held. Also, add
> > cond_resched_rcu() in places with the RCU read lock held that could take
> > a while to finish.
>
> This mostly looks fine. The cond_resched_rcu() in kvmppc_tce_validate
> doesn't seem necessary (the list would rarely have more than a few
> dozen entries) and could be a performance problem given that TCE
> validation is a hot-path.
>
> Are you OK with me modifying the patch to take out that
> cond_resched_rcu(), or is there some reason why it's essential that it
> be there?

Feel free to take out that cond_resched_rcu(). Your reasoning makes
sense.
Re: [PATCH] powerpc/kvm/book3s64/vio: fix some RCU-list locks
On Sun, May 10, 2020 at 01:18:34AM -0400, Qian Cai wrote:
> It is unsafe to traverse kvm->arch.spapr_tce_tables and
> stt->iommu_tables without the RCU read lock held. Also, add
> cond_resched_rcu() in places with the RCU read lock held that could take
> a while to finish.

This mostly looks fine. The cond_resched_rcu() in kvmppc_tce_validate
doesn't seem necessary (the list would rarely have more than a few
dozen entries) and could be a performance problem given that TCE
validation is a hot-path.

Are you OK with me modifying the patch to take out that
cond_resched_rcu(), or is there some reason why it's essential that it
be there?

Paul.
[PATCH] powerpc/kvm/book3s64/vio: fix some RCU-list locks
It is unsafe to traverse kvm->arch.spapr_tce_tables and
stt->iommu_tables without the RCU read lock held. Also, add
cond_resched_rcu() in places with the RCU read lock held that could take
a while to finish.

arch/powerpc/kvm/book3s_64_vio.c:76 RCU-list traversed in non-reader section!!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
no locks held by qemu-kvm/4265.

stack backtrace:
CPU: 96 PID: 4265 Comm: qemu-kvm Not tainted 5.7.0-rc4-next-20200508+ #2
Call Trace:
[c000201a8690f720] [c0715948] dump_stack+0xfc/0x174 (unreliable)
[c000201a8690f770] [c01d9470] lockdep_rcu_suspicious+0x140/0x164
[c000201a8690f7f0] [c00810b9fb48] kvm_spapr_tce_release_iommu_group+0x1f0/0x220 [kvm]
[c000201a8690f870] [c00810b8462c] kvm_spapr_tce_release_vfio_group+0x54/0xb0 [kvm]
[c000201a8690f8a0] [c00810b84710] kvm_vfio_destroy+0x88/0x140 [kvm]
[c000201a8690f8f0] [c00810b7d488] kvm_put_kvm+0x370/0x600 [kvm]
[c000201a8690f990] [c00810b7e3c0] kvm_vm_release+0x38/0x60 [kvm]
[c000201a8690f9c0] [c05223f4] __fput+0x124/0x330
[c000201a8690fa20] [c0151cd8] task_work_run+0xb8/0x130
[c000201a8690fa70] [c01197e8] do_exit+0x4e8/0xfa0
[c000201a8690fb70] [c011a374] do_group_exit+0x64/0xd0
[c000201a8690fbb0] [c0132c90] get_signal+0x1f0/0x1200
[c000201a8690fcc0] [c0020690] do_notify_resume+0x130/0x3c0
[c000201a8690fda0] [c0038d64] syscall_exit_prepare+0x1a4/0x280
[c000201a8690fe20] [c000c8f8] system_call_common+0xf8/0x278

arch/powerpc/kvm/book3s_64_vio.c:368 RCU-list traversed in non-reader section!!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
2 locks held by qemu-kvm/4264:
#0: c000201ae2d000d8 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0xdc/0x950 [kvm]
#1: c000200c9ed0c468 (&kvm->srcu){}-{0:0}, at: kvmppc_h_put_tce+0x88/0x340 [kvm]

arch/powerpc/kvm/book3s_64_vio.c:108 RCU-list traversed in non-reader section!!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
1 lock held by qemu-kvm/4257:
#0: c000200b1b363a40 (&kv->lock){+.+.}-{3:3}, at: kvm_vfio_set_attr+0x598/0x6c0 [kvm]

arch/powerpc/kvm/book3s_64_vio.c:146 RCU-list traversed in non-reader section!!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
1 lock held by qemu-kvm/4257:
#0: c000200b1b363a40 (&kv->lock){+.+.}-{3:3}, at: kvm_vfio_set_attr+0x598/0x6c0 [kvm]

Signed-off-by: Qian Cai
---
 arch/powerpc/kvm/book3s_64_vio.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index 50555ad1db93..4f5016bab723 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -73,6 +73,7 @@ extern void kvm_spapr_tce_release_iommu_group(struct kvm *kvm,
 	struct kvmppc_spapr_tce_iommu_table *stit, *tmp;
 	struct iommu_table_group *table_group = NULL;
 
+	rcu_read_lock();
 	list_for_each_entry_rcu(stt, &kvm->arch.spapr_tce_tables, list) {
 
 		table_group = iommu_group_get_iommudata(grp);
@@ -87,7 +88,9 @@ extern void kvm_spapr_tce_release_iommu_group(struct kvm *kvm,
 				kref_put(&stit->kref, kvm_spapr_tce_liobn_put);
 			}
 		}
+		cond_resched_rcu();
 	}
+	rcu_read_unlock();
 }
 
 extern long kvm_spapr_tce_attach_iommu_group(struct kvm *kvm, int tablefd,
@@ -105,12 +108,14 @@ extern long kvm_spapr_tce_attach_iommu_group(struct kvm *kvm, int tablefd,
 	if (!f.file)
 		return -EBADF;
 
+	rcu_read_lock();
 	list_for_each_entry_rcu(stt, &kvm->arch.spapr_tce_tables, list) {
 		if (stt == f.file->private_data) {
 			found = true;
 			break;
 		}
 	}
+	rcu_read_unlock();
 
 	fdput(f);
 
@@ -143,6 +148,7 @@ extern long kvm_spapr_tce_attach_iommu_group(struct kvm *kvm, int tablefd,
 	if (!tbl)
 		return -EINVAL;
 
+	rcu_read_lock();
 	list_for_each_entry_rcu(stit, &stt->iommu_tables, next) {
 		if (tbl != stit->tbl)
 			continue;
@@ -150,14 +156,17 @@ extern long kvm_spapr_tce_attach_iommu_group(struct kvm *kvm, int tablefd,
 		if (!kref_get_unless_zero(&stit->kref)) {
 			/* stit is being destroyed */
 			iommu_tce_table_put(tbl);
+			rcu_read_unlock();
 			return -ENOTTY;
 		}
 		/*
 		 * The table is already known to this KVM, we just increased
 		 * its KVM reference counter and can return.
 		 */
+		rcu_read_unlock();
 		return 0;
 	}
+	rcu_read_unlock();
 
 	stit = kzalloc(sizeof(*stit),
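
For readers following the last hunks, here is a rough userspace sketch of the pattern they apply: take the read lock before walking the list and drop it on every exit path, including the early returns inside the loop. This is not kernel code. The names mock_rcu_read_lock(), mock_rcu_read_unlock(), rcu_depth, struct mock_table, mock_attach() and MOCK_ENOTTY are all made up for illustration, and the "lock" is only a nesting counter standing in for rcu_read_lock()/rcu_read_unlock().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: models RCU read-side nesting depth only. */
static int rcu_depth;

static void mock_rcu_read_lock(void)
{
	rcu_depth++;
}

static void mock_rcu_read_unlock(void)
{
	assert(rcu_depth > 0);	/* an unbalanced unlock would trip here */
	rcu_depth--;
}

/* Hypothetical list entry, loosely shaped like an iommu table entry. */
struct mock_table {
	int id;
	int refs;		/* 0 means the entry is being destroyed */
	struct mock_table *next;
};

#define MOCK_ENOTTY 25		/* illustrative error value */

/*
 * Returns 0 if an entry with this id exists and a reference was taken,
 * -MOCK_ENOTTY if it exists but is being destroyed, and 1 if it was
 * not found (where the real function would go on to allocate one).
 */
static int mock_attach(struct mock_table *head, int id)
{
	struct mock_table *t;

	mock_rcu_read_lock();
	for (t = head; t; t = t->next) {
		/* Roughly what the lockdep splat checks for. */
		assert(rcu_depth > 0);

		if (t->id != id)
			continue;

		if (t->refs == 0) {
			/* early return: unlock first */
			mock_rcu_read_unlock();
			return -MOCK_ENOTTY;
		}
		t->refs++;
		/* second early return: unlock first */
		mock_rcu_read_unlock();
		return 0;
	}
	/* fall-through path: unlock after the loop */
	mock_rcu_read_unlock();

	return 1;
}
```

After any call to mock_attach(), rcu_depth should be back at zero whichever of the three exits was taken. That is the property the added rcu_read_unlock() calls are meant to preserve in the real function.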