On 3/20/2024 5:52 PM, Mukul Joshi wrote:
Destroy the high priority workqueue that handles interrupts
during KFD node cleanup.
Signed-off-by: Mukul Joshi <[email protected]>
---
drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c b/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
index dd3c43c1ad70..9b6b6e882593 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
@@ -104,6 +104,8 @@ void kfd_interrupt_exit(struct kfd_node *node)
*/
flush_workqueue(node->ih_wq);
+ destroy_workqueue(node->ih_wq);
+
I think we should cancel the work items that are still pending in the
workqueue here, rather than flush node->ih_wq. By this point the kfd
functions have already been terminated, so nothing is left to handle the
remaining work items, and the flush would never finish. I suspect that
is the reason for the orphaned kernel tasks.
After cancelling the remaining work items we can call destroy_workqueue().
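As a rough sketch of the ordering I have in mind: the snippet below is a
userspace model only, not driver code. The stub types and functions merely
mimic the shape of the kernel APIs, and it assumes that interrupt_work is the
work item queued on ih_wq.

```c
#include <stdbool.h>

/*
 * Userspace stand-ins so the teardown order can be compiled outside the
 * kernel. They are illustrative stubs, not the real kernel definitions.
 */
struct work_struct { bool pending; };
struct workqueue_struct { bool destroyed; };
struct kfifo { bool freed; };

/* Assumed layout: interrupt_work is the item that gets queued on ih_wq. */
struct kfd_node {
	struct workqueue_struct *ih_wq;
	struct work_struct interrupt_work;
	struct kfifo ih_fifo;
};

/* Stub for cancel_work_sync(): drops a pending item, reports if it was pending. */
static bool cancel_work_sync(struct work_struct *work)
{
	bool was_pending = work->pending;

	work->pending = false;
	return was_pending;
}

/* Stub for destroy_workqueue(): only records that the queue is gone. */
static void destroy_workqueue(struct workqueue_struct *wq)
{
	wq->destroyed = true;
}

/* Stub for kfifo_free(): only records that the fifo was released. */
static void kfifo_free(struct kfifo *fifo)
{
	fifo->freed = true;
}

/* Suggested order: cancel what is left, destroy the queue, free the fifo. */
static void kfd_interrupt_exit_sketch(struct kfd_node *node)
{
	cancel_work_sync(&node->interrupt_work);
	destroy_workqueue(node->ih_wq);
	kfifo_free(&node->ih_fifo);
}
```

The point is only the sequence: nothing is flushed, pending work is dropped
first, and the queue is destroyed before the fifo goes away.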
Regards
Xiaogang
kfifo_free(&node->ih_fifo);
}
--
2.35.1