In the amdgpu kernel driver, kfd_wait_on_events supports the user-space
signal event wait function. When multiple threads wait on the same event,
a race condition can occur: after some threads check the signal condition
but before they call kfd_wait_on_events, the event interrupt can fire and
wake up the other threads that are already sleeping on this event. The
late threads could then go to sleep and never be woken up again. Adding
event age tracking in both kernel and user mode helps avoid this race
condition.
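
The idea can be illustrated with a small single-threaded model. This is
only a sketch under stated assumptions, not the code in this series: the
sketch_* names and struct layouts are hypothetical, and the initial age
value of 1 is an assumption made for the illustration. The event carries
an age that advances on every signal, and each waiter remembers the age it
last observed. A mismatch at wait time means a signal arrived in between,
so the waiter is treated as already activated instead of being put to
sleep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a signal event; not the kernel's event struct. */
struct sketch_event {
	uint64_t age;	/* advanced once per delivered signal */
};

/* Hypothetical per-waiter state kept on the user-mode side. */
struct sketch_waiter {
	uint64_t last_event_age;	/* age seen when the last wait completed */
};

/* Assumption for this sketch: a fresh event starts at age 1. */
static void sketch_event_init(struct sketch_event *ev)
{
	assert(ev);
	ev->age = 1;
}

/* Models the interrupt path: every signal advances the event age. */
static void sketch_event_signal(struct sketch_event *ev)
{
	assert(ev);
	ev->age++;
}

/*
 * Models the check made when a waiter is set up: if the event age no
 * longer matches what the waiter last saw, a signal was delivered in the
 * meantime and the waiter must not sleep.
 */
static bool sketch_wait_is_activated(const struct sketch_event *ev,
				     const struct sketch_waiter *w)
{
	assert(ev && w);
	return ev->age != w->last_event_age;
}

/* Models copying the current age back to the waiter when a wait returns. */
static void sketch_wait_complete(const struct sketch_event *ev,
				 struct sketch_waiter *w)
{
	assert(ev && w);
	w->last_event_age = ev->age;
}
```

In this model, a thread that records the age, is preempted while the
signal fires, and then starts its wait sees a mismatch and returns at
once rather than sleeping on a signal that has already been consumed.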

The user-space changes for ROCT-Thunk-Interface and ROCR-Runtime are
listed below for review together with the kernel-mode changes.

ROCT-Thunk-Interface:
https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/commit/efdbf6cfbc026bd68ac3c35d00dacf84370eb81e
https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/commit/910108272091d1ce61dbc48bd9519731e0e9cf52

ROCR-Runtime:
https://github.com/RadeonOpenCompute/ROCR-Runtime/compare/master...zhums:ROCR-Runtime:new_event_wait_review
https://github.com/RadeonOpenCompute/ROCR-Runtime/commit/e1f5bdb88eb882ac798aeca2c00ea3fbb2dba459
https://github.com/RadeonOpenCompute/ROCR-Runtime/commit/7d26afd14107b5c2a754c1a3f415d89f3aabb503

-v2: remove unnecessary link

-v3: 1. update kfd test cases (910108272091d1ce61dbc48bd9519731e0e9cf52)
     2. move event age match checking into init_event_waiter
     3. move last event age update into copy_signaled_event_data

James Zhu (5):
  drm/amdkfd: add event age tracking
  drm/amdkfd: add event_age tracking when receiving interrupt
  drm/amdkfd: set activated flag true when event age unmatchs
  drm/amdkfd: update user space last_event_age
  drm/amdkfd: bump kfd ioctl minor version for event age availability

 drivers/gpu/drm/amd/amdkfd/kfd_events.c | 44 ++++++++++++++++++-------
 drivers/gpu/drm/amd/amdkfd/kfd_events.h |  1 +
 include/uapi/linux/kfd_ioctl.h          | 13 ++++++--
 3 files changed, 44 insertions(+), 14 deletions(-)

-- 
2.34.1
