Currently kfd manages kfd_process in a one-context-per-program manner, so each user space program owns only one kfd context (kfd_process).
This model works fine for most programs, but it falls short for a hypervisor like QEMU, because all programs in the guest user space share a single kfd context. This is problematic in several ways, including but not limited to the following.

As illustrated in Figure 1, all guest user space programs share the same fd of /dev/kfd, the same kfd_process, and the same PASID, which leads to the same GPU_VM address space. Therefore the IOVA ranges of the guest user space programs are not isolated, and the programs can attack each other through GPU DMA.

+----------------------------------------------------------------------------------+
|                                                                                  |
|  +-----------+      +-----------+      +------------+      +------------+        |
|  |           |      |           |      |            |      |            |        |
|  | Program 1 |      | Program 2 |      | Program 3  |      | Program N  |        |
|  |           |      |           |      |            |      |            |        |
|  +----+------+      +--------+--+      +--+---------+      +-----+------+        |
|       |                      |            |                      |               |
|       |                      |            |                      |     Guest     |
|       |                      |            |                      |               |
+-------+----------------------+------------+----------------------+---------------+
        |                      |            |                      |
        |                      |            |                      |
        |                   +--+------------+---+                  |
        |                   |  file descriptor  |                  |
        +-------------------+  of /dev/kfd      +------------------+
                            |  opened by QEMU   |
                            +---------+---------+      User Space
                                      |                QEMU
--------------------------------------+-----------------------------------------------------
                                      |                Kernel Space
                                      |                KFD Module
                                      |
                             +--------+--------+
                             |                 |
                             |   kfd_process   |<------------------The only KFD context
                             |                 |
                             +--------+--------+
                                      |
                             +--------+--------+
                             |      PASID      |
                             +--------+--------+
                                      |
                             +--------+--------+
                             |     GPU_VM      |
                             +-----------------+

                                  Figure 1

This series implements a multiple contexts solution:
- Allow each program to create and hold multiple contexts (kfd processes).
- Each context has its own fd of /dev/kfd and an exclusive kfd_process,
  which is a secondary kfd context. Each context therefore has its own
  PASID and GPU VM, which isolates the IOVA address spaces, so the
  contexts cannot attack each other through GPU DMA.

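The isolation argument above can be modelled in plain user space C. The sketch below is a toy model only, not kernel code and not part of this series: struct toy_context, toy_context_init(), toy_map() and toy_dma_allowed() are hypothetical names invented for illustration, and the PASID allocation is a simple counter. It assumes that a GPU DMA access is allowed only when the IOVA is mapped in the address space of the context that issues it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_MAPPINGS 8

/* Toy stand-in for a kfd_process: one PASID and one GPU VM address space. */
struct toy_context {
	unsigned int pasid;
	bool primary;
	unsigned long iova[TOY_MAX_MAPPINGS];
	size_t nr_mappings;
};

/* Hypothetical PASID allocator: every context gets a fresh value. */
static unsigned int toy_next_pasid = 1;

static void toy_context_init(struct toy_context *ctx, bool primary)
{
	assert(ctx);
	ctx->pasid = toy_next_pasid++;
	ctx->primary = primary;
	ctx->nr_mappings = 0;
}

/* Record an IOVA as mapped in this context's address space. */
static int toy_map(struct toy_context *ctx, unsigned long iova)
{
	assert(ctx);
	if (ctx->nr_mappings == TOY_MAX_MAPPINGS)
		return -1;
	ctx->iova[ctx->nr_mappings++] = iova;
	return 0;
}

/* A DMA access is allowed only if the issuing context mapped the IOVA. */
static bool toy_dma_allowed(const struct toy_context *ctx, unsigned long iova)
{
	assert(ctx);
	for (size_t i = 0; i < ctx->nr_mappings; i++)
		if (ctx->iova[i] == iova)
			return true;
	return false;
}
```

In this model, two guest programs that share one toy_context can both reach any IOVA either of them mapped, which mirrors the single-context case. Two programs with separate contexts cannot: an IOVA mapped through one context is rejected when the access is issued through the other.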
The design is illustrated in Figure 2 below:

+---------------------------------------------------------------------------------------------------------+
|                                                                                                         |
|                                                                                                         |
|                                                                                                         |
|       +----------------------------------------------------------------------------------+              |
|       |                                                                                  |              |
|       | +-----------+      +-----------+     +-----------+    +-----------+              |              |
|       | |           |      |           |     |           |    |           |              |              |
|       | | Program 1 |      | Program 2 |     | Program 3 |    | Program N |              |              |
|       | |           |      |           |     |           |    |           |              |              |
|       | +-----+-----+      +-----+-----+     +-----+-----+    +-----+-----+              |              |
|       |       |                  |                 |                |                    |              |
|       |       |                  |                 |                |       Guest        |              |
|       |       |                  |                 |                |                    |              |
|       +-------+------------------+-----------------+----------------+--------------------+              |
|               |                  |                 |                |                          QEMU     |
|               |                  |                 |                |                                   |
+---------------+------------------+-----------------+----------------+--------------------------+--------+
                |                  |                 |                |                          |
                |                  |                 |                |                          |
                |                  |                 |                |                          |
            +---+----+         +---+----+        +---+----+       +---+----+                 +---+-----+
            |        |         |        |        |        |       |        |                 | Primary |
            |  FD 1  |         |  FD 2  |        |  FD 3  |       |  FD 4  |                 |   FD    |
            |        |         |        |        |        |       |        |                 |         |
            +---+----+         +---+----+        +---+----+       +---+----+                 +---+-----+
                |                  |                 |                |                          |      User Space
                |                  |                 |                |                          |
----------------+------------------+-----------------+----------------+--------------------------+---------------------------
                |                  |                 |                |                          |      Kernel Space
                |                  |                 |                |                          |
                |                  |                 |                |                          |
      +---------------------------------------------------------------------------------------------------------------------+
      |  +------+------+    +------+------+   +------+------+  +------+------+            +------+------+                   |
      |  |  Secondary  |    |  Secondary  |   |  Secondary  |  |  Secondary  |            |   Primary   |    KFD Module     |
      |  |kfd_process 1|    |kfd_process 2|   |kfd_process 3|  |kfd_process 4|            | kfd_process |                   |
      |  |             |    |             |   |             |  |             |            |             |                   |
      |  +------+------+    +------+------+   +------+------+  +------+------+            +------+------+                   |
      |         |                  |                 |                |                          |                          |
      |  +------+------+    +------+------+   +------+------+  +------+------+            +------+------+                   |
      |  |    PASID    |    |    PASID    |   |    PASID    |  |    PASID    |            |    PASID    |                   |
      |  +------+------+    +------+------+   +------+------+  +------+------+            +------+------+                   |
      |         |                  |                 |                |                          |                          |
      |         |                  |                 |                |                          |                          |
      |  +------+------+    +------+------+   +------+------+  +------+------+            +------+------+                   |
      |  |   GPU_VM    |    |   GPU_VM    |   |   GPU_VM    |  |   GPU_VM    |            |   GPU_VM    |                   |
      |  +-------------+    +-------------+   +-------------+  +-------------+            +-------------+                   |
      |                                                                                                                     |
      +---------------------------------------------------------------------------------------------------------------------+

                                                        Figure 2

Appendix, a minimal test program:

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/kfd_ioctl.h>

int main(void)
{
	int fd1, fd2, ret;

	// open FDs; each getchar() waits for Enter before the next step
	fd1 = open("/dev/kfd", O_RDWR);
	if (fd1 < 0) {
		perror("Failed to open FD1 /dev/kfd");
		return EXIT_FAILURE;
	}
	printf("FD1 == %d /dev/kfd opened successfully\n", fd1);
	getchar();

	fd2 = open("/dev/kfd", O_RDWR);
	if (fd2 < 0) {
		perror("Failed to open FD2 /dev/kfd");
		close(fd1);
		return EXIT_FAILURE;
	}
	printf("FD2 == %d /dev/kfd opened successfully\n", fd2);
	getchar();

	// create a new secondary context
	ret = ioctl(fd2, AMDKFD_IOC_CREATE_PROCESS);
	if (ret < 0)
		perror("AMDKFD_IOC_CREATE_PROCESS ioctl failed");
	printf("AMDKFD_IOC_CREATE_PROCESS ioctl ret = %d\n", ret);
	getchar();

	// close FDs
	close(fd2);
	getchar();
	close(fd1);
	getchar();

	return EXIT_SUCCESS;
}

Zhu Lingshan (13):
  amdkfd: enlarge the hashtable of kfd_process
  amdkfd: mark the first kfd_process as the primary one
  amdkfd: find_process_by_mm always return the primary context
  amdkfd: Introduce kfd_create_process_sysfs as a separate function
  amdkfd: destroy kfd secondary contexts through fd close
  amdkfd: process svm ioctl only on the primary kfd process
  amdkfd: process USERPTR allocation only on the primary kfd process
  amdkfd: identify a secondary kfd process by its id
  amdkfd: find kfd_process by filep->private_data in kfd_mmap
  amdkfd: process pointer of a HIQ should be set to NULL
  amdkfd: remove DIQ support
  amdkfd: remove test_kq
  amdkfd: introduce new ioctl AMDKFD_IOC_CREATE_PROCESS

 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c      |  73 +++++-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |   6 +-
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c |  60 +----
 .../drm/amd/amdkfd/kfd_packet_manager_v9.c    |   4 -
 .../drm/amd/amdkfd/kfd_packet_manager_vi.c    |   4 -
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h         |  15 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c      | 223 ++++++++++++------
 .../amd/amdkfd/kfd_process_queue_manager.c    |  39 +--
 include/uapi/linux/kfd_ioctl.h                |   8 +-
 9 files changed, 248 insertions(+), 184 deletions(-)

-- 
2.47.1