FWIW, two notes on the NVIDIA binary driver's implementation of gbm_bo_map()/gbm_bo_unmap():

1) We don't do any synchronization against in-flight work. The assumption is that if the content is going to be read, the API that wrote the data has already established that coherence. Likewise, if it's going to be written, the API that reads it afterwards performs whatever invalidation is needed for coherence.

2) We don't blit or format-convert anything, because our GBM implementation has no DMA engine access, and I'd like to keep it that way. Setting up a DMA-capable driver instance is much more expensive in terms of runtime resources than setting up a simple allocator+mmap driver, at least in our driver architecture. Our GBM map just does an mmap(), and if the buffer isn't linear, you won't be able to interpret the data unless you've read up on our tiling formats. I'm aware this is different from Mesa, and no one has complained thus far. If we were forced to fix it, I imagine we'd do something like asking a shared engine in the kernel to do the blit on userspace's behalf, which would probably be slow but would save resources.
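If you want to guard against that, checking the buffer's format modifier before trusting a mapping is one option. A minimal sketch (not our driver code; it assumes a gbm_bo obtained elsewhere and libdrm's drm_fourcc.h):

#include <stdint.h>
#include <gbm.h>
#include <drm_fourcc.h>   /* from libdrm, for DRM_FORMAT_MOD_* */

/* Returns nonzero if the bytes from gbm_bo_map() can be read as plain
 * linear pixels on an implementation that maps the raw allocation, as
 * described above.  A non-linear modifier means the mapping is in a
 * vendor tiling layout; DRM_FORMAT_MOD_INVALID means the layout is
 * simply unknown, which gives no guarantee either way. */
static int bo_map_is_linear(struct gbm_bo *bo)
{
    return gbm_bo_get_modifier(bo) == DRM_FORMAT_MOD_LINEAR;
}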

Basically, don't use gbm_bo_map() for anything non-trivial on our implementation. It's not the right tool for, e.g., reading back or populating OpenGL textures or X pixmaps. If you don't care about running on the NVIDIA implementation, feel free to ignore this advice, but I'd still suggest it's not the best tool for most jobs.

Thanks,
-James

On 6/17/24 03:29, Pierre Ossman wrote:
On 17/06/2024 10:13, Christian König wrote:

Let me try to clarify a couple of things:

The DMA_BUF_IOCTL_SYNC function is to flush and invalidate caches so that the GPU can see values written by the CPU and the CPU can see values written by the GPU. But that IOCTL does *not* wait for any async GPU operation to finish.

If you want to wait for async GPU operations you either need to call the OpenGL functions to read pixels or do a select() (or poll, epoll etc...) call on the DMA-buf file descriptor.
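In rough C, those two steps look something like this (a sketch; 'fd' is the DMA-buf file descriptor, e.g. from gbm_bo_get_fd(), and error handling is abbreviated):

#include <errno.h>
#include <poll.h>
#include <sys/ioctl.h>
#include <linux/dma-buf.h>

/* Wait for pending GPU writes, then begin a CPU read of the buffer.
 * POLLIN on a DMA-buf fd signals once the pending write fences have
 * signaled, i.e. once the rendering into the buffer is done. */
static int wait_and_begin_cpu_read(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    struct dma_buf_sync sync = {
        .flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ,
    };

    while (poll(&pfd, 1, -1) < 0) {
        if (errno != EINTR)
            return -1;
    }
    /* Flush/invalidate caches so the CPU sees the GPU's writes. */
    return ioctl(fd, DMA_BUF_IOCTL_SYNC, &sync);
}

/* End the CPU access once the reads are done. */
static int end_cpu_read(int fd)
{
    struct dma_buf_sync sync = {
        .flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ,
    };
    return ioctl(fd, DMA_BUF_IOCTL_SYNC, &sync);
}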


Thanks for the clarification!

Just to avoid any uncertainty, are both of these things done implicitly by gbm_bo_map()/gbm_bo_unmap()?

I did test adding those steps just in case, but unfortunately did not see an improvement. My order was (sketched in C after the list):

1. gbm_bo_import(GBM_BO_USE_RENDERING)
2. gbm_bo_get_fd()
3. Wait for the client to request displaying the buffer
4. gbm_bo_map(GBM_BO_TRANSFER_READ)
5. select(fd+1, &fds, NULL, NULL, NULL)
6. ioctl(fd, DMA_BUF_IOCTL_SYNC, &(struct dma_buf_sync){ .flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ })
7. pixman_blt()
8. gbm_bo_unmap()
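For reference, that sequence spelled out in C looks roughly like this. It's a sketch with error handling omitted; 'copy_rows' is a hypothetical stand-in for the pixman_blt() step, and I've added the DMA_BUF_SYNC_END before unmap that the dma-buf uAPI expects after CPU access (my list above doesn't have it):

#include <stdint.h>
#include <sys/select.h>
#include <sys/ioctl.h>
#include <linux/dma-buf.h>
#include <gbm.h>

/* 'bo' was imported with GBM_BO_USE_RENDERING (steps 1-3 done
 * elsewhere); 'fd' came from gbm_bo_get_fd(). */
static void readback(struct gbm_bo *bo, int fd,
                     void (*copy_rows)(const void *src, uint32_t stride))
{
    uint32_t stride = 0;
    void *map_data = NULL;
    struct dma_buf_sync sync;
    fd_set fds;

    /* 4. Map for reading. */
    void *ptr = gbm_bo_map(bo, 0, 0,
                           gbm_bo_get_width(bo), gbm_bo_get_height(bo),
                           GBM_BO_TRANSFER_READ, &stride, &map_data);
    if (!ptr)
        return;

    /* 5. Wait for the buffer's GPU fences to signal. */
    FD_ZERO(&fds);
    FD_SET(fd, &fds);
    select(fd + 1, &fds, NULL, NULL, NULL);

    /* 6. Begin CPU access: make the GPU's writes CPU visible. */
    sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ;
    ioctl(fd, DMA_BUF_IOCTL_SYNC, &sync);

    /* 7. Copy the pixels out (the pixman_blt() step). */
    copy_rows(ptr, stride);

    /* End CPU access (missing from the list above). */
    sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ;
    ioctl(fd, DMA_BUF_IOCTL_SYNC, &sync);

    /* 8. Unmap. */
    gbm_bo_unmap(bo, map_data);
}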

So if you want to do some rendering with OpenGL and then see the result in a buffer memory mapping the correct sequence would be the following:

1. Issue OpenGL rendering commands.
2. Call glFlush() to make sure the hw actually starts working on the rendering (see the sketch after this list).
3. Call select() on the DMA-buf file descriptor to wait for the rendering to complete.
4. Use DMA_BUF_IOCTL_SYNC to make the rendering result CPU visible.
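Steps 3 and 4 are the poll()/DMA_BUF_IOCTL_SYNC pair sketched above; on the rendering side, step 2 is just a flush after the draw calls. A rough sketch, assuming a current GL context:

#include <GL/gl.h>

/* Steps 1-2 on the producer side: issue the rendering, then flush.
 * Without the flush, the commands can sit in the userspace command
 * buffer and the buffer's fences may not signal for a long time. */
static void submit_frame(void)
{
    /* ... glDraw*() calls for the frame ... */
    glFlush();
}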


What I want to do is implement the X server side of DRI3 using just the CPU. It works for every application I've tested except gnome-shell.

I would assume that 1. and 2. are supposed to be done by the X client, i.e. gnome-shell?

What I need to be able to do is access the result of that rendering once the X client tries to draw using that GBM-backed pixmap (e.g. using PresentPixmap).

So far, we've only tested Intel GPUs, but we are setting up NVIDIA and AMD GPUs at the moment. It will be interesting to see whether the issue remains on those as well.

Regards
