On 01/08/17 05:01 PM, Jerome Glisse wrote:
Unless you can nop that out in a config-invariant fashion (like you can for
tracers), that's a NAK from the get-go.  And we'd need to buffer the events to
be practical, since you might run the debugger out of sync with the application
(e.g. the app hangs, then you fire up umr to see what's going on).

Again, once you start snooping you can access all the existing BO
information and start monitoring events to keep userspace in sync
with any changes after the first snapshot you take.

You'd have to be able to do this post-mortem as well though. And for now I don't really care about getting live updates (at least not yet...). The way it would work now is you read the ring and get IB pointers, which you then have to fetch and decode. So all I care about right now is what the pointer points to.

Though at this point I'm decoding a GPUVM address, not a DMA nor a physical address, so I first have to decode the GPUVM address "somehow" (assuming I don't use my "unsafe" methods); then I can proceed to look up the buffer based on the DMA address.

(* being notified of changes might be useful later on. We do have more grandiose plans to integrate umr into gdb/etc functionality).

So you want to add something to the kernel just for a corner case, i.e.
debugging a GPU hang, not for more generic tools. I don't think
the work needed for that is worth the effort for such a
small use case.

You can't really read GPU data while it's in flight anyway. Ask a GFX hardware person about reading registers while the GPU is running and they'll say "NAK."

So even if the application hasn't crashed if you want to inspect things you need to halt the waves (effectively halting the GPU).

    - GPU device memory is not necessarily accessible (due to the PCIe BAR
      size limit; I know there are patches to allow remapping all GPU memory).

We have ways of accessing all of VRAM from basic MMIO accesses :)

Yes, and it is insane to have to write the VRAM address into one register
and read the VRAM value from another; insane from a tediousness point of view :)

I'd think double-buffering **all** of VRAM, just in case you might want parts of it, to be more insane.

Typical VRAM reads to decode a GPUVM address for instance involve reading 16 or 40 bytes total. It's pretty quick to be honest.

Kernel allocation should be out of your reach really.

Not if you're debugging ... the kernel (or more specifically your module).

Except again you're looking at this from the lens that the KMD is working
perfectly.  We use umr (our debugger) to debug the kernel driver itself.
That means you need to be able to read/write MMIO registers, peek at VRAM,
decode VM addresses, read rings, etc...

And you can do that with the scheme I propose.

Not everything is a BO though ...

Locking up is one thing, allowing to access data that should not be accessible
from userspace is another issue entirely. The former is bad, the latter is
worse.

Depends on what you're doing. Locking up a medical device that somehow allows unprivileged users on it ... :-)

No, it also works if you are debugging the kernel side, i.e. anything that is
mapped to the GPU, either into a userspace GPU virtual address space or into
the kernel GPU virtual address space.

Not all kernel allocations go through DRM though. You can just as easily pci_alloc_*() some memory and pass the DMA address to the GPU.

No, I don't think it would be anything big, unless they object to an if()
in a hot path.

They probably wouldn't.

I was saying that your design is restrictive and cannot be used for other
things, and thus it is hard to justify the work for a single use case.

I'm not married to my trace idea. It just happens to work for user applications and is better than nothing (and is very unobtrusive). If you want to take a crack at a proper interface that accomplishes the same (and more) by all means I'll gladly adopt using it.

Just keep in mind we have no plans to remove our existing debugfs facilities.

Well, maybe I will code it up later today on the train back home.

Good luck Fermat.  :-)

Cheers,
Tom
_______________________________________________
iommu mailing list
[email protected]
https://lists.linuxfoundation.org/mailman/listinfo/iommu
