Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
> > I doubt you can handle pci memory bars like regular ram when it comes to > > dma and iommu support. There is a reason we have p2pdma in the first > > place ... > > The thing is that such bars would be actually backed by regular host > RAM. Do we really need the complexity of real PCI bar handling for > that? Well, taking shortcuts because of virtualization-specific assumptions already caused problems in the past. See the messy iommu handling we have in virtio-pci for example. So I don't feel like going the "we know it's just normal pages, so let's simplify things" route. Besides that, hostmap isn't important for secure buffers; we wouldn't allow the guest to map them anyway ;) cheers, Gerd
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
Hi, > That could be still a guest physical address. Like on a bare metal > system with TrustZone, there could be physical memory that is not > accessible to the CPU. Hmm. Yes, maybe. We could use the dma address of the (first page of the) guest buffer. In case of a secure buffer the guest has no access, so the guest buffer would be unused, but it would at least make sure that things don't crash in case someone tries to map & access the buffer. The host should be able to figure out the corresponding host buffer from the guest buffer address. When running drm-misc-next you should be able to test whether that'll actually work without any virtio-gpu driver changes. cheers, Gerd
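The "look the host buffer up by the guest address of its first page" idea above can be modeled as a small host-side table. Everything here (names, the 16-entry limit, the example addresses) is invented for illustration and is not the actual qemu code:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define PAGE_SIZE 4096ULL

/* Hypothetical host-side table: first-page guest address -> host buffer. */
struct host_buffer {
    uint64_t first_gpa;   /* guest address of the buffer's first page */
    void *host_ptr;       /* host mapping (NULL for a secure buffer) */
    size_t size;
};

static struct host_buffer table[16];
static size_t table_len;

static void register_buffer(uint64_t gpa, void *host_ptr, size_t size)
{
    /* Identify the buffer by the guest address of its first page. */
    table[table_len].first_gpa = gpa & ~(PAGE_SIZE - 1);
    table[table_len].host_ptr = host_ptr;
    table[table_len].size = size;
    table_len++;
}

static struct host_buffer *lookup(uint64_t gpa)
{
    /* Mask off the in-page offset; a real lookup would also have to
     * handle addresses pointing into later pages of the buffer. */
    uint64_t first = gpa & ~(PAGE_SIZE - 1);
    for (size_t i = 0; i < table_len; i++)
        if (table[i].first_gpa == first)
            return &table[i];
    return NULL; /* unknown buffer */
}
```

For a secure buffer the table entry would simply carry no host mapping for the guest-visible side, matching the point above that the guest buffer stays unused but addressable.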
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
On Thu, Oct 17, 2019 at 4:44 PM Gerd Hoffmann wrote: > > Hi, > > > > Also note that the guest manages the address space, so the host can't > > > simply allocate guest page addresses. > > > > Is this really true? I'm not an expert in this area, but on a bare > > metal system it's the hardware or firmware that sets up the various > > physical address allocations on a hardware level and most of the time > > most of the addresses are already pre-assigned in hardware (like the > > DRAM base, various IOMEM spaces, etc.). > > Yes, the firmware does it. Same in a VM, ovmf or seabios (which runs > inside the guest) typically does it. And sometimes the linux kernel > too. > > > I think that means that we could have a reserved region that could be > > used by the host for dynamic memory hot-plug-like operation. The > > reference to memory hot-plug here is fully intentional, we could even > > use this feature of Linux to get struct pages for such memory if we > > really wanted. > > We try to avoid such quirks whenever possible. Negotiating such things > between qemu and firmware can be done if really needed (and actually is > done for memory hotplug support), but it's an extra interface which > needs maintenance. > > > > Mapping host virtio-gpu resources > > > into guest address space is planned, it'll most likely use a pci memory > > > bar to reserve some address space. The host can map resources into that > > > pci bar, on guest request. > > > > Sounds like a viable option too. Do you have a pointer to some > > description on how this would work on both host and guest side? > > Some early code: > https://git.kraxel.org/cgit/qemu/log/?h=sirius/virtio-gpu-memory-v2 > https://git.kraxel.org/cgit/linux/log/?h=drm-virtio-memory-v2 > > Branches have other stuff too, look for "hostmem" commits. > > Not much code yet beyond creating a pci bar on the host and detecting > presence in the guest. 
> > On the host side qemu would create subregions inside the hostmem memory > region for the resources. > > On the guest side we can ioremap stuff, like vram. > > > > Hmm, well, pci memory bars are *not* backed by pages. Maybe we can use > > > Documentation/driver-api/pci/p2pdma.rst though. With that we might be > > > able to lookup buffers using device and dma address, without explicitly > > > creating some identifier. Not investigated yet in detail. > > > > Not backed by pages as in "struct page", but those are still regular > > pages of the physical address space. > > Well, maybe not. Host gem object could live in device memory, and if we > map them into the guest ... That's an interesting scenario, but in that case would we still want to map it into the guest? I think in such a case we may need to have some shadow buffer in regular RAM and that's already implemented in virtio-gpu. > > > That said, currently the sg_table interface is only able to describe > > physical memory using struct page pointers. It's been a long standing > > limitation affecting even bare metal systems, so perhaps it's just the > > right time to make them possible to use some other identifiers, like > > PFNs? > > I doubt you can handle pci memory bars like regular ram when it comes to > dma and iommu support. There is a reason we have p2pdma in the first > place ... The thing is that such bars would be actually backed by regular host RAM. Do we really need the complexity of real PCI bar handling for that? Best regards, Tomasz
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
On Thu, Oct 17, 2019 at 4:19 PM Gerd Hoffmann wrote: > > Hi, > > > That said, Chrome OS would use a similar model, except that we don't > > use ION. We would likely use minigbm backed by virtio-gpu to allocate > > appropriate secure buffers for us and then import them to the V4L2 > > driver. > > What exactly is a "secure buffer"? I guess a gem object where read > access is not allowed, only scanout to display? Who enforces this? > The hardware? Or the kernel driver? In general, it's a buffer which can be accessed only by a specific set of entities. The set depends on the use case and the level of security you want to achieve. In Chrome OS we at least want to make such buffers completely inaccessible for the guest, enforced by the VMM, for example by not installing corresponding memory into the guest address space (and not allowing transfers if the virtio-gpu shadow buffer model is used). Beyond that, the host memory itself could be further protected by some hardware mechanisms or another hypervisor running above the host OS, like in the ARM TrustZone model. That shouldn't matter for a VM guest, though. > > It might make sense for virtio-gpu to know that concept, to allow guests > ask for secure buffers. > > And of course we'll need some way to pass around identifiers for these > (and maybe other) buffers (from virtio-gpu device via guest drivers to > virtio-vdec device). virtio-gpu guest driver could generate a uuid for > that, attach it to the dma-buf and also notify the host so qemu can > maintain a uuid -> buffer lookup table. That could be still a guest physical address. Like on a bare metal system with TrustZone, there could be physical memory that is not accessible to the CPU. Best regards, Tomasz
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
On Tue, Oct 15, 2019 at 11:06 PM Dmitry Morozov wrote: > > Hello Gerd, > > On Dienstag, 15. Oktober 2019 09:54:22 CEST Gerd Hoffmann wrote: > > On Mon, Oct 14, 2019 at 03:05:03PM +0200, Dmitry Morozov wrote: > > > On Montag, 14. Oktober 2019 14:34:43 CEST Gerd Hoffmann wrote: > > > > Hi, > > > > > > > > > My take on this (for a decoder) would be to allocate memory for output > > > > > buffers from a secure ION heap, import in the v4l2 driver, and then to > > > > > provide those to the device using virtio. The device side then uses > > > > > the > > > > > dmabuf framework to make the buffers accessible for the hardware. I'm > > > > > not > > > > > sure about that, it's just an idea. > > > > > > > > Virtualization aside, how does the complete video decoding workflow > > > > work? I assume along the lines of ... > > > > > > > > (1) allocate buffer for decoded video frames (from ion). > > > > (2) export those buffers as dma-buf. > > > > (3) import dma-buf to video decoder. > > > > (4) import dma-buf to gpu. > > > > > > > > ... to establish buffers shared between video decoder and gpu? > > > > > > > > Then feed the video stream into the decoder, which decodes into the ion > > > > buffers? Ask the gpu to scanout the ion buffers to show the video? > > > > > > > > cheers, > > > > > > > > Gerd > > > > > > Yes, exactly. > > > > > > [decoder] > > > 1) Input buffers are allocated using VIDIOC_*BUFS. > > > > Ok. > > > > > 2) Output buffers are allocated in a guest specific manner (ION, gbm). > > > > Who decides whether ION or gbm is used? The phrase "secure ION heap" > > used above sounds like using ION is required for decoding drm-protected > > content. 
> > I mention the secure ION heap to address this Chrome OS related point: > > 3) protected content decoding: the memory for decoded video frames > > must not be accessible to the guest at all > > There was an RFC to implement a secure memory allocation framework, but > apparently it was not accepted: https://lwn.net/Articles/661549/. > > In case of Android, it allocates GPU buffers for output frames, so it is the > gralloc implementation that decides how to allocate memory. It can use some > dedicated ION heap or can use libgbm. It can also be some proprietary > implementation. > > > > So, do we have to worry about ION here? Or can we just use gbm? > > If we replace vendor specific code in the Android guest and provide a way to > communicate metadata for buffer allocations from the device to the driver, we > can use gbm. In the PC world it might be easier. > > > > [ Note: don't know much about ion, other than that it is used by > android, is in staging right now and patches to move it > out of staging are floating around @ dri-devel ] Chrome OS has cros_gralloc, which is an open source implementation of gralloc on top of minigbm (which itself is built on top of the Linux DRM interfaces). It's not limited to Chrome OS and I believe Intel also uses it for their native Android setups. With that, we could completely disregard ION, but I feel like it's not a core problem here. Whoever wants to use ION should still be able to do so if they back the allocations with guest pages or memory coming from the host using some other interface and it can be described using an identifier compatible with what we're discussing here. Best regards, Tomasz
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
Hi, > > Also note that the guest manages the address space, so the host can't > > simply allocate guest page addresses. > > Is this really true? I'm not an expert in this area, but on a bare > metal system it's the hardware or firmware that sets up the various > physical address allocations on a hardware level and most of the time > most of the addresses are already pre-assigned in hardware (like the > DRAM base, various IOMEM spaces, etc.). Yes, the firmware does it. Same in a VM, ovmf or seabios (which runs inside the guest) typically does it. And sometimes the linux kernel too. > I think that means that we could have a reserved region that could be > used by the host for dynamic memory hot-plug-like operation. The > reference to memory hot-plug here is fully intentional, we could even > use this feature of Linux to get struct pages for such memory if we > really wanted. We try to avoid such quirks whenever possible. Negotiating such things between qemu and firmware can be done if really needed (and actually is done for memory hotplug support), but it's an extra interface which needs maintenance. > > Mapping host virtio-gpu resources > > into guest address space is planned, it'll most likely use a pci memory > > bar to reserve some address space. The host can map resources into that > > pci bar, on guest request. > > Sounds like a viable option too. Do you have a pointer to some > description on how this would work on both host and guest side? Some early code: https://git.kraxel.org/cgit/qemu/log/?h=sirius/virtio-gpu-memory-v2 https://git.kraxel.org/cgit/linux/log/?h=drm-virtio-memory-v2 Branches have other stuff too, look for "hostmem" commits. Not much code yet beyond creating a pci bar on the host and detecting presence in the guest. On the host side qemu would create subregions inside the hostmem memory region for the resources. On the guest side we can ioremap stuff, like vram. > > Hmm, well, pci memory bars are *not* backed by pages. 
Maybe we can use > > Documentation/driver-api/pci/p2pdma.rst though. With that we might be > > able to lookup buffers using device and dma address, without explicitly > > creating some identifier. Not investigated yet in detail. > > Not backed by pages as in "struct page", but those are still regular > pages of the physical address space. Well, maybe not. Host gem object could live in device memory, and if we map them into the guest ... > That said, currently the sg_table interface is only able to describe > physical memory using struct page pointers. It's been a long standing > limitation affecting even bare metal systems, so perhaps it's just the > right time to make them possible to use some other identifiers, like > PFNs? I doubt you can handle pci memory bars like regular ram when it comes to dma and iommu support. There is a reason we have p2pdma in the first place ... cheers, Gerd
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
Hi, > That said, Chrome OS would use a similar model, except that we don't > use ION. We would likely use minigbm backed by virtio-gpu to allocate > appropriate secure buffers for us and then import them to the V4L2 > driver. What exactly is a "secure buffer"? I guess a gem object where read access is not allowed, only scanout to display? Who enforces this? The hardware? Or the kernel driver? It might make sense for virtio-gpu to know that concept, to allow guests to ask for secure buffers. And of course we'll need some way to pass around identifiers for these (and maybe other) buffers (from virtio-gpu device via guest drivers to virtio-vdec device). virtio-gpu guest driver could generate a uuid for that, attach it to the dma-buf and also notify the host so qemu can maintain a uuid -> buffer lookup table. cheers, Gerd
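A minimal model of the uuid -> buffer lookup table described above might look like this on the host side. The names, the fixed-size array, and the use of a dma-buf fd as the host handle are all assumptions for the sketch, not an existing qemu interface:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical host-side registry: uuid -> host dma-buf handle. */
struct uuid_entry {
    uint8_t uuid[16];
    int host_dmabuf_fd;
};

static struct uuid_entry registry[32];
static size_t registry_len;

/* Called when the guest virtio-gpu driver notifies the host of a new uuid. */
static void registry_add(const uint8_t uuid[16], int fd)
{
    memcpy(registry[registry_len].uuid, uuid, 16);
    registry[registry_len].host_dmabuf_fd = fd;
    registry_len++;
}

/* Called by e.g. virtio-vdec device emulation to resolve a uuid
 * it received from its guest driver. */
static int registry_lookup(const uint8_t uuid[16])
{
    for (size_t i = 0; i < registry_len; i++)
        if (memcmp(registry[i].uuid, uuid, 16) == 0)
            return registry[i].host_dmabuf_fd;
    return -1; /* uuid was never exported by any device */
}
```

A real implementation would also need a removal path for when the guest destroys the resource, plus a well-defined command by which the guest driver transmits the uuid to the host.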
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
> Hmm, the cross-device buffer sharing framework I have in mind would > basically be a buffer registry. virtio-gpu would create buffers as > usual, create an identifier somehow (details to be hashed out), attach > the identifier to the dma-buf so it can be used as outlined above. Using physical addresses to identify buffers is using the guest physical address space as the buffer registry. Especially if every device should be able to operate in isolation, then each virtio protocol will have some way to allocate buffers that are accessible to the guest and host. This requires guest physical addresses, and the guest physical address of the start of the buffer can serve as the unique identifier for the buffer in both the guest and the host. Even with buffers that are only accessible to the host, I think it's reasonable to allocate guest physical addresses since the pages still exist (in the same way physical addresses for secure physical memory make sense). This approach also sidesteps the need for explicit registration. With explicit registration, either there would need to be some centralized buffer exporter device or each protocol would need to have its own export function. Using guest physical addresses means that buffers get a unique identifier during creation. For example, in the virtio-gpu protocol, buffers would get this identifier through VIRTIO_GPU_CMD_RESOURCE_ATTACH_BACKING, or through VIRTIO_GPU_CMD_RESOURCE_CREATE_V2 with impending additions to resource creation. > Also note that the guest manages the address space, so the host can't > simply allocate guest page addresses. Mapping host virtio-gpu resources > into guest address space is planned, it'll most likely use a pci memory > bar to reserve some address space. The host can map resources into that > pci bar, on guest request. 
> > > - virtio-gpu driver could then create a regular DMA-buf object for > > such memory, because it's just backed by pages (even though they may > > not be accessible to the guest; just like in the case of TrustZone > > memory protection on bare metal systems), > > Hmm, well, pci memory bars are *not* backed by pages. Maybe we can use > Documentation/driver-api/pci/p2pdma.rst though. With that we might be > able to lookup buffers using device and dma address, without explicitly > creating some identifier. Not investigated yet in detail. For the linux guest implementation, mapping a dma-buf doesn't necessarily require actual pages. The exporting driver's map_dma_buf function just needs to provide an sg_table with populated dma_address fields, it doesn't actually need to populate the sg_table with pages. At the very least, there are places such as i915_gem_stolen.c and (some situations of) videobuf-dma-sg.c that take this approach. Cheers, David
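David's point, that a dma-buf importer only consumes dma addresses and lengths, can be illustrated with a toy userspace model of a scatterlist entry. Real kernel code would fill a struct sg_table inside a map_dma_buf callback; the struct below and the bar addresses are made up for the sketch:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Toy model of a scatterlist entry carrying only the fields a dma-buf
 * importer actually consumes (what sg_dma_address()/sg_dma_len() return);
 * note there is no struct page anywhere. */
struct sg_entry {
    uint64_t dma_address;
    uint32_t length;
};

/* Build a single-entry "sg table" for a buffer that lives inside a PCI
 * bar, in the spirit of what i915_gem_stolen.c does for stolen memory:
 * just a bus address and a length, no page backing. The addresses are
 * invented for the example. */
static size_t map_bar_buffer(uint64_t bar_base, uint64_t offset,
                             uint32_t size, struct sg_entry *out)
{
    out[0].dma_address = bar_base + offset;
    out[0].length = size;
    return 1; /* number of entries produced */
}
```

The importer can program its DMA engine from such a table without ever touching page structures, which is exactly why bar-backed exports can work in the guest.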
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
On Mon, Oct 14, 2019 at 9:19 PM Gerd Hoffmann wrote: > > > > Well. I think before even discussing the protocol details we need a > > > reasonable plan for buffer handling. I think using virtio-gpu buffers > > > should be an optional optimization and not a requirement. Also the > > > motivation for that should be clear (Let the host decoder write directly > > > to virtio-gpu resources, to display video without copying around the > > > decoded framebuffers from one device to another). > > > > Just to make sure we're on the same page, what would the buffers come > > from if we don't use this optimization? > > > > I can imagine a setup like this; > > 1) host device allocates host memory appropriate for usage with host > > video decoder, > > 2) guest driver allocates arbitrary guest pages for storage > > accessible to the guest software, > > 3) guest userspace writes input for the decoder to guest pages, > > 4) guest driver passes the list of pages for the input and output > > buffers to the host device > > 5) host device copies data from input guest pages to host buffer > > 6) host device runs the decoding > > 7) host device copies decoded frame to output guest pages > > 8) guest userspace can access decoded frame from those pages; back to 3 > > > > Is that something you have in mind? > > I don't have any specific workflow in mind. > > If you want to display the decoded video frames you want to use dma-bufs shared > by video decoder and gpu, right? So the userspace application (video > player probably) would create the buffers using one of the drivers, > export them as dma-buf, then import them into the other driver. Just > like you would do on physical hardware. So, when using virtio-gpu > buffers: > > (1) guest app creates buffers using virtio-gpu. > (2) guest app exports virtio-gpu buffers as dma-bufs. > (3) guest app imports the dma-bufs into virtio-vdec. > (4) guest app asks the virtio-vdec driver to write the decoded > frames into the dma-bufs. 
> (5) guest app asks the virtio-gpu driver to display the decoded > frame. > > The guest video decoder driver passes the dma-buf pages to the host, and > it is the host driver's job to fill the buffer. How this is done > exactly might depend on hardware capabilities (whether a host-allocated > bounce buffer is needed or whether the hardware can decode directly to > the dma-buf passed by the guest driver) and is an implementation detail. > > Now, with cross-device sharing added the virtio-gpu would attach some > kind of identifier to the dma-buf, virtio-vdec could fetch the > identifier and pass it to the host too, and the host virtio-vdec device > can use the identifier to get a host dma-buf handle for the (virtio-gpu) > buffer. Ask the host video decoder driver to import the host dma-buf. > If it all worked fine it can ask the host hardware to decode directly to > the host virtio-gpu resource. > Agreed. > > > Referencing virtio-gpu buffers needs a better plan than just re-using > > > virtio-gpu resource handles. The handles are device-specific. What if > > > there are multiple virtio-gpu devices present in the guest? > > > > > > I think we need a framework for cross-device buffer sharing. One > > > possible option would be to have some kind of buffer registry, where > > > buffers can be registered for cross-device sharing and get a unique > > > id (a uuid maybe?). Drivers would typically register buffers on > > > dma-buf export. > > > > This approach could possibly let us handle this transparently to > > importers, which would work for guest kernel subsystems that rely on > > the ability to handle buffers like native memory (e.g. having a > > sgtable or DMA address) for them. > > > > How about allocating guest physical addresses for memory corresponding > > to those buffers? 
On the virtio-gpu example, that could work like > > this: > > - by default a virtio-gpu buffer has only a resource handle, > > - VIRTIO_GPU_RESOURCE_EXPORT command could be called to have the > > virtio-gpu device export the buffer to a host framework (inside the > > VMM) that would allocate guest page addresses for it, which the > > command would return in a response to the guest, > > Hmm, the cross-device buffer sharing framework I have in mind would > basically be a buffer registry. virtio-gpu would create buffers as > usual, create an identifier somehow (details to be hashed out), attach > the identifier to the dma-buf so it can be used as outlined above. > > Also note that the guest manages the address space, so the host can't > simply allocate guest page addresses. Is this really true? I'm not an expert in this area, but on a bare metal system it's the hardware or firmware that sets up the various physical address allocations on a hardware level and most of the time most of the addresses are already pre-assigned in hardware (like the DRAM base, various IOMEM spaces, etc.). I think that means that we could have a reserved region that could be used by the host for dynamic memory hot-plug-like operation. The reference to memory hot-plug here is fully intentional, we could even use this feature of Linux to get struct pages for such memory if we really wanted.
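The reserved-region idea can be sketched as a trivial bump allocator over a guest physical window that firmware would set aside at boot. The window bounds, the names, and the absence of any free path are all simplifications invented for the example:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL

/* A window of guest physical address space reserved at boot, from which
 * the host can carve out mappings for its buffers. */
struct gpa_window {
    uint64_t base;  /* first address of the reserved region */
    uint64_t end;   /* one past the last usable address */
    uint64_t next;  /* bump pointer */
};

/* Allocate a page-aligned range of guest addresses from the window.
 * Returns 0 on exhaustion (0 is never a valid allocation here since
 * the window base is nonzero). */
static uint64_t window_alloc(struct gpa_window *w, uint64_t size)
{
    uint64_t aligned = (size + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
    if (w->next + aligned > w->end)
        return 0; /* window exhausted */
    uint64_t addr = w->next;
    w->next += aligned;
    return addr;
}
```

A real design would need the negotiation between VMM and firmware that Gerd mentions, plus reclaim when buffers are destroyed; the sketch only shows why a fixed window lets the host hand out guest addresses without fighting the guest over the rest of the address space.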
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
On Fri, Oct 11, 2019 at 5:54 PM Dmitry Morozov wrote: > > Hi Tomasz, > > On Mittwoch, 9. Oktober 2019 05:55:45 CEST Tomasz Figa wrote: > > On Tue, Oct 8, 2019 at 12:09 AM Dmitry Morozov > > > > wrote: > > > Hi Tomasz, > > > > > > On Montag, 7. Oktober 2019 16:14:13 CEST Tomasz Figa wrote: > > > > Hi Dmitry, > > > > > > > > On Mon, Oct 7, 2019 at 11:01 PM Dmitry Morozov > > > > > > > > wrote: > > > > > Hello, > > > > > > > > > > We at OpenSynergy are also working on an abstract paravirtualized > > > > > video > > > > > streaming device that operates input and/or output data buffers and > > > > > can be > > > > > used as a generic video decoder/encoder/input/output device. > > > > > > > > > > We would be glad to share our thoughts and contribute to the > > > > > discussion. > > > > > Please see some comments regarding buffer allocation inline. > > > > > > > > > > Best regards, > > > > > Dmitry. > > > > > > > > > > On Samstag, 5. Oktober 2019 08:08:12 CEST Tomasz Figa wrote: > > > > > > Hi Gerd, > > > > > > > > > > > > On Mon, Sep 23, 2019 at 5:56 PM Gerd Hoffmann > wrote: > > > > > > > Hi, > > > > > > > > > > > > > > > Our prototype implementation uses [4], which allows the > > > > > > > > virtio-vdec > > > > > > > > device to use buffers allocated by virtio-gpu device. > > > > > > > > > > > > > > > > [4] https://lkml.org/lkml/2019/9/12/157 > > > > > > > > > > > > First of all, thanks for taking a look at this RFC and for valuable > > > > > > feedback. Sorry for the late reply. > > > > > > > > > > > > For reference, Keiichi is working with me and David Stevens on > > > > > > accelerated video support for virtual machines and integration with > > > > > > other virtual devices, like virtio-gpu for rendering or our > > > > > > currently-downstream virtio-wayland for display (I believe there is > > > > > > ongoing work to solve this problem in upstream too). > > > > > > > > > > > > > Well. 
I think before even discussing the protocol details we need > > > > > > > a > > > > > > > reasonable plan for buffer handling. I think using virtio-gpu > > > > > > > buffers > > > > > > > should be an optional optimization and not a requirement. Also > > > > > > > the > > > > > > > motivation for that should be clear (Let the host decoder write > > > > > > > directly > > > > > > > to virtio-gpu resources, to display video without copying around > > > > > > > the > > > > > > > decoded framebuffers from one device to another). > > > > > > > > > > > > Just to make sure we're on the same page, what would the buffers > > > > > > come > > > > > > from if we don't use this optimization? > > > > > > > > > > > > I can imagine a setup like this; > > > > > > > > > > > > 1) host device allocates host memory appropriate for usage with > > > > > > host > > > > > > > > > > > > video decoder, > > > > > > > > > > > > 2) guest driver allocates arbitrary guest pages for storage > > > > > > > > > > > > accessible to the guest software, > > > > > > > > > > > > 3) guest userspace writes input for the decoder to guest pages, > > > > > > 4) guest driver passes the list of pages for the input and output > > > > > > > > > > > > buffers to the host device > > > > > > > > > > > > 5) host device copies data from input guest pages to host buffer > > > > > > 6) host device runs the decoding > > > > > > 7) host device copies decoded frame to output guest pages > > > > > > 8) guest userspace can access decoded frame from those pages; back > > > > > > to 3 > > > > > > > > > > > > Is that something you have in mind? > > > > > > > > > > While GPU side allocations can be useful (especially in case of > > > > > decoder), > > > > > it could be more practical to stick to driver side allocations. This > > > > > is > > > > > also due to the fact that paravirtualized encoders and cameras do not > > > > > necessarily require a GPU device. 
> > > > > > > > > > Also, the v4l2 framework already features convenient helpers for CMA > > > > > and > > > > > SG > > > > > allocations. The buffers can be used in the same manner as in > > > > > virtio-gpu: > > > > > buffers are first attached to an already allocated buffer/resource > > > > > descriptor and then are made available for processing by the device > > > > > using > > > > > a dedicated command from the driver. > > > > First of all, thanks a lot for your input. This is a relatively new > > > > area of virtualization and we definitely need to collect various > > > > possible perspectives in the discussion. > > > > > > > > From Chrome OS point of view, there are several aspects for which the > > > > guest side allocation doesn't really work well: > > > > 1) host-side hardware has a lot of specific low level allocation > > > > requirements, like alignments, paddings, address space limitations and > > > > so on, which is not something that can be (easily) taught to the guest > > > > OS, > > > > > > I couldn't agree more. There are some changes by Greg to add support for > > > querying GPU buffer metadata
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
Hello Gerd, On Dienstag, 15. Oktober 2019 09:54:22 CEST Gerd Hoffmann wrote: > On Mon, Oct 14, 2019 at 03:05:03PM +0200, Dmitry Morozov wrote: > > On Montag, 14. Oktober 2019 14:34:43 CEST Gerd Hoffmann wrote: > > > Hi, > > > > > > > My take on this (for a decoder) would be to allocate memory for output > > > > buffers from a secure ION heap, import in the v4l2 driver, and then to > > > > provide those to the device using virtio. The device side then uses > > > > the > > > > dmabuf framework to make the buffers accessible for the hardware. I'm > > > > not > > > > sure about that, it's just an idea. > > > > > > Virtualization aside, how does the complete video decoding workflow > > > work? I assume along the lines of ... > > > > > > (1) allocate buffer for decoded video frames (from ion). > > > (2) export those buffers as dma-buf. > > > (3) import dma-buf to video decoder. > > > (4) import dma-buf to gpu. > > > > > > ... to establish buffers shared between video decoder and gpu? > > > > > > Then feed the video stream into the decoder, which decodes into the ion > > > buffers? Ask the gpu to scanout the ion buffers to show the video? > > > > > > cheers, > > > > > > Gerd > > > > Yes, exactly. > > > > [decoder] > > 1) Input buffers are allocated using VIDIOC_*BUFS. > > Ok. > > > 2) Output buffers are allocated in a guest specific manner (ION, gbm). > > Who decides whether ION or gbm is used? The phrase "secure ION heap" > used above sounds like using ION is required for decoding drm-protected > content. I mention the secure ION heap to address this Chrome OS related point: > 3) protected content decoding: the memory for decoded video frames > must not be accessible to the guest at all There was an RFC to implement a secure memory allocation framework, but apparently it was not accepted: https://lwn.net/Articles/661549/. In case of Android, it allocates GPU buffers for output frames, so it is the gralloc implementation that decides how to allocate memory. 
It can use some dedicated ION heap or can use libgbm. It can also be some proprietary implementation. > > So, do we have to worry about ION here? Or can we just use gbm? If we replace vendor specific code in the Android guest and provide a way to communicate metadata for buffer allocations from the device to the driver, we can use gbm. In the PC world it might be easier. > > [ Note: don't know much about ion, other than that it is used by > android, is in staging right now and patches to move it > out of staging are floating around @ dri-devel ] > > > 3) Both input and output buffers are exported as dma-bufs. > > 4) The backing storage of both inputs and outputs is made available to the > > device. > > 5) Decoder hardware writes to output buffers directly. > > As expected. > > > 6) Back to the guest side, the output dma-bufs are used by (virtio-) gpu. > > Ok. So, virtio-gpu has support for dma-buf exports (in drm-misc-next, > should land upstream in kernel 5.5). dma-buf imports are not that > simple unfortunately. When using the gbm allocation route dma-buf > exports are good enough though. > > The virtio-gpu resources have both a host buffer and a guest buffer[1]. > Data can be copied using the DRM_IOCTL_VIRTGPU_TRANSFER_{FROM,TO}_HOST > ioctls. The dma-buf export will export the guest buffer (which lives > in guest ram). > > It would make sense for the decoded video to go directly to the host > buffer though. First because we want to avoid copying the video frames for > performance reasons, and second because we might not be able to copy > video frames (drm ...). > > This is where the buffer registry idea comes in. Attach a (host) > identifier to (guest) dma-bufs, which then allows host device emulation > to share buffers, i.e. virtio-vdec device emulation could decode to a > dma-buf it got from virtio-gpu device emulation. Yes. Also, as I mentioned above, in case of gbm the buffers already can originate from GPU. Best regards, Dmitry. 
> > Alternatively we could use virtual ION (or whatever it becomes after > de-staging) for buffer management, with both virtio-vdec and virtio-gpu > importing dma-bufs from virtual ION on both guest and host side. > > cheers, > Gerd > > [1] support for shared buffers is in progress.
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
On Mon, Oct 14, 2019 at 03:05:03PM +0200, Dmitry Morozov wrote: > > On Montag, 14. Oktober 2019 14:34:43 CEST Gerd Hoffmann wrote: > > Hi, > > > > > My take on this (for a decoder) would be to allocate memory for output > > > buffers from a secure ION heap, import in the v4l2 driver, and then to > > > provide those to the device using virtio. The device side then uses the > > > dmabuf framework to make the buffers accessible for the hardware. I'm not > > > sure about that, it's just an idea. > > > > Virtualization aside, how does the complete video decoding workflow > > work? I assume along the lines of ... > > > > (1) allocate buffer for decoded video frames (from ion). > > (2) export those buffers as dma-buf. > > (3) import dma-buf to video decoder. > > (4) import dma-buf to gpu. > > > > ... to establish buffers shared between video decoder and gpu? > > > > Then feed the video stream into the decoder, which decodes into the ion > > buffers? Ask the gpu to scanout the ion buffers to show the video? > > > > cheers, > > Gerd > > Yes, exactly. > > [decoder] > 1) Input buffers are allocated using VIDIOC_*BUFS. Ok. > 2) Output buffers are allocated in a guest specific manner (ION, gbm). Who decides whether ION or gbm is used? The phrase "secure ION heap" used above sounds like using ION is required for decoding drm-protected content. So, do we have to worry about ION here? Or can we just use gbm? [ Note: don't know much about ion, other than that it is used by android, is in staging right now and patches to move it out of staging are floating around @ dri-devel ] > 3) Both input and output buffers are exported as dma-bufs. > 4) The backing storage of both inputs and outputs is made available to the > device. > 5) Decoder hardware writes to output buffers directly. As expected. > 6) Back to the guest side, the output dma-bufs are used by (virtio-) gpu. Ok. So, virtio-gpu has support for dma-buf exports (in drm-misc-next, should land upstream in kernel 5.5). 
dma-buf imports are not that simple unfortunately. When using the gbm allocation route dma-buf exports are good enough though. The virtio-gpu resources have both a host buffer and a guest buffer [1]. Data can be copied using the DRM_IOCTL_VIRTGPU_TRANSFER_{FROM,TO}_HOST ioctls. The dma-buf export will export the guest buffer (which lives in guest ram). It would make sense for the decoded video to go directly to the host buffer though. First because we want to avoid copying the video frames for performance reasons, and second because we might not be able to copy video frames (drm ...). This is where the buffer registry idea comes in. Attach a (host) identifier to (guest) dma-bufs, which then allows host device emulations to share buffers, i.e. virtio-vdec device emulation could decode to a dma-buf it got from virtio-gpu device emulation. Alternatively we could use virtual ION (or whatever it becomes after de-staging) for buffer management, with both virtio-vdec and virtio-gpu importing dma-bufs from virtual ION on both guest and host side. cheers, Gerd [1] support for shared buffers is in progress.
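As a rough sketch of the buffer registry idea above: a host-side registry hands out a unique identifier when a device emulation registers a buffer (typically on guest dma-buf export), and another device emulation resolves the identifier back to the same host buffer. This is purely illustrative; `BufferRegistry` and the other names are invented here and are not part of any virtio interface:

```python
import uuid

class BufferRegistry:
    """Host-side registry: shared identifier -> backing buffer."""
    def __init__(self):
        self._buffers = {}

    def register(self, buf):
        # Called by one device emulation (say, virtio-gpu) when the
        # guest exports a resource as a dma-buf.
        buf_id = uuid.uuid4()
        self._buffers[buf_id] = buf
        return buf_id

    def lookup(self, buf_id):
        # Called by another device emulation (say, virtio-vdec) that
        # received the identifier from its guest driver.
        return self._buffers[buf_id]

registry = BufferRegistry()

# virtio-gpu emulation registers the host buffer backing a guest resource
host_buffer = bytearray(4096)  # stand-in for the real host-side buffer
shared_id = registry.register(host_buffer)

# virtio-vdec emulation resolves the identifier and can decode into the
# very same host buffer, with no copy through the guest
target = registry.lookup(shared_id)
assert target is host_buffer
```

With a registry like this, the identifier (a uuid, as suggested later in the thread) is the only thing that has to travel through the guest between the two drivers.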
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
On Montag, 14. Oktober 2019 14:34:43 CEST Gerd Hoffmann wrote: > Hi, > > > My take on this (for a decoder) would be to allocate memory for output > > buffers from a secure ION heap, import in the v4l2 driver, and then to > > provide those to the device using virtio. The device side then uses the > > dmabuf framework to make the buffers accessible for the hardware. I'm not > > sure about that, it's just an idea. > > Virtualization aside, how does the complete video decoding workflow > work? I assume along the lines of ... > > (1) allocate buffer for decoded video frames (from ion). > (2) export those buffers as dma-buf. > (3) import dma-buf to video decoder. > (4) import dma-buf to gpu. > > ... to establish buffers shared between video decoder and gpu? > > Then feed the video stream into the decoder, which decodes into the ion > buffers? Ask the gpu to scanout the ion buffers to show the video? > > cheers, > Gerd Yes, exactly. [decoder] 1) Input buffers are allocated using VIDIOC_*BUFS. 2) Output buffers are allocated in a guest specific manner (ION, gbm). 3) Both input and output buffers are exported as dma-bufs. 4) The backing storage of both inputs and outputs is made available to the device. 5) Decoder hardware writes to output buffers directly. 6) Back to the guest side, the output dma-bufs are used by (virtio-) gpu. Best regards, Dmitry
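The six decoder steps confirmed above can be modelled as a toy pipeline. This is an illustration only; `DmaBuf` and the byte-reversing "decode" are invented stand-ins, not V4L2 or virtio API:

```python
class DmaBuf:
    """Minimal stand-in for a dma-buf: just the backing storage."""
    def __init__(self, storage):
        self.storage = storage

def decode_one_frame(bitstream):
    # 1) input buffers allocated (VIDIOC_REQBUFS and friends on real hw)
    input_buf = bytearray(bitstream)
    # 2) output buffers allocated in a guest-specific manner (ION, gbm)
    output_buf = bytearray(len(bitstream))
    # 3) both are exported as dma-bufs
    in_dmabuf, out_dmabuf = DmaBuf(input_buf), DmaBuf(output_buf)
    # 4) the backing storage is made available to the device
    # 5) the "decoder hardware" writes to the output buffer directly
    out_dmabuf.storage[:] = bytes(reversed(in_dmabuf.storage))
    # 6) back on the guest side, the output dma-buf goes to the (virtio-)gpu
    return out_dmabuf

frame = decode_one_frame(b"stream")
assert bytes(frame.storage) == b"maerts"
```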
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
Hi, > My take on this (for a decoder) would be to allocate memory for output > buffers > from a secure ION heap, import in the v4l2 driver, and then to provide those > to the device using virtio. The device side then uses the dmabuf framework to > make the buffers accessible for the hardware. I'm not sure about that, it's > just an idea. Virtualization aside, how does the complete video decoding workflow work? I assume along the lines of ... (1) allocate buffer for decoded video frames (from ion). (2) export those buffers as dma-buf. (3) import dma-buf to video decoder. (4) import dma-buf to gpu. ... to establish buffers shared between video decoder and gpu? Then feed the video stream into the decoder, which decodes into the ion buffers? Ask the gpu to scanout the ion buffers to show the video? cheers, Gerd
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
> > Well. I think before even discussing the protocol details we need a > > reasonable plan for buffer handling. I think using virtio-gpu buffers > > should be an optional optimization and not a requirement. Also the > > motivation for that should be clear (Let the host decoder write directly > > to virtio-gpu resources, to display video without copying around the > > decoded framebuffers from one device to another). > > Just to make sure we're on the same page, what would the buffers come > from if we don't use this optimization? > > I can imagine a setup like this; > 1) host device allocates host memory appropriate for usage with host > video decoder, > 2) guest driver allocates arbitrary guest pages for storage > accessible to the guest software, > 3) guest userspace writes input for the decoder to guest pages, > 4) guest driver passes the list of pages for the input and output > buffers to the host device > 5) host device copies data from input guest pages to host buffer > 6) host device runs the decoding > 7) host device copies decoded frame to output guest pages > 8) guest userspace can access decoded frame from those pages; back to 3 > > Is that something you have in mind? I don't have any specific workflow in mind. If you want to display the decoded video frames you want to use dma-bufs shared by video decoder and gpu, right? So the userspace application (video player probably) would create the buffers using one of the drivers, export them as dma-buf, then import them into the other driver. Just like you would do on physical hardware. So, when using virtio-gpu buffers: (1) guest app creates buffers using virtio-gpu. (2) guest app exports virtio-gpu buffers as dma-buf. (3) guest app imports the dma-bufs into virtio-vdec. (4) guest app asks the virtio-vdec driver to write the decoded frames into the dma-bufs. (5) guest app asks the virtio-gpu driver to display the decoded frame. 
The guest video decoder driver passes the dma-buf pages to the host, and it is the host driver's job to fill the buffer. How this is done exactly might depend on hardware capabilities (whether a host-allocated bounce buffer is needed or whether the hardware can decode directly to the dma-buf passed by the guest driver) and is an implementation detail. Now, with cross-device sharing added, virtio-gpu would attach some kind of identifier to the dma-buf, virtio-vdec could fetch the identifier and pass it to the host too, and the host virtio-vdec device can use the identifier to get a host dma-buf handle for the (virtio-gpu) buffer. It can then ask the host video decoder driver to import the host dma-buf. If that all worked fine, it can ask the host hardware to decode directly to the host virtio-gpu resource. > > Referencing virtio-gpu buffers needs a better plan than just re-using > > virtio-gpu resource handles. The handles are device-specific. What if > > there are multiple virtio-gpu devices present in the guest? > > > > I think we need a framework for cross-device buffer sharing. One > > possible option would be to have some kind of buffer registry, where > > buffers can be registered for cross-device sharing and get a unique > > id (a uuid maybe?). Drivers would typically register buffers on > > dma-buf export. > > This approach could possibly let us handle this transparently to > importers, which would work for guest kernel subsystems that rely on > the ability to handle buffers like native memory (e.g. having a > sgtable or DMA address) for them. > > How about allocating guest physical addresses for memory corresponding > to those buffers? 
On the virtio-gpu example, that could work like > this: > - by default a virtio-gpu buffer has only a resource handle, > - VIRTIO_GPU_RESOURCE_EXPORT command could be called to have the > virtio-gpu device export the buffer to a host framework (inside the > VMM) that would allocate guest page addresses for it, which the > command would return in a response to the guest, Hmm, the cross-device buffer sharing framework I have in mind would basically be a buffer registry. virtio-gpu would create buffers as usual, create an identifier somehow (details to be hashed out), attach the identifier to the dma-buf so it can be used as outlined above. Also note that the guest manages the address space, so the host can't simply allocate guest page addresses. Mapping host virtio-gpu resources into guest address space is planned; it'll most likely use a pci memory bar to reserve some address space. The host can map resources into that pci bar, on guest request. > - virtio-gpu driver could then create a regular DMA-buf object for > such memory, because it's just backed by pages (even though they may > not be accessible to the guest; just like in the case of TrustZone > memory protection on bare metal systems), Hmm, well, pci memory bars are *not* backed by pages. Maybe we can use Documentation/driver-api/pci/p2pdma.rst though. With that we might be
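The pci-bar idea mentioned above can be sketched as a trivial offset allocator over the reserved window, with the host mapping resources at page-aligned bar offsets on guest request. `HostmemBar` and all the numbers are hypothetical; the real layout is up to the device implementation:

```python
PAGE = 4096

class HostmemBar:
    """Toy model of a pci memory bar reserving guest address space."""
    def __init__(self, base, size):
        self.base, self.size = base, size
        self.next_off = 0
        self.mappings = {}  # resource id -> (guest address, size)

    def map_resource(self, res_id, res_size):
        # Host maps a resource at the next page-aligned bar offset.
        off = (self.next_off + PAGE - 1) & ~(PAGE - 1)
        if off + res_size > self.size:
            raise MemoryError("bar window exhausted")
        self.next_off = off + res_size
        self.mappings[res_id] = (self.base + off, res_size)
        return self.base + off

bar = HostmemBar(base=0x1_0000_0000, size=1 << 20)
assert bar.map_resource(res_id=5, res_size=0x3000) == 0x1_0000_0000
assert bar.map_resource(res_id=6, res_size=0x1000) == 0x1_0000_3000
```

The point of the scheme is visible in the sketch: the window (bar) is claimed once by the device, so the guest's normal PCI resource assignment stays in charge of the address space, while individual resources come and go inside it.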
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
Hi Tomasz, On Mittwoch, 9. Oktober 2019 05:55:45 CEST Tomasz Figa wrote: > On Tue, Oct 8, 2019 at 12:09 AM Dmitry Morozov > > wrote: > > Hi Tomasz, > > > > On Montag, 7. Oktober 2019 16:14:13 CEST Tomasz Figa wrote: > > > Hi Dmitry, > > > > > > On Mon, Oct 7, 2019 at 11:01 PM Dmitry Morozov > > > > > > wrote: > > > > Hello, > > > > > > > > We at OpenSynergy are also working on an abstract paravirtualized > > > > video > > > > streaming device that operates input and/or output data buffers and > > > > can be > > > > used as a generic video decoder/encoder/input/output device. > > > > > > > > We would be glad to share our thoughts and contribute to the > > > > discussion. > > > > Please see some comments regarding buffer allocation inline. > > > > > > > > Best regards, > > > > Dmitry. > > > > > > > > On Samstag, 5. Oktober 2019 08:08:12 CEST Tomasz Figa wrote: > > > > > Hi Gerd, > > > > > > > > > > On Mon, Sep 23, 2019 at 5:56 PM Gerd Hoffmann wrote: > > > > > > Hi, > > > > > > > > > > > > > Our prototype implementation uses [4], which allows the > > > > > > > virtio-vdec > > > > > > > device to use buffers allocated by virtio-gpu device. > > > > > > > > > > > > > > [4] https://lkml.org/lkml/2019/9/12/157 > > > > > > > > > > First of all, thanks for taking a look at this RFC and for valuable > > > > > feedback. Sorry for the late reply. > > > > > > > > > > For reference, Keiichi is working with me and David Stevens on > > > > > accelerated video support for virtual machines and integration with > > > > > other virtual devices, like virtio-gpu for rendering or our > > > > > currently-downstream virtio-wayland for display (I believe there is > > > > > ongoing work to solve this problem in upstream too). > > > > > > > > > > > Well. I think before even discussing the protocol details we need > > > > > > a > > > > > > reasonable plan for buffer handling. I think using virtio-gpu > > > > > > buffers > > > > > > should be an optional optimization and not a requirement. 
Also > > > > > > the > > > > > > motivation for that should be clear (Let the host decoder write > > > > > > directly > > > > > > to virtio-gpu resources, to display video without copying around > > > > > > the > > > > > > decoded framebuffers from one device to another). > > > > > > > > > > Just to make sure we're on the same page, what would the buffers > > > > > come > > > > > from if we don't use this optimization? > > > > > > > > > > I can imagine a setup like this; > > > > > > > > > > 1) host device allocates host memory appropriate for usage with > > > > > host > > > > > > > > > > video decoder, > > > > > > > > > > 2) guest driver allocates arbitrary guest pages for storage > > > > > > > > > > accessible to the guest software, > > > > > > > > > > 3) guest userspace writes input for the decoder to guest pages, > > > > > 4) guest driver passes the list of pages for the input and output > > > > > > > > > > buffers to the host device > > > > > > > > > > 5) host device copies data from input guest pages to host buffer > > > > > 6) host device runs the decoding > > > > > 7) host device copies decoded frame to output guest pages > > > > > 8) guest userspace can access decoded frame from those pages; back > > > > > to 3 > > > > > > > > > > Is that something you have in mind? > > > > > > > > While GPU side allocations can be useful (especially in case of > > > > decoder), > > > > it could be more practical to stick to driver side allocations. This > > > > is > > > > also due to the fact that paravirtualized encoders and cameras are not > > > > necessarily require a GPU device. > > > > > > > > Also, the v4l2 framework already features convenient helpers for CMA > > > > and > > > > SG > > > > allocations. 
The buffers can be used in the same manner as in > > > > virtio-gpu: > > > > buffers are first attached to an already allocated buffer/resource > > > > descriptor and then are made available for processing by the device > > > > using > > > > a dedicated command from the driver. > > > > > > First of all, thanks a lot for your input. This is a relatively new > > > area of virtualization and we definitely need to collect various > > > possible perspectives in the discussion. > > > > > > From Chrome OS point of view, there are several aspects for which the > > > guest side allocation doesn't really work well: > > > 1) host-side hardware has a lot of specific low level allocation > > > requirements, like alignments, paddings, address space limitations and > > > so on, which is not something that can be (easily) taught to the guest > > > OS, > > > > I couldn't agree more. There are some changes by Greg to add support for > > querying GPU buffer metadata. Probably those changes could be integrated > > with 'a framework for cross-device buffer sharing' (something that Greg > > mentioned earlier in the thread and that would totally make sense). > > Did you mean one of Gerd's proposals? > > I think we need some
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
On Tue, Oct 8, 2019 at 12:09 AM Dmitry Morozov wrote: > > Hi Tomasz, > > On Montag, 7. Oktober 2019 16:14:13 CEST Tomasz Figa wrote: > > Hi Dmitry, > > > > On Mon, Oct 7, 2019 at 11:01 PM Dmitry Morozov > > > > wrote: > > > Hello, > > > > > > We at OpenSynergy are also working on an abstract paravirtualized video > > > streaming device that operates input and/or output data buffers and can be > > > used as a generic video decoder/encoder/input/output device. > > > > > > We would be glad to share our thoughts and contribute to the discussion. > > > Please see some comments regarding buffer allocation inline. > > > > > > Best regards, > > > Dmitry. > > > > > > On Samstag, 5. Oktober 2019 08:08:12 CEST Tomasz Figa wrote: > > > > Hi Gerd, > > > > > > > > On Mon, Sep 23, 2019 at 5:56 PM Gerd Hoffmann wrote: > > > > > Hi, > > > > > > > > > > > Our prototype implementation uses [4], which allows the virtio-vdec > > > > > > device to use buffers allocated by virtio-gpu device. > > > > > > > > > > > > [4] https://lkml.org/lkml/2019/9/12/157 > > > > > > > > First of all, thanks for taking a look at this RFC and for valuable > > > > feedback. Sorry for the late reply. > > > > > > > > For reference, Keiichi is working with me and David Stevens on > > > > accelerated video support for virtual machines and integration with > > > > other virtual devices, like virtio-gpu for rendering or our > > > > currently-downstream virtio-wayland for display (I believe there is > > > > ongoing work to solve this problem in upstream too). > > > > > > > > > Well. I think before even discussing the protocol details we need a > > > > > reasonable plan for buffer handling. I think using virtio-gpu buffers > > > > > should be an optional optimization and not a requirement. 
Also the > > > > > motivation for that should be clear (Let the host decoder write > > > > > directly > > > > > to virtio-gpu resources, to display video without copying around the > > > > > decoded framebuffers from one device to another). > > > > > > > > Just to make sure we're on the same page, what would the buffers come > > > > from if we don't use this optimization? > > > > > > > > I can imagine a setup like this; > > > > > > > > 1) host device allocates host memory appropriate for usage with host > > > > > > > > video decoder, > > > > > > > > 2) guest driver allocates arbitrary guest pages for storage > > > > > > > > accessible to the guest software, > > > > > > > > 3) guest userspace writes input for the decoder to guest pages, > > > > 4) guest driver passes the list of pages for the input and output > > > > > > > > buffers to the host device > > > > > > > > 5) host device copies data from input guest pages to host buffer > > > > 6) host device runs the decoding > > > > 7) host device copies decoded frame to output guest pages > > > > 8) guest userspace can access decoded frame from those pages; back to 3 > > > > > > > > Is that something you have in mind? > > > > > > While GPU side allocations can be useful (especially in case of decoder), > > > it could be more practical to stick to driver side allocations. This is > > > also due to the fact that paravirtualized encoders and cameras are not > > > necessarily require a GPU device. > > > > > > Also, the v4l2 framework already features convenient helpers for CMA and > > > SG > > > allocations. The buffers can be used in the same manner as in virtio-gpu: > > > buffers are first attached to an already allocated buffer/resource > > > descriptor and then are made available for processing by the device using > > > a dedicated command from the driver. > > > > First of all, thanks a lot for your input. 
This is a relatively new > > area of virtualization and we definitely need to collect various > > possible perspectives in the discussion. > > > > From Chrome OS point of view, there are several aspects for which the > > guest side allocation doesn't really work well: > > 1) host-side hardware has a lot of specific low level allocation > > requirements, like alignments, paddings, address space limitations and > > so on, which is not something that can be (easily) taught to the guest > > OS, > I couldn't agree more. There are some changes by Greg to add support for > querying GPU buffer metadata. Probably those changes could be integrated with > 'a framework for cross-device buffer sharing' (something that Greg mentioned > earlier in the thread and that would totally make sense). > Did you mean one of Gerd's proposals? I think we need some clarification there, as it's not clear to me whether the framework is host-side, guest-side or both. The approach I suggested would rely on a host-side framework and guest-side wouldn't need any special handling for sharing, because the memory would behave as on bare metal. However allocation would still need some special API to express high level buffer parameters and delegate the exact allocation requirements to the host. Currently virtio-gpu already has such interfac
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
Hi Tomasz, On Montag, 7. Oktober 2019 16:14:13 CEST Tomasz Figa wrote: > Hi Dmitry, > > On Mon, Oct 7, 2019 at 11:01 PM Dmitry Morozov > > wrote: > > Hello, > > > > We at OpenSynergy are also working on an abstract paravirtualized video > > streaming device that operates input and/or output data buffers and can be > > used as a generic video decoder/encoder/input/output device. > > > > We would be glad to share our thoughts and contribute to the discussion. > > Please see some comments regarding buffer allocation inline. > > > > Best regards, > > Dmitry. > > > > On Samstag, 5. Oktober 2019 08:08:12 CEST Tomasz Figa wrote: > > > Hi Gerd, > > > > > > On Mon, Sep 23, 2019 at 5:56 PM Gerd Hoffmann wrote: > > > > Hi, > > > > > > > > > Our prototype implementation uses [4], which allows the virtio-vdec > > > > > device to use buffers allocated by virtio-gpu device. > > > > > > > > > > [4] https://lkml.org/lkml/2019/9/12/157 > > > > > > First of all, thanks for taking a look at this RFC and for valuable > > > feedback. Sorry for the late reply. > > > > > > For reference, Keiichi is working with me and David Stevens on > > > accelerated video support for virtual machines and integration with > > > other virtual devices, like virtio-gpu for rendering or our > > > currently-downstream virtio-wayland for display (I believe there is > > > ongoing work to solve this problem in upstream too). > > > > > > > Well. I think before even discussing the protocol details we need a > > > > reasonable plan for buffer handling. I think using virtio-gpu buffers > > > > should be an optional optimization and not a requirement. Also the > > > > motivation for that should be clear (Let the host decoder write > > > > directly > > > > to virtio-gpu resources, to display video without copying around the > > > > decoded framebuffers from one device to another). > > > > > > Just to make sure we're on the same page, what would the buffers come > > > from if we don't use this optimization? 
> > > > > > I can imagine a setup like this; > > > > > > 1) host device allocates host memory appropriate for usage with host > > > > > > video decoder, > > > > > > 2) guest driver allocates arbitrary guest pages for storage > > > > > > accessible to the guest software, > > > > > > 3) guest userspace writes input for the decoder to guest pages, > > > 4) guest driver passes the list of pages for the input and output > > > > > > buffers to the host device > > > > > > 5) host device copies data from input guest pages to host buffer > > > 6) host device runs the decoding > > > 7) host device copies decoded frame to output guest pages > > > 8) guest userspace can access decoded frame from those pages; back to 3 > > > > > > Is that something you have in mind? > > > > While GPU side allocations can be useful (especially in case of decoder), > > it could be more practical to stick to driver side allocations. This is > > also due to the fact that paravirtualized encoders and cameras are not > > necessarily require a GPU device. > > > > Also, the v4l2 framework already features convenient helpers for CMA and > > SG > > allocations. The buffers can be used in the same manner as in virtio-gpu: > > buffers are first attached to an already allocated buffer/resource > > descriptor and then are made available for processing by the device using > > a dedicated command from the driver. > > First of all, thanks a lot for your input. This is a relatively new > area of virtualization and we definitely need to collect various > possible perspectives in the discussion. > > From Chrome OS point of view, there are several aspects for which the > guest side allocation doesn't really work well: > 1) host-side hardware has a lot of specific low level allocation > requirements, like alignments, paddings, address space limitations and > so on, which is not something that can be (easily) taught to the guest > OS, I couldn't agree more. 
There are some changes by Greg to add support for querying GPU buffer metadata. Probably those changes could be integrated with 'a framework for cross-device buffer sharing' (something that Greg mentioned earlier in the thread and that would totally make sense). > 2) allocation system is designed to be centralized, like Android > gralloc, because there is almost never a case when a buffer is to be > used only with 1 specific device. 99% of the cases are pipelines like > decoder -> GPU/display, camera -> encoder + GPU/display, GPU -> > encoder and so on, which means that allocations need to take into > account multiple hardware constraints. > 3) protected content decoding: the memory for decoded video frames > must not be accessible to the guest at all This looks like a valid use case. Would it also be possible for instance to allocate mem from a secure ION heap on the guest and then to provide the sgt to the device? We don't necessarily need to map that sgt for guest access. Best regards, Dmitry. > > Th
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
Hi Dmitry, On Mon, Oct 7, 2019 at 11:01 PM Dmitry Morozov wrote: > > Hello, > > We at OpenSynergy are also working on an abstract paravirtualized video > streaming device that operates input and/or output data buffers and can be > used > as a generic video decoder/encoder/input/output device. > > We would be glad to share our thoughts and contribute to the discussion. > Please see some comments regarding buffer allocation inline. > > Best regards, > Dmitry. > > On Samstag, 5. Oktober 2019 08:08:12 CEST Tomasz Figa wrote: > > Hi Gerd, > > > > On Mon, Sep 23, 2019 at 5:56 PM Gerd Hoffmann wrote: > > > Hi, > > > > > > > Our prototype implementation uses [4], which allows the virtio-vdec > > > > device to use buffers allocated by virtio-gpu device. > > > > > > > > [4] https://lkml.org/lkml/2019/9/12/157 > > > > First of all, thanks for taking a look at this RFC and for valuable > > feedback. Sorry for the late reply. > > > > For reference, Keiichi is working with me and David Stevens on > > accelerated video support for virtual machines and integration with > > other virtual devices, like virtio-gpu for rendering or our > > currently-downstream virtio-wayland for display (I believe there is > > ongoing work to solve this problem in upstream too). > > > > > Well. I think before even discussing the protocol details we need a > > > reasonable plan for buffer handling. I think using virtio-gpu buffers > > > should be an optional optimization and not a requirement. Also the > > > motivation for that should be clear (Let the host decoder write directly > > > to virtio-gpu resources, to display video without copying around the > > > decoded framebuffers from one device to another). > > > > Just to make sure we're on the same page, what would the buffers come > > from if we don't use this optimization? 
> > > > I can imagine a setup like this; > > 1) host device allocates host memory appropriate for usage with host > > video decoder, > > 2) guest driver allocates arbitrary guest pages for storage > > accessible to the guest software, > > 3) guest userspace writes input for the decoder to guest pages, > > 4) guest driver passes the list of pages for the input and output > > buffers to the host device > > 5) host device copies data from input guest pages to host buffer > > 6) host device runs the decoding > > 7) host device copies decoded frame to output guest pages > > 8) guest userspace can access decoded frame from those pages; back to 3 > > > > Is that something you have in mind? > While GPU side allocations can be useful (especially in case of decoder), it > could be more practical to stick to driver side allocations. This is also due > to the fact that paravirtualized encoders and cameras are not necessarily > require a GPU device. > > Also, the v4l2 framework already features convenient helpers for CMA and SG > allocations. The buffers can be used in the same manner as in virtio-gpu: > buffers are first attached to an already allocated buffer/resource descriptor > and > then are made available for processing by the device using a dedicated command > from the driver. First of all, thanks a lot for your input. This is a relatively new area of virtualization and we definitely need to collect various possible perspectives in the discussion. From Chrome OS point of view, there are several aspects for which the guest side allocation doesn't really work well: 1) host-side hardware has a lot of specific low level allocation requirements, like alignments, paddings, address space limitations and so on, which is not something that can be (easily) taught to the guest OS, 2) allocation system is designed to be centralized, like Android gralloc, because there is almost never a case when a buffer is to be used only with 1 specific device. 
99% of the cases are pipelines like decoder -> GPU/display, camera -> encoder + GPU/display, GPU -> encoder and so on, which means that allocations need to take into account multiple hardware constraints. 3) protected content decoding: the memory for decoded video frames must not be accessible to the guest at all. That said, the common desktop Linux model is based on allocating from the producer device (which is why videobuf2 has allocation capability) and we definitely need to consider this model, even if we just think about Linux V4L2 compliance. That's why I'm suggesting the unified memory handling based on guest physical addresses, which would handle both guest-allocated and host-allocated memory. Best regards, Tomasz > > > > > Referencing virtio-gpu buffers needs a better plan than just re-using > > > virtio-gpu resource handles. The handles are device-specific. What if > > > there are multiple virtio-gpu devices present in the guest? > > > > > > I think we need a framework for cross-device buffer sharing. One > > > possible option would be to have some kind of buffer registry, where > > > buffers can be registered for cross-device sharing and get a unique > > > id (
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
Hello, We at OpenSynergy are also working on an abstract paravirtualized video streaming device that operates input and/or output data buffers and can be used as a generic video decoder/encoder/input/output device. We would be glad to share our thoughts and contribute to the discussion. Please see some comments regarding buffer allocation inline. Best regards, Dmitry. On Samstag, 5. Oktober 2019 08:08:12 CEST Tomasz Figa wrote: > Hi Gerd, > > On Mon, Sep 23, 2019 at 5:56 PM Gerd Hoffmann wrote: > > Hi, > > > > > Our prototype implementation uses [4], which allows the virtio-vdec > > > device to use buffers allocated by virtio-gpu device. > > > > > > [4] https://lkml.org/lkml/2019/9/12/157 > > First of all, thanks for taking a look at this RFC and for valuable > feedback. Sorry for the late reply. > > For reference, Keiichi is working with me and David Stevens on > accelerated video support for virtual machines and integration with > other virtual devices, like virtio-gpu for rendering or our > currently-downstream virtio-wayland for display (I believe there is > ongoing work to solve this problem in upstream too). > > > Well. I think before even discussing the protocol details we need a > > reasonable plan for buffer handling. I think using virtio-gpu buffers > > should be an optional optimization and not a requirement. Also the > > motivation for that should be clear (Let the host decoder write directly > > to virtio-gpu resources, to display video without copying around the > > decoded framebuffers from one device to another). > > Just to make sure we're on the same page, what would the buffers come > from if we don't use this optimization? 
> > I can imagine a setup like this; > 1) host device allocates host memory appropriate for usage with host > video decoder, > 2) guest driver allocates arbitrary guest pages for storage > accessible to the guest software, > 3) guest userspace writes input for the decoder to guest pages, > 4) guest driver passes the list of pages for the input and output > buffers to the host device > 5) host device copies data from input guest pages to host buffer > 6) host device runs the decoding > 7) host device copies decoded frame to output guest pages > 8) guest userspace can access decoded frame from those pages; back to 3 > > Is that something you have in mind? While GPU side allocations can be useful (especially in case of decoder), it could be more practical to stick to driver side allocations. This is also due to the fact that paravirtualized encoders and cameras do not necessarily require a GPU device. Also, the v4l2 framework already features convenient helpers for CMA and SG allocations. The buffers can be used in the same manner as in virtio-gpu: buffers are first attached to an already allocated buffer/resource descriptor and then are made available for processing by the device using a dedicated command from the driver. > > > Referencing virtio-gpu buffers needs a better plan than just re-using > > virtio-gpu resource handles. The handles are device-specific. What if > > there are multiple virtio-gpu devices present in the guest? > > > > I think we need a framework for cross-device buffer sharing. One > > possible option would be to have some kind of buffer registry, where > > buffers can be registered for cross-device sharing and get a unique > > id (a uuid maybe?). Drivers would typically register buffers on > > dma-buf export. > > This approach could possibly let us handle this transparently to > importers, which would work for guest kernel subsystems that rely on > the ability to handle buffers like native memory (e.g. 
having a > sgtable or DMA address) for them. > > How about allocating guest physical addresses for memory corresponding > to those buffers? On the virtio-gpu example, that could work like > this: > - by default a virtio-gpu buffer has only a resource handle, > - VIRTIO_GPU_RESOURCE_EXPORT command could be called to have the > virtio-gpu device export the buffer to a host framework (inside the > VMM) that would allocate guest page addresses for it, which the > command would return in a response to the guest, > - virtio-gpu driver could then create a regular DMA-buf object for > such memory, because it's just backed by pages (even though they may > not be accessible to the guest; just like in the case of TrustZone > memory protection on bare metal systems), > - any consumer would be able to handle such buffer like a regular > guest memory, passing low-level scatter-gather tables to the host as > buffer descriptors - this would nicely integrate with the basic case > without buffer sharing, as described above. > > Another interesting side effect of the above approach would be the > ease of integration with virtio-iommu. If the virtio master device is > put behind a virtio-iommu, the guest page addresses become the input > to iommu page tables and IOVA addresses go to the host via the virtio > master
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
Hi Gerd,

On Mon, Sep 23, 2019 at 5:56 PM Gerd Hoffmann wrote:
> Hi,
>
> > Our prototype implementation uses [4], which allows the virtio-vdec device to use buffers allocated by virtio-gpu device.
> >
> > [4] https://lkml.org/lkml/2019/9/12/157

First of all, thanks for taking a look at this RFC and for valuable feedback. Sorry for the late reply.

For reference, Keiichi is working with me and David Stevens on accelerated video support for virtual machines and integration with other virtual devices, like virtio-gpu for rendering or our currently-downstream virtio-wayland for display (I believe there is ongoing work to solve this problem in upstream too).

> Well. I think before even discussing the protocol details we need a reasonable plan for buffer handling. I think using virtio-gpu buffers should be an optional optimization and not a requirement. Also the motivation for that should be clear (Let the host decoder write directly to virtio-gpu resources, to display video without copying around the decoded framebuffers from one device to another).

Just to make sure we're on the same page, what would the buffers come from if we don't use this optimization?

I can imagine a setup like this:
1) host device allocates host memory appropriate for usage with the host video decoder,
2) guest driver allocates arbitrary guest pages for storage accessible to the guest software,
3) guest userspace writes input for the decoder to guest pages,
4) guest driver passes the list of pages for the input and output buffers to the host device,
5) host device copies data from input guest pages to the host buffer,
6) host device runs the decoding,
7) host device copies the decoded frame to output guest pages,
8) guest userspace can access the decoded frame from those pages; back to 3.

Is that something you have in mind?

> Referencing virtio-gpu buffers needs a better plan than just re-using virtio-gpu resource handles. The handles are device-specific.
> What if there are multiple virtio-gpu devices present in the guest?
>
> I think we need a framework for cross-device buffer sharing. One possible option would be to have some kind of buffer registry, where buffers can be registered for cross-device sharing and get a unique id (a uuid maybe?). Drivers would typically register buffers on dma-buf export.

This approach could possibly let us handle this transparently to importers, which would work for guest kernel subsystems that rely on the ability to handle buffers like native memory (e.g. having an sgtable or DMA address) for them.

How about allocating guest physical addresses for memory corresponding to those buffers? Using virtio-gpu as an example, that could work like this:
- by default a virtio-gpu buffer has only a resource handle,
- a VIRTIO_GPU_RESOURCE_EXPORT command could be called to have the virtio-gpu device export the buffer to a host framework (inside the VMM) that would allocate guest page addresses for it, which the command would return in a response to the guest,
- the virtio-gpu driver could then create a regular DMA-buf object for such memory, because it's just backed by pages (even though they may not be accessible to the guest; just like in the case of TrustZone memory protection on bare metal systems),
- any consumer would be able to handle such a buffer like regular guest memory, passing low-level scatter-gather tables to the host as buffer descriptors - this would nicely integrate with the basic case without buffer sharing, as described above.

Another interesting side effect of the above approach would be the ease of integration with virtio-iommu. If the virtio master device is put behind a virtio-iommu, the guest page addresses become the input to iommu page tables and IOVA addresses go to the host via the virtio master device protocol, inside the low-level scatter-gather tables.

What do you think?
Best regards,
Tomasz

> Another option would be to pass around both buffer handle and buffer owner, i.e. instead of "u32 handle" have something like this:
>
> struct buffer_reference {
>     enum device_type; /* pci, virtio-mmio, ... */
>     union device_address {
>         struct pci_address pci_addr;
>         u64 virtio_mmio_addr;
>         [ ... ]
>     };
>     u64 device_buffer_handle; /* device-specific, virtio-gpu could use
>                                  resource ids here */
> };
>
> cheers,
> Gerd
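To make the export-plus-scatter-gather idea above concrete, here is a small, self-contained sketch (an illustration, not part of any spec) of how a guest driver could turn the guest page addresses returned by a hypothetical VIRTIO_GPU_RESOURCE_EXPORT response into the low-level scatter-gather table handed to the host; the `virtio_sg_entry` layout and the `build_sg` helper are assumptions made for this example:

```c
/* Sketch (not part of the spec): describing an exported buffer to the
 * host as a scatter-gather list of guest page addresses. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

struct virtio_sg_entry {        /* one contiguous guest-physical run */
    uint64_t addr;              /* guest page address */
    uint32_t len;               /* length in bytes */
};

/* Fill 'sg' from the guest page addresses of a (possibly
 * non-contiguous) buffer; 'pages' stands in for the addresses a
 * hypothetical VIRTIO_GPU_RESOURCE_EXPORT response would return.
 * Returns the number of sg entries produced. */
static size_t build_sg(struct virtio_sg_entry *sg, const uint64_t *pages,
                       size_t npages, size_t bytes)
{
    size_t i, n = 0;
    for (i = 0; i < npages && bytes > 0; i++) {
        uint32_t chunk = bytes < PAGE_SIZE ? (uint32_t)bytes : PAGE_SIZE;
        /* merge with the previous entry when pages are contiguous */
        if (n && sg[n - 1].addr + sg[n - 1].len == pages[i]) {
            sg[n - 1].len += chunk;
        } else {
            sg[n].addr = pages[i];
            sg[n].len = chunk;
            n++;
        }
        bytes -= chunk;
    }
    return n;
}
```

Contiguous pages collapse into a single entry, so a fully contiguous buffer degenerates to one descriptor, which matches the "basic case without buffer sharing" described above.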
Re: [virtio-dev] [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
Hi,

> Our prototype implementation uses [4], which allows the virtio-vdec device to use buffers allocated by virtio-gpu device.
>
> [4] https://lkml.org/lkml/2019/9/12/157

Well. I think before even discussing the protocol details we need a reasonable plan for buffer handling. I think using virtio-gpu buffers should be an optional optimization and not a requirement. Also the motivation for that should be clear (Let the host decoder write directly to virtio-gpu resources, to display video without copying around the decoded framebuffers from one device to another).

Referencing virtio-gpu buffers needs a better plan than just re-using virtio-gpu resource handles. The handles are device-specific. What if there are multiple virtio-gpu devices present in the guest?

I think we need a framework for cross-device buffer sharing. One possible option would be to have some kind of buffer registry, where buffers can be registered for cross-device sharing and get a unique id (a uuid maybe?). Drivers would typically register buffers on dma-buf export.

Another option would be to pass around both buffer handle and buffer owner, i.e. instead of "u32 handle" have something like this:

struct buffer_reference {
    enum device_type; /* pci, virtio-mmio, ... */
    union device_address {
        struct pci_address pci_addr;
        u64 virtio_mmio_addr;
        [ ... ]
    };
    u64 device_buffer_handle; /* device-specific, virtio-gpu could use
                                 resource ids here */
};

cheers,
Gerd
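For readers who want to experiment with the idea, the sketch above can be turned into compilable C as follows; the enum members, the `pci_address` layout, and the equality helper are illustrative assumptions for this example, not something defined by virtio:

```c
/* A compilable rendering of the buffer_reference sketch above.
 * All concrete values and layouts here are illustrative assumptions. */
#include <assert.h>
#include <stdint.h>

enum device_type {
    DEVICE_TYPE_PCI,
    DEVICE_TYPE_VIRTIO_MMIO,
};

struct pci_address {
    uint16_t domain;
    uint8_t bus, slot, function;
};

struct buffer_reference {
    enum device_type type;          /* pci, virtio-mmio, ... */
    union {
        struct pci_address pci_addr;
        uint64_t virtio_mmio_addr;
    } addr;
    uint64_t device_buffer_handle;  /* device-specific; virtio-gpu could
                                     * use resource ids here */
};

/* Two references denote the same buffer only if both the owning device
 * and the per-device handle match. */
static int buffer_reference_equal(const struct buffer_reference *a,
                                  const struct buffer_reference *b)
{
    if (a->type != b->type ||
        a->device_buffer_handle != b->device_buffer_handle)
        return 0;
    switch (a->type) {
    case DEVICE_TYPE_PCI:
        return a->addr.pci_addr.domain == b->addr.pci_addr.domain &&
               a->addr.pci_addr.bus == b->addr.pci_addr.bus &&
               a->addr.pci_addr.slot == b->addr.pci_addr.slot &&
               a->addr.pci_addr.function == b->addr.pci_addr.function;
    case DEVICE_TYPE_VIRTIO_MMIO:
        return a->addr.virtio_mmio_addr == b->addr.virtio_mmio_addr;
    }
    return 0;
}
```

The point of the (device, handle) pair is exactly what the comparison function shows: a resource id alone is ambiguous once more than one virtio-gpu device is present.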
Re: [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
I shared a PDF version of this RFC on Google Drive:
https://drive.google.com/drive/folders/1hed-mTVI7dj0M_iab4DTfx5kPMLoeX8R

On Thu, Sep 19, 2019 at 8:15 PM Keiichi Watanabe wrote: > > Hi Hans, > Thank you for your feedback. > > On Thu, Sep 19, 2019 at 6:53 PM Hans Verkuil wrote: > > > > Hi Keiichi, > > > > On 9/19/19 11:34 AM, Keiichi Watanabe wrote: > > > [Resending because of some issues with sending to virtio-dev. Sorry for > > > the noise.] > > > > > > This patch proposes virtio specification for new virtio video decode > > > device. > > > This device provides the functionality of hardware accelerated video > > > decoding from encoded video contents provided by the guest into frame > > > buffers accessible by the guest. > > > > > > We have prototype implementation for VMs on Chrome OS: > > > * virtio-vdec device in crosvm [1] > > > * virtio-vdec driver in Linux kernel v4.19 [2] > > > - This driver follows V4L2 stateful video decoder API [3]. > > > > > > Our prototype implementation uses [4], which allows the virtio-vdec > > > device to use buffers allocated by virtio-gpu device. > > > > > > Any feedback would be greatly appreciated. Thank you. > > > > I'm not a virtio expert, but as I understand it the virtio-vdec driver > > looks like a regular v4l2 stateful decoder devices to the guest, while > > on the host there is a driver (or something like that) that maps the > > virtio-vdec requests to the actual decoder hardware, right? > > > > What concerns me a bit (but there may be good reasons for this) is that > > this virtio driver is so specific for stateful decoders. > > > > We aim to design a platform-independent interface. The virtio-vdec protocol > should be designed to be workable regardless of APIs, OS, and platforms > eventually. > > Our prototype virtio-vdec device translates the virtio-vdec protocol to a > Chrome's video decode acceleration API instead of talking to hardware decoders > directly.
This Chrome's API is an abstract layer for multiple decoder APIs on > Linux such as V4L2 stateful, V4L2 slice, Intel's VAAPI. > > That is to say the guest driver translates V4L2 stateful API to virtio-vdec > and > the host device translates virtio-vdec to Chrome's API. So, I could say that > this is already more general than a mere V4L2 stateful API wrapper, at least. > > I'd appreciate if you could let me know some parts are still specific to V4L2. > > > > How does this scale to stateful encoders? Stateless codecs? Other M2M > > devices like deinterlacers or colorspace converters? What about webcams? > > > > We're designing virtio protocol for encoder as well, but we are at an early > stage. So, I'm not sure if we should/can handle decoder and encoder in one > protocol. I don't have any plans for other media devices. > > > > In other words, I would like to see a bigger picture here. > > > > Note that there is also an effort for Xen to expose webcams to a guest: > > > > https://www.spinics.net/lists/linux-media/msg148629.html > > > > Good to know. Thanks. > > > This may or may not be of interest. This was an RFC only, and I haven't > > seen any follow-up patches with actual code. > > > > There will be a half-day meeting of media developers during the ELCE > > in October about codecs. I know Alexandre and Tomasz will be there. > > It might be a good idea to discuss this in more detail if needed. > > > > Sounds good. They are closely working with me. 
> > Regards, > Keiichi > > > Regards, > > > > Hans > > > > > > > > [1] > > > https://chromium-review.googlesource.com/q/hashtag:%22virtio-vdec-device%22+(status:open%20OR%20status:merged) > > > [2] > > > https://chromium-review.googlesource.com/q/hashtag:%22virtio-vdec-driver%22+(status:open%20OR%20status:merged) > > > [3] https://hverkuil.home.xs4all.nl/codec-api/uapi/v4l/dev-decoder.html > > > (to be merged to Linux 5.4) > > > [4] https://lkml.org/lkml/2019/9/12/157 > > > > > > Signed-off-by: Keiichi Watanabe > > > --- > > > content.tex | 1 + > > > virtio-vdec.tex | 750 > > > 2 files changed, 751 insertions(+) > > > create mode 100644 virtio-vdec.tex > > > > > > diff --git a/content.tex b/content.tex > > > index 37a2190..b57d4a9 100644 > > > --- a/content.tex > > > +++ b/content.tex > > > @@ -5682,6 +5682,7 @@ \subsubsection{Legacy Interface: Framing > > > Requirements}\label{sec:Device > > > \input{virtio-input.tex} > > > \input{virtio-crypto.tex} > > > \input{virtio-vsock.tex} > > > +\input{virtio-vdec.tex} > > > > > > \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits} > > > > > > diff --git a/virtio-vdec.tex b/virtio-vdec.tex > > > new file mode 100644 > > > index 000..d117129 > > > --- /dev/null > > > +++ b/virtio-vdec.tex > > > @@ -0,0 +1,750 @@ > > > +\section{Video Decode Device} > > > +\label{sec:Device Types / Video Decode Device} > > > + > > > +virtio-vdec is a virtio based video decoder. This device provides the > > > +functionality of hardware acc
Re: [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
Hi Hans,

Thank you for your feedback.

On Thu, Sep 19, 2019 at 6:53 PM Hans Verkuil wrote:
> Hi Keiichi,
>
> On 9/19/19 11:34 AM, Keiichi Watanabe wrote:
> > [Resending because of some issues with sending to virtio-dev. Sorry for the noise.]
> >
> > This patch proposes virtio specification for new virtio video decode device. This device provides the functionality of hardware accelerated video decoding from encoded video contents provided by the guest into frame buffers accessible by the guest.
> >
> > We have prototype implementation for VMs on Chrome OS:
> > * virtio-vdec device in crosvm [1]
> > * virtio-vdec driver in Linux kernel v4.19 [2]
> >   - This driver follows V4L2 stateful video decoder API [3].
> >
> > Our prototype implementation uses [4], which allows the virtio-vdec device to use buffers allocated by virtio-gpu device.
> >
> > Any feedback would be greatly appreciated. Thank you.
>
> I'm not a virtio expert, but as I understand it the virtio-vdec driver looks like a regular v4l2 stateful decoder devices to the guest, while on the host there is a driver (or something like that) that maps the virtio-vdec requests to the actual decoder hardware, right?
>
> What concerns me a bit (but there may be good reasons for this) is that this virtio driver is so specific for stateful decoders.

We aim to design a platform-independent interface. The virtio-vdec protocol should eventually work regardless of API, OS, and platform.

Our prototype virtio-vdec device translates the virtio-vdec protocol to Chrome's video decode acceleration API instead of talking to hardware decoders directly. This Chrome API is an abstraction layer over multiple decoder APIs on Linux, such as V4L2 stateful, V4L2 slice, and Intel's VAAPI.

That is to say, the guest driver translates the V4L2 stateful API to virtio-vdec and the host device translates virtio-vdec to Chrome's API. So, I could say that this is already more general than a mere V4L2 stateful API wrapper, at least.

I'd appreciate it if you could let me know if some parts are still specific to V4L2.

> How does this scale to stateful encoders? Stateless codecs? Other M2M devices like deinterlacers or colorspace converters? What about webcams?

We're designing a virtio protocol for the encoder as well, but we are at an early stage. So, I'm not sure if we should/can handle decoder and encoder in one protocol. I don't have any plans for other media devices.

> In other words, I would like to see a bigger picture here.
>
> Note that there is also an effort for Xen to expose webcams to a guest:
>
> https://www.spinics.net/lists/linux-media/msg148629.html

Good to know. Thanks.

> This may or may not be of interest. This was an RFC only, and I haven't seen any follow-up patches with actual code.
>
> There will be a half-day meeting of media developers during the ELCE in October about codecs. I know Alexandre and Tomasz will be there. It might be a good idea to discuss this in more detail if needed.

Sounds good. They are closely working with me.
Regards, Keiichi > Regards, > > Hans > > > > > [1] > > https://chromium-review.googlesource.com/q/hashtag:%22virtio-vdec-device%22+(status:open%20OR%20status:merged) > > [2] > > https://chromium-review.googlesource.com/q/hashtag:%22virtio-vdec-driver%22+(status:open%20OR%20status:merged) > > [3] https://hverkuil.home.xs4all.nl/codec-api/uapi/v4l/dev-decoder.html (to > > be merged to Linux 5.4) > > [4] https://lkml.org/lkml/2019/9/12/157 > > > > Signed-off-by: Keiichi Watanabe > > --- > > content.tex | 1 + > > virtio-vdec.tex | 750 > > 2 files changed, 751 insertions(+) > > create mode 100644 virtio-vdec.tex > > > > diff --git a/content.tex b/content.tex > > index 37a2190..b57d4a9 100644 > > --- a/content.tex > > +++ b/content.tex > > @@ -5682,6 +5682,7 @@ \subsubsection{Legacy Interface: Framing > > Requirements}\label{sec:Device > > \input{virtio-input.tex} > > \input{virtio-crypto.tex} > > \input{virtio-vsock.tex} > > +\input{virtio-vdec.tex} > > > > \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits} > > > > diff --git a/virtio-vdec.tex b/virtio-vdec.tex > > new file mode 100644 > > index 000..d117129 > > --- /dev/null > > +++ b/virtio-vdec.tex > > @@ -0,0 +1,750 @@ > > +\section{Video Decode Device} > > +\label{sec:Device Types / Video Decode Device} > > + > > +virtio-vdec is a virtio based video decoder. This device provides the > > +functionality of hardware accelerated video decoding from encoded > > +video contents provided by the guest into frame buffers accessible by > > +the guest. > > + > > +\subsection{Device ID} > > +\label{sec:Device Types / Video Decode Device / Device ID} > > + > > +28 > > + > > +\subsection{Virtqueues} > > +\label{sec:Device Types / Video Decode Device / Virtqueues} > > + > > +\begin{description} > > +\item[0] outq - queue for sending requests from t
Re: [PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
Hi Keiichi,

On 9/19/19 11:34 AM, Keiichi Watanabe wrote:
> [Resending because of some issues with sending to virtio-dev. Sorry for the noise.]
>
> This patch proposes virtio specification for new virtio video decode device. This device provides the functionality of hardware accelerated video decoding from encoded video contents provided by the guest into frame buffers accessible by the guest.
>
> We have prototype implementation for VMs on Chrome OS:
> * virtio-vdec device in crosvm [1]
> * virtio-vdec driver in Linux kernel v4.19 [2]
>   - This driver follows V4L2 stateful video decoder API [3].
>
> Our prototype implementation uses [4], which allows the virtio-vdec device to use buffers allocated by virtio-gpu device.
>
> Any feedback would be greatly appreciated. Thank you.

I'm not a virtio expert, but as I understand it the virtio-vdec driver looks like a regular v4l2 stateful decoder device to the guest, while on the host there is a driver (or something like that) that maps the virtio-vdec requests to the actual decoder hardware, right?

What concerns me a bit (but there may be good reasons for this) is that this virtio driver is so specific to stateful decoders. How does this scale to stateful encoders? Stateless codecs? Other M2M devices like deinterlacers or colorspace converters? What about webcams?

In other words, I would like to see a bigger picture here.

Note that there is also an effort for Xen to expose webcams to a guest:

https://www.spinics.net/lists/linux-media/msg148629.html

This may or may not be of interest. This was an RFC only, and I haven't seen any follow-up patches with actual code.

There will be a half-day meeting of media developers during the ELCE in October about codecs. I know Alexandre and Tomasz will be there. It might be a good idea to discuss this in more detail if needed.
Regards, Hans > > [1] > https://chromium-review.googlesource.com/q/hashtag:%22virtio-vdec-device%22+(status:open%20OR%20status:merged) > [2] > https://chromium-review.googlesource.com/q/hashtag:%22virtio-vdec-driver%22+(status:open%20OR%20status:merged) > [3] https://hverkuil.home.xs4all.nl/codec-api/uapi/v4l/dev-decoder.html (to > be merged to Linux 5.4) > [4] https://lkml.org/lkml/2019/9/12/157 > > Signed-off-by: Keiichi Watanabe > --- > content.tex | 1 + > virtio-vdec.tex | 750 > 2 files changed, 751 insertions(+) > create mode 100644 virtio-vdec.tex > > diff --git a/content.tex b/content.tex > index 37a2190..b57d4a9 100644 > --- a/content.tex > +++ b/content.tex > @@ -5682,6 +5682,7 @@ \subsubsection{Legacy Interface: Framing > Requirements}\label{sec:Device > \input{virtio-input.tex} > \input{virtio-crypto.tex} > \input{virtio-vsock.tex} > +\input{virtio-vdec.tex} > > \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits} > > diff --git a/virtio-vdec.tex b/virtio-vdec.tex > new file mode 100644 > index 000..d117129 > --- /dev/null > +++ b/virtio-vdec.tex > @@ -0,0 +1,750 @@ > +\section{Video Decode Device} > +\label{sec:Device Types / Video Decode Device} > + > +virtio-vdec is a virtio based video decoder. This device provides the > +functionality of hardware accelerated video decoding from encoded > +video contents provided by the guest into frame buffers accessible by > +the guest. > + > +\subsection{Device ID} > +\label{sec:Device Types / Video Decode Device / Device ID} > + > +28 > + > +\subsection{Virtqueues} > +\label{sec:Device Types / Video Decode Device / Virtqueues} > + > +\begin{description} > +\item[0] outq - queue for sending requests from the driver to the > + device > +\item[1] inq - queue for sending requests from the device to the > + driver > +\end{description} > + > +Each queue is used uni-directionally. 
outq is used to send requests > +from the driver to the device (i.e., guest requests) and inq is used > +to send requests in the other direction (i.e., host requests). > + > +\subsection{Feature bits} > +\label{sec:Device Types / Video Decode Device / Feature bits} > + > +There are currently no feature bits defined for this device. > + > +\subsection{Device configuration layout} > +\label{sec:Device Types / Video Decode Device / Device configuration layout} > + > +None. > + > +\subsection{Device Requirements: Device Initialization} > +\label{sec:Device Types / Video Decode Device / Device Requirements: Device > Initialization} > + > +The virtqueues are initialized. > + > +\subsection{Device Operation} > +\label{sec:Device Types / Video Decode Device / Device Operation} > + > +\subsubsection{Video Buffers} > +\label{sec:Device Types / Video Decode Device / Device Operation / Buffers} > + > +A virtio-vdec driver and a device use two types of video buffers: > +\emph{bitstream buffer} and \emph{frame buffer}. A bitstream buffer > +contains encoded video stream data. This buffer is similar to an > +OUTPUT buffer for Video for Linux Two (V4L2) API. A frame buff
[PATCH] [RFC RESEND] vdec: Add virtio video decode device specification
[Resending because of some issues with sending to virtio-dev. Sorry for the noise.]

This patch proposes a virtio specification for a new virtio video decode device. This device provides the functionality of hardware-accelerated video decoding from encoded video contents provided by the guest into frame buffers accessible by the guest.

We have a prototype implementation for VMs on Chrome OS:
* virtio-vdec device in crosvm [1]
* virtio-vdec driver in Linux kernel v4.19 [2]
  - This driver follows the V4L2 stateful video decoder API [3].

Our prototype implementation uses [4], which allows the virtio-vdec device to use buffers allocated by the virtio-gpu device.

Any feedback would be greatly appreciated. Thank you.

[1] https://chromium-review.googlesource.com/q/hashtag:%22virtio-vdec-device%22+(status:open%20OR%20status:merged)
[2] https://chromium-review.googlesource.com/q/hashtag:%22virtio-vdec-driver%22+(status:open%20OR%20status:merged)
[3] https://hverkuil.home.xs4all.nl/codec-api/uapi/v4l/dev-decoder.html (to be merged to Linux 5.4)
[4] https://lkml.org/lkml/2019/9/12/157

Signed-off-by: Keiichi Watanabe
---
 content.tex     |   1 +
 virtio-vdec.tex | 750
 2 files changed, 751 insertions(+)
 create mode 100644 virtio-vdec.tex

diff --git a/content.tex b/content.tex
index 37a2190..b57d4a9 100644
--- a/content.tex
+++ b/content.tex
@@ -5682,6 +5682,7 @@ \subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
 \input{virtio-input.tex}
 \input{virtio-crypto.tex}
 \input{virtio-vsock.tex}
+\input{virtio-vdec.tex}

 \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}

diff --git a/virtio-vdec.tex b/virtio-vdec.tex
new file mode 100644
index 000..d117129
--- /dev/null
+++ b/virtio-vdec.tex
@@ -0,0 +1,750 @@
+\section{Video Decode Device}
+\label{sec:Device Types / Video Decode Device}
+
+virtio-vdec is a virtio based video decoder.
This device provides the
+functionality of hardware accelerated video decoding from encoded
+video contents provided by the guest into frame buffers accessible by
+the guest.
+
+\subsection{Device ID}
+\label{sec:Device Types / Video Decode Device / Device ID}
+
+28
+
+\subsection{Virtqueues}
+\label{sec:Device Types / Video Decode Device / Virtqueues}
+
+\begin{description}
+\item[0] outq - queue for sending requests from the driver to the
+  device
+\item[1] inq - queue for sending requests from the device to the
+  driver
+\end{description}
+
+Each queue is used uni-directionally. outq is used to send requests
+from the driver to the device (i.e., guest requests) and inq is used
+to send requests in the other direction (i.e., host requests).
+
+\subsection{Feature bits}
+\label{sec:Device Types / Video Decode Device / Feature bits}
+
+There are currently no feature bits defined for this device.
+
+\subsection{Device configuration layout}
+\label{sec:Device Types / Video Decode Device / Device configuration layout}
+
+None.
+
+\subsection{Device Requirements: Device Initialization}
+\label{sec:Device Types / Video Decode Device / Device Requirements: Device Initialization}
+
+The virtqueues are initialized.
+
+\subsection{Device Operation}
+\label{sec:Device Types / Video Decode Device / Device Operation}
+
+\subsubsection{Video Buffers}
+\label{sec:Device Types / Video Decode Device / Device Operation / Buffers}
+
+A virtio-vdec driver and a device use two types of video buffers:
+\emph{bitstream buffer} and \emph{frame buffer}. A bitstream buffer
+contains encoded video stream data. This buffer is similar to an
+OUTPUT buffer for the Video for Linux Two (V4L2) API. A frame buffer
+contains decoded video frame data, like CAPTURE buffers for the V4L2
+API. The driver and the device share these buffers, and each buffer
+is identified by a unique integer called a \emph{resource handle}.
+
+\subsubsection{Guest Request}
+
+The driver queues requests to the outq virtqueue.
The device MAY
+process requests out-of-order. All requests on outq use the following
+structure:
+
+\begin{lstlisting}
+enum virtio_vdec_guest_req_type {
+    VIRTIO_VDEC_GUEST_REQ_UNDEFINED = 0,
+
+    /* Global */
+    VIRTIO_VDEC_GUEST_REQ_QUERY = 0x0100,
+
+    /* Per instance */
+    VIRTIO_VDEC_GUEST_REQ_OPEN = 0x0200,
+    VIRTIO_VDEC_GUEST_REQ_SET_BUFFER_COUNT,
+    VIRTIO_VDEC_GUEST_REQ_REGISTER_BUFFER,
+    VIRTIO_VDEC_GUEST_REQ_ACK_STREAM_INFO,
+    VIRTIO_VDEC_GUEST_REQ_FRAME_BUFFER,
+    VIRTIO_VDEC_GUEST_REQ_BITSTREAM_BUFFER,
+    VIRTIO_VDEC_GUEST_REQ_DRAIN,
+    VIRTIO_VDEC_GUEST_REQ_FLUSH,
+    VIRTIO_VDEC_GUEST_REQ_CLOSE,
+};
+
+struct virtio_vdec_guest_req {
+    le32 type;
+    le32 instance_id;
+    union {
+        struct virtio_vdec_guest_req_open open;
+        struct virtio_vdec_guest_req_set_buffer_count set_buffer_count;
+        struct virtio_vdec_guest_req_register_buffer register_buffer;
+        struct virtio_vdec_guest
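As an aside for implementers reading the (truncated) listing above: the request-type values after VIRTIO_VDEC_GUEST_REQ_OPEN are implicit and simply count upward from 0x0200, following standard C enum semantics. The sketch below restates only the codes visible in the listing, with the implicit values spelled out in comments:

```c
/* The guest request codes from the spec listing above, as a plain C
 * enum. 'le32' in the spec denotes a little-endian 32-bit field; the
 * numeric values here follow directly from C enum auto-increment. */
#include <assert.h>

enum virtio_vdec_guest_req_type {
    VIRTIO_VDEC_GUEST_REQ_UNDEFINED = 0,

    /* Global */
    VIRTIO_VDEC_GUEST_REQ_QUERY = 0x0100,

    /* Per instance */
    VIRTIO_VDEC_GUEST_REQ_OPEN = 0x0200,
    VIRTIO_VDEC_GUEST_REQ_SET_BUFFER_COUNT,   /* 0x0201 */
    VIRTIO_VDEC_GUEST_REQ_REGISTER_BUFFER,    /* 0x0202 */
    VIRTIO_VDEC_GUEST_REQ_ACK_STREAM_INFO,    /* 0x0203 */
    VIRTIO_VDEC_GUEST_REQ_FRAME_BUFFER,       /* 0x0204 */
    VIRTIO_VDEC_GUEST_REQ_BITSTREAM_BUFFER,   /* 0x0205 */
    VIRTIO_VDEC_GUEST_REQ_DRAIN,              /* 0x0206 */
    VIRTIO_VDEC_GUEST_REQ_FLUSH,              /* 0x0207 */
    VIRTIO_VDEC_GUEST_REQ_CLOSE,              /* 0x0208 */
};
```

The 0x0100/0x0200 bases cleanly separate global requests from per-instance requests, leaving room for future codes in each group.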