On Wed, Sep 11, 2024 at 12:24 PM Michael S. Tsirkin <m...@redhat.com> wrote:
>
> On Tue, Sep 10, 2024 at 11:36:08PM +0200, Mattias Nissler wrote:
> > On Tue, Sep 10, 2024 at 6:40 PM Michael S. Tsirkin <m...@redhat.com> wrote:
> > >
> > > On Tue, Sep 10, 2024 at 06:10:50PM +0200, Mattias Nissler wrote:
> > > > On Tue, Sep 10, 2024 at 5:44 PM Peter Maydell
> > > > <peter.mayd...@linaro.org> wrote:
> > > > >
> > > > > On Tue, 10 Sept 2024 at 15:53, Michael S. Tsirkin <m...@redhat.com> wrote:
> > > > > >
> > > > > > On Mon, Aug 19, 2024 at 06:54:54AM -0700, Mattias Nissler wrote:
> > > > > > > When DMA memory can't be directly accessed, as is the case when
> > > > > > > running the device model in a separate process without shareable DMA
> > > > > > > file descriptors, bounce buffering is used.
> > > > > > >
> > > > > > > It is not uncommon for device models to request mapping of several DMA
> > > > > > > regions at the same time. Examples include:
> > > > > > >  * net devices, e.g. when transmitting a packet that is split across
> > > > > > >    several TX descriptors (observed with igb)
> > > > > > >  * USB host controllers, when handling a packet with multiple data TRBs
> > > > > > >    (observed with xhci)
> > > > > > >
> > > > > > > Previously, qemu only provided a single bounce buffer per AddressSpace
> > > > > > > and would fail DMA map requests while the buffer was already in use. In
> > > > > > > turn, this would cause DMA failures that ultimately manifest as hardware
> > > > > > > errors from the guest perspective.
> > > > > > >
> > > > > > > This change allocates DMA bounce buffers dynamically instead of
> > > > > > > supporting only a single buffer. Thus, multiple DMA mappings work
> > > > > > > correctly also when RAM can't be mmap()-ed.
> > > > > > >
> > > > > > > The total bounce buffer allocation size is limited individually for each
> > > > > > > AddressSpace. The default limit is 4096 bytes, matching the previous
> > > > > > > maximum buffer size. A new x-max-bounce-buffer-size parameter is
> > > > > > > provided to configure the limit for PCI devices.
> > > > > > >
> > > > > > > Signed-off-by: Mattias Nissler <mniss...@rivosinc.com>
> > > > > > > Reviewed-by: Philippe Mathieu-Daudé <phi...@linaro.org>
> > > > > > > Acked-by: Peter Xu <pet...@redhat.com>
> > > > > > > ---
> > > > > > > This patch is split out from my "Support message-based DMA in vfio-user server"
> > > > > > > series. With the series having been partially applied, I'm splitting this one
> > > > > > > out as the only remaining patch to system emulation code in the hope to
> > > > > > > simplify getting it landed. The code has previously been reviewed by Stefan
> > > > > > > Hajnoczi and Peter Xu. This latest version includes changes to switch the
> > > > > > > bounce buffer size bookkeeping to `size_t` as requested and LGTM'd by Phil in
> > > > > > > v9.
> > > > > > > ---
> > > > > > >  hw/pci/pci.c                |  8 ++++
> > > > > > >  include/exec/memory.h       | 14 +++----
> > > > > > >  include/hw/pci/pci_device.h |  3 ++
> > > > > > >  system/memory.c             |  5 ++-
> > > > > > >  system/physmem.c            | 82 ++++++++++++++++++++++++++-----------
> > > > > > >  5 files changed, 76 insertions(+), 36 deletions(-)
> > > > > > >
> > > > > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > > > > > index fab86d0567..d2caf3ee8b 100644
> > > > > > > --- a/hw/pci/pci.c
> > > > > > > +++ b/hw/pci/pci.c
> > > > > > > @@ -85,6 +85,8 @@ static Property pci_props[] = {
> > > > > > >                      QEMU_PCIE_ERR_UNC_MASK_BITNR, true),
> > > > > > >      DEFINE_PROP_BIT("x-pcie-ari-nextfn-1", PCIDevice, cap_present,
> > > > > > >                      QEMU_PCIE_ARI_NEXTFN_1_BITNR, false),
> > > > > > > +    DEFINE_PROP_SIZE32("x-max-bounce-buffer-size", PCIDevice,
> > > > > > > +                     max_bounce_buffer_size, DEFAULT_MAX_BOUNCE_BUFFER_SIZE),
> > > > > > >      DEFINE_PROP_END_OF_LIST()
> > > > > > >  };
> > > > > > >
> > > > > >
> > > > > > I'm a bit puzzled by now there being two fields named
> > > > > > max_bounce_buffer_size, one directly controllable by
> > > > > > a property.
> > > >
> > > > One is on the pci device, the other is on the address space. The
> > > > former can be set via a command line parameter, and that value is used
> > > > to initialize the field on the address space, which is then consulted
> > > > when allocating bounce buffers.
> > > >
> > > > I'm not sure which aspect of this is unclear and/or deserves
> > > > additional commenting - let me know and I'll be happy to send a patch.
> > >
> > > I'd document what does each field do.
> >
> > I have just sent a patch to expand the comments, let's discuss details
> > there.
> >
> > > > > >
> > > > > > Pls add code comments explaining how they are related.
> > > > > >
> > > > > > Also, what is the point of adding a property without
> > > > > > making it part of an API?
> > > > > > No one will be able to rely on it working.
> >
> > All I needed was a practical way to allow the bounce buffer size limit
> > to be adjusted in the somewhat exotic situations where we're making
> > DMA requests to indirect memory regions (in my case it is a qemu
> > vfio-user server accessed by a client that can't or doesn't want to
> > provide direct memory-mapped access to its RAM). There was some
> > discussion about the nature of the parameter when I first proposed the
> > patch, see
> > https://lore.kernel.org/qemu-devel/20230823092905.2259418-2-mniss...@rivosinc.com/
> > - an x-prefixed experimental command-line parameter was suggested
> > there as a practical way to allow this without qemu committing to
> > supporting this forever. For the unlikely case that this parameter
> > proves popular, it can still be added to a stable API (or
> > alternatively we could discuss whether a large-enough limit is
> > feasible after all, or even consider DMA API changes to obviate the
> > need for bounce buffering).
>
> Yes but how happy will you be if we rename the parameter in the
> future? All your scripts will break.

It's not as if I'm running random qemu versions in production; in fact,
I'm using this for semi-automated testing of hardware designs. We'd find
out when upgrading our qemu, and adjust. Indeed, if you come up with a
better way to handle this bounce buffering kludge, I'd be willing to not
only adjust, but even help implement it.

> > > > >
> > > > > Note that this patch is already upstream as commit 637b0aa13.
> > > > >
> > > > > thanks
> > > > > -- PMM
> > >
> > > Maybe you can answer this?
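
For anyone following the thread who wants the relationship between the two
max_bounce_buffer_size fields spelled out, here is a rough, self-contained
sketch of the idea as described above: the device carries the configurable
limit, the address space copies it at init time, and each bounce buffer
allocation is charged against that per-address-space budget. This is a
simplified model and not the code in system/physmem.c. ModelDevice,
ModelAddressSpace and the model_* functions are hypothetical names made up
for illustration, and the sketch is single-threaded, so it ignores any
locking or atomics a real implementation would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Default taken from the commit message: 4096 bytes. */
#define DEFAULT_MAX_BOUNCE_BUFFER_SIZE 4096

/* Hypothetical stand-in for the device: holds the user-settable limit. */
typedef struct {
    size_t max_bounce_buffer_size;
} ModelDevice;

/* Hypothetical stand-in for the address space the device does DMA through. */
typedef struct {
    size_t max_bounce_buffer_size; /* limit, copied from the device */
    size_t bounce_buffer_size;     /* bytes currently handed out */
} ModelAddressSpace;

/* The device-level value only serves to initialize the address space. */
static void model_as_init(ModelAddressSpace *as, const ModelDevice *dev)
{
    as->max_bounce_buffer_size = dev->max_bounce_buffer_size;
    as->bounce_buffer_size = 0;
}

/* Allocate a bounce buffer, or return NULL if the budget would be exceeded. */
static void *model_bounce_alloc(ModelAddressSpace *as, size_t len)
{
    /* bounce_buffer_size never exceeds the limit, so this cannot underflow. */
    size_t remaining = as->max_bounce_buffer_size - as->bounce_buffer_size;
    void *buf;

    if (len == 0 || len > remaining) {
        return NULL;
    }
    buf = malloc(len);
    if (buf == NULL) {
        return NULL;
    }
    as->bounce_buffer_size += len;
    return buf;
}

/* Release a bounce buffer and return its bytes to the budget. */
static void model_bounce_free(ModelAddressSpace *as, void *buf, size_t len)
{
    assert(len <= as->bounce_buffer_size);
    free(buf);
    as->bounce_buffer_size -= len;
}
```

With the default limit, two concurrent mappings of 3000 and 1000 bytes both
succeed in this model, a third request for 200 bytes is refused until one of
the earlier buffers is released, and raising the device-level limit before
model_as_init() is the only knob that changes this.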