On Thu, Feb 29, 2024 at 12:12 PM Peter Maydell <peter.mayd...@linaro.org> wrote:
>
> On Thu, 29 Feb 2024 at 10:59, Jonathan Cameron
> <jonathan.came...@huawei.com> wrote:
> >
> > On Thu, 29 Feb 2024 09:38:29 +0000
> > Peter Maydell <peter.mayd...@linaro.org> wrote:
> >
> > > On Wed, 28 Feb 2024 at 19:07, Heinrich Schuchardt
> > > <heinrich.schucha...@canonical.com> wrote:
> > > >
> > > > On 28.02.24 19:39, Peter Maydell wrote:
> > > > > The limitation to a page dates back to commit 6d16c2f88f2a in 2009,
> > > > > which was the first implementation of this function. I don't think
> > > > > there's a particular reason for that value beyond that it was
> > > > > probably a convenient value that was assumed to be likely "big enough".
> > > > >
> > > > > I think the idea with this bounce-buffer has always been that this
> > > > > isn't really a code path we expected to end up in very often --
> > > > > it's supposed to be for when devices are doing DMA, which they
> > > > > will typically be doing to memory (backed by host RAM), not
> > > > > devices (backed by MMIO and needing a bounce buffer). So the
> > > > > whole mechanism is a bit "last fallback to stop things breaking
> > > > > entirely".
> > > > >
> > > > > The address_space_map() API says that it's allowed to return
> > > > > a subset of the range you ask for, so if the virtio code doesn't
> > > > > cope with the minimum being set to TARGET_PAGE_SIZE then either
> > > > > we need to fix that virtio code or we need to change the API
> > > > > of this function. (But I think you will also get a reduced
> > > > > range if you try to use it across a boundary between normal
> > > > > host-memory-backed RAM and a device MemoryRegion.)
> > > >
> > > > If we allow a bounce buffer only to be used once (via the in_use flag),
> > > > why do we allow only a single bounce buffer?
> > > >
> > > > Could address_space_map() allocate a new bounce buffer on every call and
> > > > address_space_unmap() deallocate it?
> > > >
> > > > Isn't the design with a single bounce buffer bound to fail with a
> > > > multi-threaded client, as collisions can be expected?
> > >
> > > Yeah, I don't suppose multi-threaded was particularly expected.
> > > Again, this is really a "handle the case where the guest does
> > > something silly" setup, which is why only one bounce buffer.
> > >
> > > Why is your guest ending up in the bounce-buffer path?
> >
> > Happens for me with emulated CXL memory.
>
> Can we put that in the "something silly" bucket? :-)
> But yes, I'm not surprised that CXL runs into this. Heinrich,
> are you doing CXL testing, or is this some other workload?
>
> > I think the case I saw
> > was split descriptors in virtio via address space caches
> > https://elixir.bootlin.com/qemu/latest/source/hw/virtio/virtio.c#L4043
> >
> > One bounce buffer is in use for the outer loop and another for the
> > descriptors it is pointing to.
>
> Mmm. The other assumption made in the design of the address_space_map()
> API I think was that it was unlikely that a device would be trying
> to do two DMA operations simultaneously. This is clearly not
> true in practice. We definitely need to fix one end or other of
> this API.
>
> (I'm not sure why the bounce-buffer limit ought to be per-AddressSpace:
> is that just done in Matthias' series so that we can attach an
> x-thingy property to the individual PCI device?)

Yes, that's the result of review feedback on the early iterations of
my series. Specifically, (1) a limit is needed to prevent rogue guests
from hogging unlimited amounts of memory and (2) global parameters are
frowned upon. Setting a suitable limit is much more practical when
targeted at a given device/driver combination.
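
For reference, here is a minimal, untested sketch of what honouring the
current address_space_map() contract looks like from the caller side
(the helper name dma_write_chunked is made up for illustration and is
not part of QEMU or of this series): keep remapping until the whole
range is covered, and treat a NULL or shortened return as "bounce
buffer busy / map less":

#include "qemu/osdep.h"
#include "exec/memory.h"

/* Illustrative only: copy data into an address space one mapping at a
 * time, coping with address_space_map() returning less than requested
 * (e.g. when it falls back to the TARGET_PAGE_SIZE-capped bounce
 * buffer) or NULL (bounce buffer already in_use). */
static bool dma_write_chunked(AddressSpace *as, hwaddr addr,
                              const uint8_t *data, hwaddr size)
{
    while (size > 0) {
        hwaddr len = size;
        void *p = address_space_map(as, addr, &len, true,
                                    MEMTXATTRS_UNSPECIFIED);

        if (!p || len == 0) {
            /* Nothing mappable right now: retry later or fail. */
            return false;
        }
        memcpy(p, data, len);
        /* access_len == len: the whole mapping was written. */
        address_space_unmap(as, p, len, true, len);
        addr += len;
        data += len;
        size -= len;
    }
    return true;
}

The virtio descriptor case above is exactly where this gets awkward:
with a single global bounce buffer the second concurrent mapping can
only fail, so either callers grow retry logic along these lines or the
bounce-buffer side has to allow more than one buffer, which is what
the series does.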
