Hi Jonathan,

In previous conversations around the bounce buffering changes I did, we
ended up turning my first straw-man proposal of a global parameter into a
per-device parameter, the argument being that devices are best positioned
to "know" how much concurrent DMA they will be performing. That doesn't
make much sense for the CXL case, because, if I understand correctly, the
bounce buffering happens on the receiving end of the DMA operation. The
memory side can't really know what size is appropriate, since the access
patterns are controlled by the initiators. Following the previous line of
thought, the limit would have to be the sum of the max bounce buffer sizes
across all devices?

Still, if you wanted to put a limit on the memory side of things, I reckon
that in principle you could add a max size parameter to the memory region,
supplied by whoever creates the I/O region (CXL in your case), and then
adjust the size limit accounting logic to optionally draw capacity from that.
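
To make that concrete, here's a rough sketch of what I mean. The
mr->max_bounce_buffer_size field and the
memory_region_set_max_bounce_buffer_size() helper are made up for
illustration and don't exist today, and the accounting fragment assumes the
per-AddressSpace counters from my series:

/* Hypothetical helper, called by whoever creates the I/O region (CXL here). */
void memory_region_set_max_bounce_buffer_size(MemoryRegion *mr, uint64_t size)
{
    /* New field on MemoryRegion; illustration only. */
    mr->max_bounce_buffer_size = size;
}

/*
 * In the bounce buffer accounting in address_space_map(), one way to "draw
 * capacity from that" would be to let the region's budget extend the
 * per-AddressSpace limit:
 */
    uint64_t limit = as->max_bounce_buffer_size + mr->max_bounce_buffer_size;

    if (qatomic_fetch_add(&as->bounce_buffer_size, l) + l > limit) {
        qatomic_sub(&as->bounce_buffer_size, l);
        /* over budget: fail the mapping, as today */
    }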

Recall that the original motivation for the limit is to prevent malicious
guests from allocating unlimited bounce buffer memory and, in the extreme,
mounting a denial-of-service attack against the host. So, thinking outside
the box, maybe there could be a way for the guest to donate its own memory
to be used for bounce buffering?

Anyhow, just some context/thoughts in case they are useful; I can't think
of a silver bullet. Perhaps it's time to give up and do the global
parameter after all...
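
If the global parameter does end up being the answer, for what it's worth,
the wiring could be as small as a machine property whose setter pokes
address_space_memory.max_bounce_buffer_size directly. Rough sketch only:
the property name is made up and I haven't checked the ordering against
memory_map_init():

/* Sketch of a -machine property setter; name and placement hypothetical. */
static void machine_set_max_bounce_buffer_size(Object *obj, Visitor *v,
                                               const char *name, void *opaque,
                                               Error **errp)
{
    uint64_t size;

    if (!visit_type_size(v, name, &size, errp)) {
        return;
    }
    /* Assumes address_space_memory has already been set up by
     * memory_map_init() by the time machine properties are applied. */
    address_space_memory.max_bounce_buffer_size = size;
}

/* Registered from machine_class_init(): */
    object_class_property_add(oc, "x-max-bounce-buffer-size", "size",
                              NULL, machine_set_max_bounce_buffer_size,
                              NULL, NULL);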

Cheers,
Mattias

On Thu, May 22, 2025 at 5:25 PM Jonathan Cameron <jonathan.came...@huawei.com> wrote:

> Hi All,
>
> This is closely related to Mattias' work to resolve bounce buffer
> limitations for PCI memory spaces.
>
> https://lore.kernel.org/qemu-devel/20240819135455.2957406-1-mniss...@rivosinc.com/
>
> For CXL memory, due to the way interleaved memory is emulated, we end
> up with the same problem with concurrent virtio mappings into IOMEM, but
> in this case they are in address_space_memory.  In my tests
> virtio-blk tends to fail as a result.  Note that whilst QEMU sees this as
> IOMEM, in the host it's just 'normal RAM' (albeit with terrible performance
> :)
>
> Currently I'm carrying the hack below (obviously I never checked how much
> space I actually need, as it's unlikely to be that much :)
>
> diff --git a/system/physmem.c b/system/physmem.c
> index e97de3ef65cf8105b030a44e7a481b1679f86b53..fd0848c1d5b982c3255a7c6c8c1f22b32c86b85a 100644
> --- a/system/physmem.c
> +++ b/system/physmem.c
> @@ -2787,6 +2787,7 @@ static void memory_map_init(void)
>      memory_region_init(system_memory, NULL, "system", UINT64_MAX);
>      address_space_init(&address_space_memory, system_memory, "memory");
>
> +    address_space_memory.max_bounce_buffer_size = 1024 * 1024 * 1024;
>      system_io = g_malloc(sizeof(*system_io));
>      memory_region_init_io(system_io, NULL, &unassigned_io_ops, NULL, "io",
>                            65536);
>
>
> Assuming people are amenable to making this configurable with a parameter
> like x-max-bounce-buffer-size (from Mattias' set), how would people like
> that to be configured?  The address_space_init() call is
> fairly early, but I think we can modify max_bounce_buffer_size later,
> potentially directly from machine_set_mem(), if the parameter is set.
>
> I'm also interested if anyone has another suggestion for how to solve this
> problem more generally.
>
> Thanks,
>
> Jonathan
>
