Anthony Liguori wrote:
There are N users of this code, all of which would need to cope with the
failure. Or there could be one user (dma.c) which handles the failure
and the bouncing.
N should be small in the long term. It should only be for places that
interact directly with CPU memory. This would be the PCI bus, the ISA
bus, some specialty devices, and possibly virtio (although you could
argue that it should go through the PCI bus).
Fine, then let's rename it pci-dma.c.
map() has to fail and that has nothing to do with bouncing or not
bouncing. In the case of Xen, you can have a guest that has 8GB of
memory, and you only have 2GB of virtual address space. If you try to
DMA to more than 2GB of memory, there will be a failure. Whoever is
accessing memory directly in this fashion needs to cope with that.
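To make that concrete, the kind of coping a call site has to do looks
roughly like this (a sketch against the cpu_physical_memory_map()/unmap()
interface; the header and the "retry in smaller pieces" policy are my
assumptions, not existing code):

/* Sketch only: try a direct mapping and cope when the map fails or
 * comes back short (e.g. host virtual address space exhausted). */
#include "cpu-common.h"  /* assumed: cpu_physical_memory_map()/unmap() */

static int access_direct(target_phys_addr_t addr, target_phys_addr_t len,
                         int is_write)
{
    target_phys_addr_t plen = len;
    void *p = cpu_physical_memory_map(addr, &plen, is_write);

    if (!p || plen < len) {
        /* Partial or failed mapping: the caller has to cope, e.g. by
         * splitting the transfer or falling back to a bounce buffer. */
        if (p) {
            cpu_physical_memory_unmap(p, plen, is_write, 0);
        }
        return -1;  /* made-up convention: retry in smaller pieces */
    }

    /* ... access p[0..len) directly ... */

    cpu_physical_memory_unmap(p, plen, is_write, is_write ? len : 0);
    return 0;
}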
The code already allows for failure, by partitioning the DMA into
segments. Currently this happens only on bounce buffer overflow; when
the Xen code is integrated, it can be expanded to accommodate this case.
(There's a case for partitioning 2GB DMAs even without Xen; just to
reduce the size of iovec allocations)
dma.c _is_ a map/unmap API, except that it doesn't expose the mapped
data, which allows it to control scheduling and also makes it easier to use.
As I understand dma.c, it performs the following sequence: map() as
much as possible, call an actor on the mapped memory, repeat until
done, then signal completion.
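In sketch form (made-up types; the actor is treated as synchronous for
brevity, whereas the real code is callback-driven):

/* Assumes QEMU's target_phys_addr_t and cpu_physical_memory_map()/unmap().
 * Consumes sg[] in place; is_write means the device writes guest memory. */
#include <sys/uio.h>

#define DMA_MAX_IOV 64   /* arbitrary cap; also keeps iovecs small */

typedef struct {
    target_phys_addr_t base, len;
} SGEntry;

typedef void (*DMAActor)(void *opaque, struct iovec *iov, int iovcnt);
typedef void (*DMAComplete)(void *opaque, int ret);

static void dma_run(SGEntry *sg, int nsg, int is_write,
                    DMAActor actor, DMAComplete complete, void *opaque)
{
    struct iovec iov[DMA_MAX_IOV];
    int i = 0;

    while (i < nsg) {
        int iovcnt = 0;

        /* Map as much of the remaining sg list as possible. */
        while (i < nsg && iovcnt < DMA_MAX_IOV) {
            target_phys_addr_t plen = sg[i].len;
            void *p = cpu_physical_memory_map(sg[i].base, &plen, is_write);

            if (!p) {
                break;                  /* can't map more right now */
            }
            iov[iovcnt].iov_base = p;
            iov[iovcnt].iov_len = plen;
            iovcnt++;

            if (plen < sg[i].len) {     /* partial map: finish it next pass */
                sg[i].base += plen;
                sg[i].len -= plen;
                break;
            }
            i++;
        }

        if (!iovcnt) {
            complete(opaque, -1);       /* nothing could be mapped */
            return;
        }

        /* Call the actor on the mapped memory, then release the mappings. */
        actor(opaque, iov, iovcnt);
        while (iovcnt--) {
            cpu_physical_memory_unmap(iov[iovcnt].iov_base,
                                      iov[iovcnt].iov_len, is_write,
                                      is_write ? iov[iovcnt].iov_len : 0);
        }
    }

    complete(opaque, 0);                /* all segments done */
}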
As an abstraction, it may be useful. I would argue that it should be
a bit more generic, though. It should take function pointers for map
and unmap too, and then you wouldn't need N versions of it for each
different type of API.
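Concretely, something like an ops table (purely illustrative; the
default instance would just wrap the existing calls):

/* Hypothetical ops table: dma.c would call through these instead of
 * hardcoding cpu_physical_memory_map()/unmap(). */
typedef struct DMAMapOps {
    void *(*map)(void *opaque, target_phys_addr_t addr,
                 target_phys_addr_t *plen, int is_write);
    void (*unmap)(void *opaque, void *buffer, target_phys_addr_t len,
                  int is_write, target_phys_addr_t access_len);
    void *opaque;
} DMAMapOps;

static void *default_map(void *opaque, target_phys_addr_t addr,
                         target_phys_addr_t *plen, int is_write)
{
    return cpu_physical_memory_map(addr, plen, is_write);
}

static void default_unmap(void *opaque, void *buffer, target_phys_addr_t len,
                          int is_write, target_phys_addr_t access_len)
{
    cpu_physical_memory_unmap(buffer, len, is_write, access_len);
}

static const DMAMapOps default_dma_ops = {
    .map    = default_map,
    .unmap  = default_unmap,
    .opaque = NULL,
};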
I don't follow. What possible map/unmap pairs would it call, other than
cpu_physical_memory_(map/unmap)()?
Right, but who would it notify?
We need some place that can deal with this, and it isn't
_map()/_unmap(), and it isn't ide.c or scsi.c.
The pattern of "try to map(), do IO, unmap(), repeat" only really works
for block IO. It doesn't really work for network traffic. You have
to map the entire packet and send it all at once; you cannot accept a
partial mapping result. The IO pattern for sending a packet is much
simpler: try to map the packet; if the mapping fails, either wait
until more space frees up or drop the packet. The same is true for
the other users of direct memory access, such as kernel loading.
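In sketch form (the wrapper and its policy are made up; qemu_send_packet()
and VLANClientState are the existing net-layer names as I understand them):

/* Sketch of the packet pattern: map the whole payload or don't send. */
#include "net.h"   /* assumed: qemu_send_packet(), VLANClientState */

static int try_send_mapped(VLANClientState *vc,
                           target_phys_addr_t addr, target_phys_addr_t len)
{
    target_phys_addr_t plen = len;
    void *p = cpu_physical_memory_map(addr, &plen, 0 /* device reads */);

    if (!p || plen < len) {
        /* A partial mapping is useless for a packet: give it back and
         * either queue the packet for later or drop it. */
        if (p) {
            cpu_physical_memory_unmap(p, plen, 0, 0);
        }
        return -1;   /* caller decides: wait for space or drop */
    }

    qemu_send_packet(vc, p, (int)len);
    cpu_physical_memory_unmap(p, plen, 0, 0);
    return 0;
}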
If so, the API should be extended to support more I/O patterns.
What this is describing is not a DMA API. It's a very specific IO
pattern. I think that's part of what's causing confusion in this
series. It's certainly not at all related to PCI DMA.
It deals with converting scatter/gather lists to iovecs, bouncing when
this is not possible, and managing the bounce buffers. If this is not
DMA, I'm not sure what is. It certainly isn't part of block device
emulation, and it isn't part of the block layer (since bouncing is
common to non-block devices). What is it?
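The bounce side is the part that has to be owned somewhere; very
roughly (made-up structure, trivial allocation; real code also has to
cap and reuse these buffers, and may want aligned memory):

/* Sketch of the bounce path: when mapping isn't possible, copy through
 * a private buffer with cpu_physical_memory_read()/write(). */
typedef struct BounceBuffer {
    void *host;
    target_phys_addr_t guest_addr;
    target_phys_addr_t len;
    int is_write;                 /* device writes into guest memory */
} BounceBuffer;

static void bounce_begin(BounceBuffer *b, target_phys_addr_t addr,
                         target_phys_addr_t len, int is_write)
{
    b->host = qemu_malloc(len);
    b->guest_addr = addr;
    b->len = len;
    b->is_write = is_write;
    if (!is_write) {
        /* Device reads guest memory: fill the bounce buffer up front. */
        cpu_physical_memory_read(addr, b->host, len);
    }
}

static void bounce_end(BounceBuffer *b)
{
    if (b->is_write) {
        /* Device wrote into the bounce buffer: copy it back to the guest. */
        cpu_physical_memory_write(b->guest_addr, b->host, b->len);
    }
    qemu_free(b->host);
}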
I would argue that you really want to add a block driver interface
that takes the necessary information and implements this pattern, but
that's not important. Reducing code duplication is a good thing, so
however it ends up working out is fine.
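Something along these lines, presumably (only a sketch of the idea,
with names and signatures modeled on the existing bdrv_aio_* calls):

/* Illustrative only: block-layer entry points that take the
 * guest-physical sg list directly, so one helper owns the mapping,
 * partitioning and bouncing. */
BlockDriverAIOCB *dma_bdrv_readv(BlockDriverState *bs, int64_t sector_num,
                                 QEMUSGList *sg,
                                 BlockDriverCompletionFunc *cb, void *opaque);
BlockDriverAIOCB *dma_bdrv_writev(BlockDriverState *bs, int64_t sector_num,
                                  QEMUSGList *sg,
                                  BlockDriverCompletionFunc *cb, void *opaque);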
Right now the qemu block layer is totally independent of device
emulation, and I think that's a good thing.
--
Do not meddle in the internals of kernels, for they are subtle and quick to
panic.