On Tue, 24 Oct 2023 at 14:38, Antonio Caggiano <quic_acagg...@quicinc.com> wrote:
>
> +cc Mark, who has a better understanding of our use case.
>
> On 24/10/2023 15:21, Peter Maydell wrote:
> > On Tue, 24 Oct 2023 at 13:31, Antonio Caggiano
> > <quic_acagg...@quicinc.com> wrote:
> >>
> >> On 24/10/2023 14:00, Peter Maydell wrote:
> >>> On Tue, 24 Oct 2023 at 11:49, Antonio Caggiano
> >>> <quic_acagg...@quicinc.com> wrote:
> >>> Given that we don't run into this upstream, this leaves me
> >>> suspecting that the underlying problem is that a memory
> >>> region this big shouldn't be being registered to KVM in the
> >>> first place. Certainly the gpex PCI controller works fine
> >>> on the arm64 virt board under KVM, so maybe your board code
> >>> is doing something odd when it wires it up?
> >
> >> I think so; we use an MMIO system_memory, effectively using
> >> memory_region_init_io() instead of memory_region_init(). This is
> >> for registering our callbacks with a MemoryRegionOps.
> >>
> >> So, for an MMIO memory type, UINT64_MAX means "real" size rather
> >> than "all of the 64-bit address space"?
> >
> > For a memory region, in the creation APIs that take a uint64_t
> > size argument:
> >  * size 0 is (I think) not valid
> >  * 1..UINT64_MAX-1 mean "size is that many bytes"
> >  * UINT64_MAX means "size is 2^64 bytes"
> >  * there is no way to ask for (2^64)-1 bytes
> >
> > I am confused about why you say your system_memory is an
> > MMIO region with callbacks, though. The system_memory MR
> > is expected to be a pure "container" region, which has
> > no callbacks and simply is a place where we add other
> > subregions (which might be RAM, or IO, or whatever).
>
> We use an MMIO system memory for our ops, which capture memory
> accesses and handle some of them, even though no memory region is
> actually backing the addresses.
>
> Does that mean we are violating QEMU's expectations by modifying
> that from "container" to "MMIO"? This issue is just one consequence
> of such a violation, and there might potentially be more?
I think that's one of those things which in theory might be supposed
to work but which in practice might run into trouble, because nobody's
ever done it before. In particular it seems like it's not currently
compatible with KVM...

I can think of a few things it might be interesting to try as
workarounds:

 * leave the system_memory as a container, and fill it with two MMIO
   MemoryRegions at very low priority that split the 2^64 space in
   half, so that neither of them runs into the "too big" issue
 * restrict the size to the guest CPU physical address space size
   (again avoiding the "too big" problem)

In an ideal world the code in kvm_region_commit() could be made to
handle 2^64-sized MRs, at least for MMIO MRs (where the eventual
behaviour in kvm_set_phys_ram() is "nothing to do"). 2^64-sized
memory-backed MRs would need handling in the kernel, but those aren't
possible anyway, since you can't get that much host memory to start
with :-)

-- PMM