On 11.01.22 13:39, David Hildenbrand wrote:
> For fd-based shared memory, MAP_NORESERVE is only effective for hugetlb,
> otherwise it's ignored. Older Linux versions that didn't support
> reservation of huge pages ignored MAP_NORESERVE completely.
>
> The first client to mmap a hugetlb fd without MAP_NORESERVE will
> trigger reservation of huge pages for the whole mmapped range. There are
> two cases to consider:
>
> 1) QEMU mapped RAM without MAP_NORESERVE
>
> We're not dealing with a sparse mapping, huge pages for the whole range
> have already been reserved by QEMU. An additional mmap() without
> MAP_NORESERVE won't have any effect on the reservation.
>
> 2) QEMU mapped RAM with MAP_NORESERVE
>
> We're dealing with a sparse mapping, no huge pages should be reserved.
> Further mappings without MAP_NORESERVE should be avoided.
>
> For 1), it doesn't matter if we set MAP_NORESERVE or not, so we can
> simply set it. For 2), we'd be overriding QEMU's decision and triggering
> reservation of huge pages, which might just fail if there are not
> sufficient huge pages around. We must map with MAP_NORESERVE.
>
> This change is required to support virtio-mem with hugetlb: a
> virtio-mem device mapped into the guest physical memory corresponds to
> a sparse memory mapping and QEMU maps this memory with MAP_NORESERVE.
> Whenever memory in that sparse region is accessed by the VM, QEMU
> populates huge pages for the affected range by preallocating memory
> and handling any preallocation errors gracefully.
>
> So let's map shared RAM with MAP_NORESERVE. As libvhost-user only
> supports Linux, there shouldn't be anything to take care of with regard
> to other OS support.
>
> Without this change, libvhost-user will fail mapping the region if there
> are currently not enough huge pages to perform the reservation:
>   fv_panic: libvhost-user: region mmap error: Cannot allocate memory
>
> Cc: "Marc-André Lureau" <marcandre.lur...@redhat.com>
> Cc: "Michael S. Tsirkin" <m...@redhat.com>
> Cc: Paolo Bonzini <pbonz...@redhat.com>
> Cc: Raphael Norwitz <raphael.norw...@nutanix.com>
> Cc: Stefan Hajnoczi <stefa...@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilb...@redhat.com>
> Signed-off-by: David Hildenbrand <da...@redhat.com>
> ---

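For illustration, the kind of mapping described above could look roughly
like the sketch below. This is not the actual libvhost-user code:
map_shared_region() and demo_map_memfd() are hypothetical names, and the
demo uses a plain memfd as a stand-in for the RAM fd that a backend would
normally receive from QEMU (so MAP_NORESERVE is simply ignored there, as
the commit message explains for non-hugetlb shared memory).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the libvhost-user function itself: map `size`
 * bytes of a shared RAM region backed by `fd`, starting at `offset`.
 *
 * MAP_NORESERVE keeps this mapping from triggering a huge page
 * reservation on a hugetlb fd; for other fd-based shared memory the flag
 * is ignored. Returns MAP_FAILED with errno set on error, like mmap().
 */
static void *map_shared_region(int fd, size_t size, off_t offset)
{
    assert(size > 0);
    return mmap(NULL, size, PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_NORESERVE, fd, offset);
}

/*
 * Usage sketch (assumption: a memfd stands in for the region fd).
 * Returns 0 if the region could be mapped and touched, -1 otherwise.
 */
static int demo_map_memfd(size_t size)
{
    int fd = memfd_create("demo-region", 0);
    if (fd < 0) {
        return -1;
    }
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }

    unsigned char *p = map_shared_region(fd, size, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    p[0] = 0xab;                /* touch the first page to fault it in */
    int ok = (p[0] == 0xab);

    munmap(p, size);
    close(fd);
    return ok ? 0 : -1;
}
```

With a hugetlb-backed fd, the offset would additionally have to be a
multiple of the huge page size, and touching a page can still fail at
fault time if no huge page is available; the sketch does not handle that.
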
Note: I was assuming rust vhost-user-backend would need similar care, but
vm-memory already does the right thing by supplying MAP_NORESERVE:

https://github.com/rust-vmm/vm-memory/blob/7a5e0696dc4170f590ac9b837e65cc4136b30e38/src/mmap_unix.rs#L264

-- 
Thanks,

David / dhildenb