On Wed, 27 Mar 2019 15:01:37 +0100
David Hildenbrand <da...@redhat.com> wrote:

> On 27.03.19 10:09, Igor Mammedov wrote:
> > On Wed, 27 Mar 2019 09:10:01 +0100
> > David Hildenbrand <da...@redhat.com> wrote:
> > 
> >> On 27.03.19 01:12, David Gibson wrote:
> >>> On Tue, Mar 26, 2019 at 06:02:51PM +0100, David Hildenbrand wrote:  
> >>>> On 26.03.19 15:08, Igor Mammedov wrote:  
> >>>>> On Tue, 26 Mar 2019 14:50:58 +1100
> >>>>> David Gibson <da...@gibson.dropbear.id.au> wrote:
> >>>>>  
> >>>>>> qemu_getrampagesize() works out the minimum host page size backing any 
> >>>>>> of
> >>>>>> guest RAM.  This is required in a few places, such as for POWER8 PAPR 
> >>>>>> KVM
> >>>>>> guests, because limitations of the hardware virtualization mean the 
> >>>>>> guest
> >>>>>> can't use pagesizes larger than the host pages backing its memory.
> >>>>>>
> >>>>>> However, it currently checks against *every* memory backend, whether 
> >>>>>> or not
> >>>>>> it is actually mapped into guest memory at the moment.  This is 
> >>>>>> incorrect.
> >>>>>>
> >>>>>> This can cause a problem attempting to add memory to a POWER8 pseries 
> >>>>>> KVM
> >>>>>> guest which is configured to allow hugepages in the guest (e.g.
> >>>>>> -machine cap-hpt-max-page-size=16m).  If you attempt to add 
> >>>>>> non-hugepage,
> >>>>>> you can (correctly) create a memory backend, however it (correctly) 
> >>>>>> will
> >>>>>> throw an error when you attempt to map that memory into the guest by
> >>>>>> 'device_add'ing a pc-dimm.
> >>>>>>
> >>>>>> What's not correct is that if you then reset the guest a startup check
> >>>>>> against qemu_getrampagesize() will cause a fatal error because of the 
> >>>>>> new
> >>>>>> memory object, even though it's not mapped into the guest.  
> >>>>> I'd say that backend should be remove by mgmt app since device_add 
> >>>>> failed
> >>>>> instead of leaving it to hang around. (but fatal error either not a nice
> >>>>> behavior on QEMU part)  
> >>>>
> >>>> Indeed, it should be removed. Depending on the options (huge pages with
> >>>> prealloc?) memory might be consumed for unused memory. Undesired.  
> >>>
> >>> Right, but if the guest initiates a reboot before the management gets
> >>> to that, we'll have a crash.
> >>>   
> >>
> >> Yes, I agree.
> >>
> >> At least on s390x (extending on what Igor said):
> >>
> >> mc->init() -> s390_memory_init() ->
> >> memory_region_allocate_system_memory() -> host_memory_backend_set_mapped()
> >>
> >>
> >> ac->init_machine() -> kvm_arch_init() ->
> >> kvm_s390_configure_mempath_backing() -> qemu_getrampagesize()
> >>
> >>
> >> And in vl.c
> >>
> >> configure_accelerator(current_machine, argv[0]);
> > Looking more at it, it is seems s390 is 'broken' anyways.
> > We call qemu_getrampagesize() here with huge page backends on CLI
> > but memory-backends are initialized later
> >  qemu_opts_foreach(..., object_create_delayed, ...)
> > so s390 doesn't take into account memory backends currently
> 
> BTW that might indeed be true, we only check against --mem-path.

It's possible to break it with '-numa node,memdev=...' since we don't really 
have
anything to block that call chain for s390 (but I'd argue it's invalid use of 
CLI
for s390 but it's effectively -mem-path on steroids alternative)


Reply via email to