On Tue, Sep 13, 2016 at 3:49 PM, Anshuman Khandual <[email protected]> wrote:
> On 09/13/2016 10:04 AM, Balbir Singh wrote:
>>
>>
>> On 13/09/16 14:07, Anshuman Khandual wrote:
>>> On 09/12/2016 05:03 PM, Balbir Singh wrote:
>>>> On Mon, Sep 12, 2016 at 9:13 PM, Anshuman Khandual
>>>> <[email protected]> wrote:
>>>>>> When the HPT size is explicitly passed on from the userspace, currently
>>>>>> the KVM_PPC_ALLOCATE_HTAB will try to allocate the requested size of HPT
>>>>>> from reserved CMA area and if that is not possible, the allocation just
>>>>>> fails. With the commit 572abd563befd56 ("KVM: PPC: Book3S HV: Don't fall
>>>>>> back to smaller HPT size in allocation ioctl"), it does not even try to
>>>>>> allocate the same order pages from the page allocator before failing for
>>>>>> good. Same order allocation should be attempted from the page allocator
>>>>>> as a fallback option when the CMA allocation attempt fails.
>>>>>>
>>>>>> Signed-off-by: Anshuman Khandual <[email protected]>
>>>>>> ---
>>>>>> - This change saves guests from failing to start after migration
>>>>>>
>>>>>>  arch/powerpc/kvm/book3s_64_mmu_hv.c | 8 ++++++++
>>>>>>  1 file changed, 8 insertions(+)
>>>>>>
>>>>>> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>>>>>> index 05f09ae..0a30eb4 100644
>>>>>> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
>>>>>> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
>>>>>> @@ -78,6 +78,14 @@ long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
>>>>>>  			--order;
>>>>>>  	}
>>>>>>
>>>>>> +	/*
>>>>>> +	 * Fallback in case the userspace has provided a size via ioctl.
>>>>>> +	 * Try allocating the same order pages from the page allocator.
>>>>>> +	 */
>>>>>> +	if (!hpt && order > PPC_MIN_HPT_ORDER && htab_orderp)
>>>>>> +		hpt = __get_free_pages(GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|
>>>>>> +					__GFP_NOWARN, order - PAGE_SHIFT);
>>>>>> +
>>>> How often does this succeed? Please provide data.
>>>> I presume this for
>>>
>>> During continuous guest VM migration test from source host to destination
>>> host this patch was able to prevent guest creation failure after migration
>>> on the destination host which was failing after 2-3 days. We have not seen
>>> the failure till now even after 3-4 days.
>>>
>>
>> OK.. the CMA failures need analysis. Are we just ignoring a CMA bug? IOW, why
>
> Sure, it does need analysis. But there will be situations where CMA
> allocation request can fail, that's why we will need fallback option.

Please elaborate on those situations. This patch needs more explanation as to
why we should fall back -- what are the shortcomings of CMA allocation? Can
anyone using CMA face them and have to design a fallback?

> That's the same reason why we have fall back options of attempting from
> page allocator (in decreasing order every time) when the size is not
> specified as part of the ioctl. Why the case should be any different
> when the size is specified in the ioctl().
>
>> would CMA allocation fail -- CMA size is too small to accommodate the
>> required number of allocations?
>
> The same size seems to be good enough for first couple of days and
> then it fails. Probably some __GFP_MOVABLE allocation got pinned
> later on.
>

Please analyze and let us know

>>
>>>> the case where guest pages are pinned?
>>>
>>> Hmm, need to check that in the test setup. There was nothing running inside
>>> the guests though. IIUC, HPT size of the guest is computed based on the max
>>> memory the guest is ever going to have irrespective of the RAM usage before
>>> migration. How does pinning affect the HPT size?
>>>
>>
>> If the pinned pages (from anywhere) belong to CMA, then CMA allocations
>> would start failing
>
> Right and with the current design of CMA we can do nothing about it,
> unless we make sure the pages allocated to satisfy guest real memory
> do not come from CMA area at all.
>

I have patches to move non-THP pages out of CMA

Balbir
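
For reference, the allocation order being debated in this thread can be
modelled in plain userspace C. This is only a sketch under stated
assumptions, not the kernel code: struct alloc_model, page_alloc_ok(),
alloc_hpt_model() and enum hpt_source are hypothetical names invented
for illustration, and MIN_HPT_ORDER is assumed to be 18 as a stand-in
for PPC_MIN_HPT_ORDER. It shows the existing shrinking fallback when no
size is given, plus the single same-order attempt the patch proposes
when userspace did give a size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed stand-in for PPC_MIN_HPT_ORDER; the value is an assumption. */
#define MIN_HPT_ORDER 18

/* Hypothetical model of allocator state; these are not kernel APIs. */
struct alloc_model {
	bool cma_ok;                  /* can the reserved CMA area satisfy the request? */
	unsigned int buddy_max_order; /* largest order the page allocator can satisfy */
};

enum hpt_source { HPT_NONE, HPT_FROM_CMA, HPT_FROM_PAGE_ALLOCATOR };

/* Models one __get_free_pages() attempt at the given order. */
static bool page_alloc_ok(const struct alloc_model *m, unsigned int order)
{
	return order <= m->buddy_max_order;
}

/*
 * Models the decision flow discussed in the thread:
 *   1. try CMA at the requested order;
 *   2. if no size was specified, retry the page allocator at
 *      successively smaller orders (existing behaviour);
 *   3. if a size was specified, make one attempt at the same order
 *      from the page allocator (the fallback the patch adds).
 * On return, *order holds the order that was finally attempted.
 */
enum hpt_source alloc_hpt_model(const struct alloc_model *m,
				unsigned int *order, bool user_specified)
{
	assert(m != NULL && order != NULL);

	if (*order < MIN_HPT_ORDER)
		*order = MIN_HPT_ORDER;

	if (m->cma_ok)
		return HPT_FROM_CMA;

	if (!user_specified) {
		while (*order > MIN_HPT_ORDER) {
			if (page_alloc_ok(m, *order))
				return HPT_FROM_PAGE_ALLOCATOR;
			--*order;
		}
		return HPT_NONE;
	}

	/* Proposed fallback: same order, never a smaller one. */
	if (*order > MIN_HPT_ORDER && page_alloc_ok(m, *order))
		return HPT_FROM_PAGE_ALLOCATOR;

	return HPT_NONE;
}
```

In this model, a user-specified request either gets the exact order it
asked for or fails, which keeps the intent of commit 572abd563befd56
(no silent shrinking) while still giving the page allocator one chance
after CMA fails. It says nothing about how often such a high-order
allocation would succeed on a real, fragmented host, which is the data
requested above.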
