Avi Kivity wrote:
> Anthony Liguori wrote:
>> Avi Kivity wrote:
>>
>>> Anthony Liguori wrote:
>>>
>>>>>> static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu)
>>>>>> {
>>>>>>         int r;
>>>>>>
>>>>>>         kvm_mmu_free_some_pages(vcpu);
>>>>>>         r = mmu_topup_memory_cache(&vcpu->mmu_pte_chain_cache,
>>>>>>                                    pte_chain_cache, 4);
>>>>>>         if (r)
>>>>>>                 goto out;
>>>>>>         r = mmu_topup_memory_cache(&vcpu->mmu_rmap_desc_cache,
>>>>>>                                    rmap_desc_cache, 1);
>>>>>>         if (r)
>>>>>>                 goto out;
>>>>>>         r = mmu_topup_memory_cache_page(&vcpu->mmu_page_cache, 8);
>>>>>>         if (r)
>>>>>>                 goto out;
>>>>>>         r = mmu_topup_memory_cache(&vcpu->mmu_page_header_cache,
>>>>>>                                    mmu_page_header_cache, 4);
>>>>>> out:
>>>>>>         return r;
>>>>>> }
>>>>>>
>>>>> These are the (4, 1, 8, 4) values in the call to
>>>>> mmu_topup_memory_cache. Perhaps one of them is too low.
>>>>>
>>>> Sure. Would this be affected at all by your tpr patch?
>>> I believe not, but the code doesn't care what I believe.
>>>
>>>
>>>> IIUC, if this is the problem, it should be reproducible with the
>>>> latest git too?
>>>>
>>> One difference is that the tpr patch disables nx. That causes
>>> Windows to go into 32-bit paging mode (nice that it has both pae and
>>> nonpae in the same kernel), which may change things.
>>>
>>> You can try booting your host with nx disabled to get the same
>>> effect (or disable nx cpuid in kvm).
>>>
>>
>> I've disabled NX in KVM and that didn't reproduce the issue in the
>> current git.
>>
>> If I double all of the memory caches, I can't seem to reproduce.
>> However, as soon as I reduce rmap_desc_cache down to 1, I can reproduce.
>>
>> I'll try to see if just setting the rmap_desc_cache line to 2 is
>> enough to make the problem go away.
>>
>> How can the guest provoke this BUG() based on the cache size? Should
>> the cache size only affect performance?
>>
>>
>
> The memory caches are a little misnamed; they're really preallocation
> buffers.
>
> They serve two purposes: to avoid allocation in atomic contexts
> (that's no longer needed since preempt notifiers) and to avoid complex
> error recovery paths. We make sure there are enough objects to
> satisfy worst case behavior and assume all allocations will work.
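
Just to check my understanding of that pattern, here's a rough userspace
sketch of it. The names (mem_cache, cache_topup, cache_alloc) are made up
for illustration -- this is not the actual KVM code -- but the idea is:
topup runs where allocation can sleep and fail gracefully, while the fault
path just pops preallocated objects and BUG()s if the pool is empty.

/* Illustrative userspace sketch of the preallocation-cache pattern. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_OBJS 8

struct mem_cache {
        int nobjs;
        void *objects[CACHE_OBJS];
};

/* Analogous to the topup step: runs before the fault path proper,
 * where an allocation failure can still be reported cleanly. */
static int cache_topup(struct mem_cache *mc, size_t size, int min)
{
        if (mc->nobjs >= min)
                return 0;
        while (mc->nobjs < CACHE_OBJS) {
                void *obj = malloc(size);
                if (!obj)
                        return -1;
                mc->objects[mc->nobjs++] = obj;
        }
        return 0;
}

/* Analogous to allocation inside the fault path: must not fail, so it
 * just pops a preallocated object; the kernel BUG()s if none is left. */
static void *cache_alloc(struct mem_cache *mc, size_t size)
{
        void *p;

        assert(mc->nobjs > 0);
        p = mc->objects[--mc->nobjs];
        memset(p, 0, size);
        return p;
}

int main(void)
{
        struct mem_cache rmap_cache = { 0 };

        /* First round: topup fills the pool... */
        if (cache_topup(&rmap_cache, 64, 1))
                return 1;
        /* ...and earlier faults drain it down to a single object. */
        while (rmap_cache.nobjs > 1)
                free(cache_alloc(&rmap_cache, 64));

        /* Next fault: topup sees nobjs >= min (1) and refills nothing. */
        if (cache_topup(&rmap_cache, 64, 1))
                return 1;
        free(cache_alloc(&rmap_cache, 64)); /* fine: last object */
        cache_alloc(&rmap_cache, 64);       /* second need -> assert fires */
        return 0;
}

If that's right, then with a minimum of 1 the topup is satisfied whenever a
single object is left over from earlier faults, and any fault that then
needs two rmap descriptors drains the pool and hits the BUG(). So the size
is a correctness guarantee, not just a performance knob -- which would also
explain why raising rmap_desc_cache seems to make the problem go away.
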
FWIW, two people on IRC just reported the same BUG in kvm-48 and the
nightly snapshot. I asked them both to post more thorough descriptions
here. That leads me to suspect that the problem wasn't introduced by
your TPR optimization.

Regards,

Anthony Liguori

> Regarding the rmap memory cache failure, I can't think of a reason why
> we'll need to add more than one rmap entry per fault.