On Wed, Jul 24, 2013 at 12:25:18PM +0200, Alexander Graf wrote:
>
> On 24.07.2013, at 12:19, Gleb Natapov wrote:
>
> > On Wed, Jul 24, 2013 at 12:09:42PM +0200, Alexander Graf wrote:
> >>
> >> On 24.07.2013, at 12:01, Gleb Natapov wrote:
> >>
> >>> Copying Andrea for him to verify that I am not talking nonsense :)
> >>>
> >>> On Wed, Jul 24, 2013 at 10:25:20AM +0200, Alexander Graf wrote:
> >>>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> >>>>> index 1580dd4..5e8635b 100644
> >>>>> --- a/virt/kvm/kvm_main.c
> >>>>> +++ b/virt/kvm/kvm_main.c
> >>>>> @@ -102,6 +102,10 @@ static bool largepages_enabled = true;
> >>>>>
> >>>>> bool kvm_is_mmio_pfn(pfn_t pfn)
> >>>>> {
> >>>>> +#ifdef CONFIG_MEMORY_HOTPLUG
> >>>>
> >>>> I'd feel safer if we narrow this down to e500.
> >>>>
> >>>>> + /*
> >>>>> + * Currently only in memory hot remove case we may still need
> >>>>> this.
> >>>>> + */
> >>>>> if (pfn_valid(pfn)) {
> >>>>
> >>>> We still have to check for pfn_valid, no? So the #ifdef should be down
> >>>> here.
> >>>>
> >>>>> int reserved;
> >>>>> struct page *tail = pfn_to_page(pfn);
> >>>>> @@ -124,6 +128,7 @@ bool kvm_is_mmio_pfn(pfn_t pfn)
> >>>>> }
> >>>>> return PageReserved(tail);
> >>>>> }
> >>>>> +#endif
> >>>>>
> >>>>> return true;
> >>>>> }
> >>>>>
> >>>>> Before apply this change:
> >>>>>
> >>>>> real (1m19.954s + 1m20.918s + 1m22.740s + 1m21.146s + 1m22.120s)/5=
> >>>>> 1m21.376s
> >>>>> user (0m23.181s + 0m23.550s + 0m23.506s + 0m23.410s + 0m23.520s)/5=
> >>>>> 0m23.433s
> >>>>> sys (0m49.087s + 0m49.563s + 0m51.758s + 0m50.290s + 0m51.047s)/5=
> >>>>> 0m50.349s
> >>>>>
> >>>>> After apply this change:
> >>>>>
> >>>>> real (1m19.507s + 1m20.919s + 1m21.436s + 1m21.179s + 1m20.293s)/5=
> >>>>> 1m20.667s
> >>>>> user (0m22.595s + 0m22.719s + 0m22.484s + 0m22.811s + 0m22.467s)/5=
> >>>>> 0m22.615s
> >>>>> sys (0m48.841s + 0m49.929s + 0m50.310s + 0m49.813s + 0m48.587s)/5=
> >>>>> 0m49.496s
> >>>>>
> >>>>> So,
> >>>>>
> >>>>> real (1m20.667s - 1m21.376s)/1m21.376s x 100% = -0.6%
> >>>>> user (0m22.615s - 0m23.433s)/0m23.433s x 100% = -3.5%
> >>>>> sys (0m49.496s - 0m50.349s)/0m50.349s x 100% = -1.7%
> >>>>
> >>>> Very nice, so there is a real world performance benefit to doing this.
> >>>> Then yes, I think it would make sense to change the global helper
> >>>> function to be fast on e500 and use that one from
> >>>> e500_shadow_mas2_attrib() instead.
> >>>>
> >>>> Gleb, Paolo, any hard feelings?
> >>>>
> >>> I do not see how can we break the function in such a way and get
> >>> away with it. Not all valid pfns point to memory. Physical address can
> >>> be sparse (due to PCI hole, framebuffer or just because).
> >>
> >> But we don't check for sparseness today in here either. We merely check
> >> for incomplete huge pages.
> >>
> > That's not how I read the code. The code checks for reserved flag set.
> > It should be set on pfns that point to memory holes. As far as I
>
> I couldn't find any traces of code that sets the reserved bits on e500 chips
> though. I've only seen it getting set for memory hotplug and memory
> incoherent DMA code which doesn't get used on e500.
>
> But I'd be more than happy to get proven wrong :).
>
Can you write a module that scans all page structures? AFAIK all pages
are marked as reserved and then those that become regular memory are
marked as unreserved. Hope Andrea will chime in here :)
--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html