Andrea Arcangeli wrote:
On Wed, Jun 04, 2008 at 07:17:55PM +0300, Muli Ben-Yehuda wrote:
On Wed, Jun 04, 2008 at 06:09:24PM +0300, Ben-Ami Yassour1 wrote:
We noticed that pfn_valid does not always work as expected by
this patch to indicate that a pfn has a backing page. We have
seen a case where, with CONFIG_NUMA not set, pfn_valid returned
1 for an mmio pfn. After rebuilding with CONFIG_NUMA set, it
worked fine as expected (since a different implementation of
pfn_valid is used).
How should we overcome this issue?
There's a page_is_ram() too, but that's the e820 map check; it
means the range is RAM, not that there's a page backing store.
Certainly if it's not ram we should go ahead with just the pfn,
but it'd be a workaround.
I really think it'd be better off to fix pfn_valid to work for
NUMA.
It does work for NUMA, it does not work without the NUMA option.
Andrea, how would you suggest fixing pfn_valid for the
CONFIG_NUMA-disabled case?
I'm very surprised that this is broken for non-numa.
To be sure I understand, what exactly does "pfn has no backing
page" mean here?
The pfn does have a backing page. When using CONFIG_FLATMEM,
pfn_valid() is simply:
#ifdef CONFIG_FLATMEM
#define pfn_valid(pfn) ((pfn) < end_pfn)
#endif
And this is true: pfn_valid() just indicates whether there is a
slot for the pfn in mem_map[], and there certainly is. Note this
only happens when valid PFNs extend above the PCI memory region
(i.e., on machines with 4GB+ of memory).
I think we must fix pfn_valid so that when it returns true,
pfn_to_page returns an entry that is contained inside some mem_map
array allocated by mm/page_alloc.c. pfn_valid is totally right to
return true on mmio regions as long as a 'struct page' exists, like
the 640k-1M area for example. It'd be a waste to pay the price of a
discontiguous array just to save a few struct pages (or perhaps
these days they're inefficiently using 4k tlb entries to map struct
page, dunno; I surely prefer to waste a few struct pages and stay
with 2M tlb).
As long as a 'struct page' exists and pfn_to_page returns a 'struct
page', checking PageReserved() should be enough to know whether it's
pageable RAM owned by the VM or an mmio region. I don't know why, but
when I did the reserved-ram patch the PageReserved check wasn't
returning true on the reserved-ram. Perhaps it's because it was ram,
dunno, so I had to use the page_count() hack. But for mmio,
!pfn_valid || PageReserved should work.
Dave mentioned that SetPageReserved() doesn't necessarily get called for
zones with bad alignment.
Regards,
Anthony Liguori
That said, this mem_map code tends to change
all the time for whatever reason, so I wouldn't be shocked if
PageReserved doesn't actually work either. But mixing pfn_valid and
PageReserved sounds like the right objective.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html