On Fri, 31 Jan 2014 13:14:29 -0800
Dave Hansen <[email protected]> wrote:

> On 01/31/2014 02:05 AM, Petr Tesarik wrote:
> > With DISCONTIGMEM, the mapping between a pfn and its owning node is
> > initialized using data provided by the BIOS or from the command line.
> > However, the initialization may fail if the extents are not aligned
> > to section boundary (64M).
> 
> So is this a problem that shows up with DISCONTIGMEM?

Yes, that's it.

> Just curious, but
> what the heck kind of 32-bit NUMA hardware is still in the wild?  Did
> someone buy a NUMA-Q on eBay? :)

In fact, this is a patch that has been floating around in SUSE
Enterprise kernels for some time. It was originally added to pass
certification on IBM SurePOS 700 x4900-785.

When cleaning up our kernel patches, I noticed that the bug is still
present in the upstream kernel, so I posted this patch. While I don't
have any evidence that someone actually needs the fix today, it seems
wrong to leave buggy code in the kernel.

If you all agree that we should rip out DISCONTIGMEM instead, I can post
patches to do that and be equally happy. ;-)

> >  void memory_present(int nid, unsigned long start, unsigned long end)
> >  {
> > -   unsigned long pfn;
> > +   unsigned long sect, endsect;
> >  
> >     printk(KERN_INFO "Node: %d, start_pfn: %lx, end_pfn: %lx\n",
> >                     nid, start, end);
> >     printk(KERN_DEBUG "  Setting physnode_map array to node %d for pfns:\n", nid);
> >     printk(KERN_DEBUG "  ");
> > -   for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) {
> > -           physnode_map[pfn / PAGES_PER_SECTION] = nid;
> > -           printk(KERN_CONT "%lx ", pfn);
> > +   endsect = (end - 1) / PAGES_PER_SECTION;
> > +   for (sect = start / PAGES_PER_SECTION; sect <= endsect; ++sect) {
> > +           physnode_map[sect] = nid;
> > +           printk(KERN_CONT "%lx ", sect * PAGES_PER_SECTION);
> >     }
> >     printk(KERN_CONT "\n");
> >  }
> 
> So, if start and end are not aligned to section boundaries, we will miss
> setting physnode_map[] for the final section?

If end belongs to a different section than start, the final section
will not be initialized, yes.

> For instance, if we have a 64MB section size and try to call
> memory_present(32MB -> 96MB), we will set 0->64MB present, but not set
> the 64MB->128MB section as present.
> 
> Right?

Exactly.
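
To make the arithmetic concrete, here is a small userspace sketch of both
loops. It is not the kernel code: it assumes 4 KiB pages and 64 MiB
sections, and MAX_SECTIONS, MB_TO_PFN(), reset_map(), mark_old() and
mark_new() are names made up for this illustration.

```c
#include <assert.h>
#include <string.h>

#define PAGE_SHIFT        12                          /* assumption: 4 KiB pages */
#define SECTION_SHIFT     26                          /* assumption: 64 MiB sections */
#define PAGES_PER_SECTION (1UL << (SECTION_SHIFT - PAGE_SHIFT))
#define MAX_SECTIONS      64                          /* hypothetical table size */
#define MB_TO_PFN(mb)     ((unsigned long)(mb) << (20 - PAGE_SHIFT))

/* Stand-in for the kernel's physnode_map[]; -1 means "no node assigned". */
signed char physnode_map[MAX_SECTIONS];

void reset_map(void)
{
	memset(physnode_map, -1, sizeof(physnode_map));
}

/* Old loop: steps by one section from an unaligned start pfn, so the
 * loop can terminate before reaching the section that contains end - 1. */
void mark_old(int nid, unsigned long start, unsigned long end)
{
	unsigned long pfn;

	for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) {
		assert(pfn / PAGES_PER_SECTION < MAX_SECTIONS);
		physnode_map[pfn / PAGES_PER_SECTION] = nid;
	}
}

/* Patched loop: iterates over section indices, first to last inclusive. */
void mark_new(int nid, unsigned long start, unsigned long end)
{
	unsigned long sect, endsect;

	endsect = (end - 1) / PAGES_PER_SECTION;
	assert(endsect < MAX_SECTIONS);
	for (sect = start / PAGES_PER_SECTION; sect <= endsect; ++sect)
		physnode_map[sect] = nid;
}
```

With the 32MB -> 96MB range from your example, mark_old() touches only
section 0, because the second iteration starts at pfn == end. mark_new()
sets sections 0 and 1.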

> Can you just align 'start' down to the section's start and 'end' up to
> the end of the section that contains it?  I guess you do that
> implicitly, but you should be able to do it without refactoring the for
> loop entirely.

Works for me.
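
A rough userspace sketch of that approach, keeping the original loop and
only aligning the bounds first. The kernel has round_down()/round_up()
for this; the ROUND_DOWN()/ROUND_UP() macros below are local stand-ins
that assume a power-of-two alignment. The page and section sizes,
MAX_SECTIONS, MB_TO_PFN() and mark_aligned() are likewise assumptions
made for the illustration, not the final patch.

```c
#include <assert.h>
#include <string.h>

#define PAGE_SHIFT        12                          /* assumption: 4 KiB pages */
#define SECTION_SHIFT     26                          /* assumption: 64 MiB sections */
#define PAGES_PER_SECTION (1UL << (SECTION_SHIFT - PAGE_SHIFT))
#define MAX_SECTIONS      64                          /* hypothetical table size */
#define MB_TO_PFN(mb)     ((unsigned long)(mb) << (20 - PAGE_SHIFT))

/* Local stand-ins for the kernel helpers; valid only for power-of-two y. */
#define ROUND_DOWN(x, y)  ((x) & ~((y) - 1))
#define ROUND_UP(x, y)    (((x) + (y) - 1) & ~((y) - 1))

/* Stand-in for the kernel's physnode_map[]; -1 means "no node assigned". */
signed char physnode_map[MAX_SECTIONS];

void reset_map(void)
{
	memset(physnode_map, -1, sizeof(physnode_map));
}

/* Same loop as before the patch, but with start aligned down and end
 * aligned up, so every section overlapping [start, end) is visited. */
void mark_aligned(int nid, unsigned long start, unsigned long end)
{
	unsigned long pfn;

	start = ROUND_DOWN(start, PAGES_PER_SECTION);
	end = ROUND_UP(end, PAGES_PER_SECTION);
	for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) {
		assert(pfn / PAGES_PER_SECTION < MAX_SECTIONS);
		physnode_map[pfn / PAGES_PER_SECTION] = nid;
	}
}
```

For 32MB -> 96MB the bounds become 0 -> 128MB, so sections 0 and 1 are
both set. An already aligned range such as 0 -> 64MB is left unchanged
and still sets only section 0.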

Petr Tesarik
