Re: [PATCH] sparsemem/bootmem: catch greater than section size allocations

2012-02-29 Thread Johannes Weiner
On Tue, Feb 28, 2012 at 12:11:51PM -0800, Nishanth Aravamudan wrote:
 On 28.02.2012 [14:53:26 +0100], Johannes Weiner wrote:
  On Fri, Feb 24, 2012 at 11:33:58AM -0800, Nishanth Aravamudan wrote:
   While testing AMS (Active Memory Sharing) / CMO (Cooperative Memory
   Overcommit) on powerpc, we tripped the following:
   
   kernel BUG at mm/bootmem.c:483!
   cpu 0x0: Vector: 700 (Program Check) at [c0c03940]
   pc: c0a62bd8: .alloc_bootmem_core+0x90/0x39c
   lr: c0a64bcc: .sparse_early_usemaps_alloc_node+0x84/0x29c
   sp: c0c03bc0
  msr: 80021032
 current = 0xc0b0cce0
 paca= 0xc1d8
   pid   = 0, comm = swapper
   kernel BUG at mm/bootmem.c:483!
   enter ? for help
   [c0c03c80] c0a64bcc
   .sparse_early_usemaps_alloc_node+0x84/0x29c
   [c0c03d50] c0a64f10 .sparse_init+0x12c/0x28c
   [c0c03e20] c0a474f4 .setup_arch+0x20c/0x294
   [c0c03ee0] c0a4079c .start_kernel+0xb4/0x460
   [c0c03f90] c0009670 .start_here_common+0x1c/0x2c
   
   This is
   
   BUG_ON(limit  goal + size  limit);
   
   and after some debugging, it seems that
   
 goal = 0x700
 limit = 0x800
   
   and sparse_early_usemaps_alloc_node -
   sparse_early_usemaps_alloc_pgdat_section - alloc_bootmem_section calls
   
 return alloc_bootmem_section(usemap_size() * count, section_nr);
   
   This is on a system with 8TB available via the AMS pool, and as a quirk
   of AMS in firmware, all of that memory shows up in node 0. So, we end up
   with an allocation that will fail the goal/limit constraints. In theory,
   we could fall-back to alloc_bootmem_node() in
   sparse_early_usemaps_alloc_node(), but since we actually have HOTREMOVE
   defined, we'll BUG_ON() instead. A simple solution appears to be to
   disable the limit check if the size of the allocation in
   alloc_bootmem_secition exceeds the section size.
  
  It makes sense to allow the usemaps to spill over to subsequent
  sections instead of panicking, so FWIW:
  
  Acked-by: Johannes Weiner han...@cmpxchg.org
  
  That being said, it would be good if check_usemap_section_nr() printed
  the cross-dependencies between pgdats and sections when the usemaps of
  a node spilled over to other sections than the ones holding the pgdat.
  
  How about this?
  
  ---
  From: Johannes Weiner han...@cmpxchg.org
  Subject: sparsemem/bootmem: catch greater than section size allocations fix
  
  If alloc_bootmem_section() no longer guarantees section-locality, we
  need check_usemap_section_nr() to print possible cross-dependencies
  between node descriptors and the usemaps allocated through it.
  
  Signed-off-by: Johannes Weiner han...@cmpxchg.org
  ---
  
  diff --git a/mm/sparse.c b/mm/sparse.c
  index 61d7cde..9e032dc 100644
  --- a/mm/sparse.c
  +++ b/mm/sparse.c
  @@ -359,6 +359,7 @@ static void __init 
  sparse_early_usemaps_alloc_node(unsigned long**usemap_map,
  continue;
  usemap_map[pnum] = usemap;
  usemap += size;
  +   check_usemap_section_nr(nodeid, usemap_map[pnum]);
  }
  return;
  }
 
 This makes sense to me -- ok if I fold it into the re-worked patch
 (based upon Mel's comments)?

Sure thing!

  Furthermore, I wonder if we can remove the sparse-specific stuff from
  bootmem.c as well, as now even more so than before, calculating the
  desired area is really none of bootmem's business.
  
  Would something like this be okay?
  
  ---
  From: Johannes Weiner han...@cmpxchg.org
  Subject: [patch] mm: remove sparsemem allocation details from the bootmem 
  allocator
  
  alloc_bootmem_section() derives allocation area constraints from the
  specified sparsemem section.  This is a bit specific for a generic
  memory allocator like bootmem, though, so move it over to sparsemem.
  
  Since __alloc_bootmem_node() already retries failed allocations with
  relaxed area constraints, the fallback code in sparsemem.c can be
  removed and the code becomes a bit more compact overall.
  
  Signed-off-by: Johannes Weiner han...@cmpxchg.org
 
 I've not tested it, but the intention seems sensible. I think it should
 remain a separate change.

Yes, I agree.  I'll resend it in a bit as stand-alone patch.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH] sparsemem/bootmem: catch greater than section size allocations

2012-02-28 Thread Johannes Weiner
On Fri, Feb 24, 2012 at 11:33:58AM -0800, Nishanth Aravamudan wrote:
 While testing AMS (Active Memory Sharing) / CMO (Cooperative Memory
 Overcommit) on powerpc, we tripped the following:
 
 kernel BUG at mm/bootmem.c:483!
 cpu 0x0: Vector: 700 (Program Check) at [c0c03940]
 pc: c0a62bd8: .alloc_bootmem_core+0x90/0x39c
 lr: c0a64bcc: .sparse_early_usemaps_alloc_node+0x84/0x29c
 sp: c0c03bc0
msr: 80021032
   current = 0xc0b0cce0
   paca= 0xc1d8
 pid   = 0, comm = swapper
 kernel BUG at mm/bootmem.c:483!
 enter ? for help
 [c0c03c80] c0a64bcc
 .sparse_early_usemaps_alloc_node+0x84/0x29c
 [c0c03d50] c0a64f10 .sparse_init+0x12c/0x28c
 [c0c03e20] c0a474f4 .setup_arch+0x20c/0x294
 [c0c03ee0] c0a4079c .start_kernel+0xb4/0x460
 [c0c03f90] c0009670 .start_here_common+0x1c/0x2c
 
 This is
 
 BUG_ON(limit  goal + size  limit);
 
 and after some debugging, it seems that
 
   goal = 0x700
   limit = 0x800
 
 and sparse_early_usemaps_alloc_node -
 sparse_early_usemaps_alloc_pgdat_section - alloc_bootmem_section calls
 
   return alloc_bootmem_section(usemap_size() * count, section_nr);
 
 This is on a system with 8TB available via the AMS pool, and as a quirk
 of AMS in firmware, all of that memory shows up in node 0. So, we end up
 with an allocation that will fail the goal/limit constraints. In theory,
 we could fall-back to alloc_bootmem_node() in
 sparse_early_usemaps_alloc_node(), but since we actually have HOTREMOVE
 defined, we'll BUG_ON() instead. A simple solution appears to be to
 disable the limit check if the size of the allocation in
 alloc_bootmem_secition exceeds the section size.

It makes sense to allow the usemaps to spill over to subsequent
sections instead of panicking, so FWIW:

Acked-by: Johannes Weiner han...@cmpxchg.org

That being said, it would be good if check_usemap_section_nr() printed
the cross-dependencies between pgdats and sections when the usemaps of
a node spilled over to other sections than the ones holding the pgdat.

How about this?

---
From: Johannes Weiner han...@cmpxchg.org
Subject: sparsemem/bootmem: catch greater than section size allocations fix

If alloc_bootmem_section() no longer guarantees section-locality, we
need check_usemap_section_nr() to print possible cross-dependencies
between node descriptors and the usemaps allocated through it.

Signed-off-by: Johannes Weiner han...@cmpxchg.org
---

diff --git a/mm/sparse.c b/mm/sparse.c
index 61d7cde..9e032dc 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -359,6 +359,7 @@ static void __init sparse_early_usemaps_alloc_node(unsigned 
long**usemap_map,
continue;
usemap_map[pnum] = usemap;
usemap += size;
+   check_usemap_section_nr(nodeid, usemap_map[pnum]);
}
return;
}

---

Furthermore, I wonder if we can remove the sparse-specific stuff from
bootmem.c as well, as now even more so than before, calculating the
desired area is really none of bootmem's business.

Would something like this be okay?

---
From: Johannes Weiner han...@cmpxchg.org
Subject: [patch] mm: remove sparsemem allocation details from the bootmem 
allocator

alloc_bootmem_section() derives allocation area constraints from the
specified sparsemem section.  This is a bit specific for a generic
memory allocator like bootmem, though, so move it over to sparsemem.

Since __alloc_bootmem_node() already retries failed allocations with
relaxed area constraints, the fallback code in sparsemem.c can be
removed and the code becomes a bit more compact overall.

Signed-off-by: Johannes Weiner han...@cmpxchg.org
---
 include/linux/bootmem.h |3 ---
 mm/bootmem.c|   26 --
 mm/sparse.c |   29 +
 3 files changed, 9 insertions(+), 49 deletions(-)

diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h
index ab344a5..001c248 100644
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -135,9 +135,6 @@ extern void *__alloc_bootmem_low_node(pg_data_t *pgdat,
 extern int reserve_bootmem_generic(unsigned long addr, unsigned long size,
   int flags);
 
-extern void *alloc_bootmem_section(unsigned long size,
-  unsigned long section_nr);
-
 #ifdef CONFIG_HAVE_ARCH_ALLOC_REMAP
 extern void *alloc_remap(int nid, unsigned long size);
 #else
diff --git a/mm/bootmem.c b/mm/bootmem.c
index 7bc0557..d34026c 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -756,32 +756,6 @@ void * __init __alloc_bootmem_node_high(pg_data_t *pgdat, 
unsigned long size,
 
 }
 
-#ifdef CONFIG_SPARSEMEM
-/**
- * alloc_bootmem_section - allocate boot memory from a specific section
- * @size: size of the request in bytes
- * 

Re: [PATCH] sparsemem/bootmem: catch greater than section size allocations

2012-02-28 Thread Mel Gorman
On Fri, Feb 24, 2012 at 11:33:58AM -0800, Nishanth Aravamudan wrote:
 While testing AMS (Active Memory Sharing) / CMO (Cooperative Memory
 Overcommit) on powerpc, we tripped the following:
 
 kernel BUG at mm/bootmem.c:483!
 cpu 0x0: Vector: 700 (Program Check) at [c0c03940]
 pc: c0a62bd8: .alloc_bootmem_core+0x90/0x39c
 lr: c0a64bcc: .sparse_early_usemaps_alloc_node+0x84/0x29c
 sp: c0c03bc0
msr: 80021032
   current = 0xc0b0cce0
   paca= 0xc1d8
 pid   = 0, comm = swapper
 kernel BUG at mm/bootmem.c:483!
 enter ? for help
 [c0c03c80] c0a64bcc
 .sparse_early_usemaps_alloc_node+0x84/0x29c
 [c0c03d50] c0a64f10 .sparse_init+0x12c/0x28c
 [c0c03e20] c0a474f4 .setup_arch+0x20c/0x294
 [c0c03ee0] c0a4079c .start_kernel+0xb4/0x460
 [c0c03f90] c0009670 .start_here_common+0x1c/0x2c
 
 This is
 
 BUG_ON(limit  goal + size  limit);
 
 and after some debugging, it seems that
 
   goal = 0x700
   limit = 0x800
 
 and sparse_early_usemaps_alloc_node -
 sparse_early_usemaps_alloc_pgdat_section - alloc_bootmem_section calls
 
   return alloc_bootmem_section(usemap_size() * count, section_nr);
 
 This is on a system with 8TB available via the AMS pool, and as a quirk
 of AMS in firmware, all of that memory shows up in node 0. So, we end up
 with an allocation that will fail the goal/limit constraints. In theory,
 we could fall-back to alloc_bootmem_node() in
 sparse_early_usemaps_alloc_node(), but since we actually have HOTREMOVE
 defined, we'll BUG_ON() instead. A simple solution appears to be to
 disable the limit check if the size of the allocation in
 alloc_bootmem_secition exceeds the section size.
 
 Signed-off-by: Nishanth Aravamudan n...@us.ibm.com
 Cc: Dave Hansen haveb...@us.ibm.com
 Cc: Anton Blanchard an...@au1.ibm.com
 Cc: Paul Mackerras pau...@samba.org
 Cc: Ben Herrenschmidt b...@kernel.crashing.org
 Cc: Robert Jennings r...@linux.vnet.ibm.com
 Cc: linux...@kvack.org
 Cc: linuxppc-dev@lists.ozlabs.org
 ---
  include/linux/mmzone.h |2 ++
  mm/bootmem.c   |5 -
  2 files changed, 6 insertions(+), 1 deletions(-)
 
 diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
 index 650ba2f..4176834 100644
 --- a/include/linux/mmzone.h
 +++ b/include/linux/mmzone.h
 @@ -967,6 +967,8 @@ static inline unsigned long early_pfn_to_nid(unsigned 
 long pfn)
   * PA_SECTION_SHIFT  physical address to/from section number
   * PFN_SECTION_SHIFT pfn to/from section number
   */
 +#define BYTES_PER_SECTION(1UL  SECTION_SIZE_BITS)
 +
  #define SECTIONS_SHIFT   (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)
  
  #define PA_SECTION_SHIFT (SECTION_SIZE_BITS)
 diff --git a/mm/bootmem.c b/mm/bootmem.c
 index 668e94d..5cbbc76 100644
 --- a/mm/bootmem.c
 +++ b/mm/bootmem.c
 @@ -770,7 +770,10 @@ void * __init alloc_bootmem_section(unsigned long size,
  
   pfn = section_nr_to_pfn(section_nr);
   goal = pfn  PAGE_SHIFT;
 - limit = section_nr_to_pfn(section_nr + 1)  PAGE_SHIFT;
 + if (size  BYTES_PER_SECTION)
 + limit = 0;
 + else
 + limit = section_nr_to_pfn(section_nr + 1)  PAGE_SHIFT;

As it's ok to spill the allocation over to an adjacent section, why not
just make limit==0 unconditionally. That would avoid defining
BYTES_PER_SECTION.

-- 
Mel Gorman
SUSE Labs
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH] sparsemem/bootmem: catch greater than section size allocations

2012-02-28 Thread Nishanth Aravamudan
On 28.02.2012 [14:53:26 +0100], Johannes Weiner wrote:
 On Fri, Feb 24, 2012 at 11:33:58AM -0800, Nishanth Aravamudan wrote:
  While testing AMS (Active Memory Sharing) / CMO (Cooperative Memory
  Overcommit) on powerpc, we tripped the following:
  
  kernel BUG at mm/bootmem.c:483!
  cpu 0x0: Vector: 700 (Program Check) at [c0c03940]
  pc: c0a62bd8: .alloc_bootmem_core+0x90/0x39c
  lr: c0a64bcc: .sparse_early_usemaps_alloc_node+0x84/0x29c
  sp: c0c03bc0
 msr: 80021032
current = 0xc0b0cce0
paca= 0xc1d8
  pid   = 0, comm = swapper
  kernel BUG at mm/bootmem.c:483!
  enter ? for help
  [c0c03c80] c0a64bcc
  .sparse_early_usemaps_alloc_node+0x84/0x29c
  [c0c03d50] c0a64f10 .sparse_init+0x12c/0x28c
  [c0c03e20] c0a474f4 .setup_arch+0x20c/0x294
  [c0c03ee0] c0a4079c .start_kernel+0xb4/0x460
  [c0c03f90] c0009670 .start_here_common+0x1c/0x2c
  
  This is
  
  BUG_ON(limit  goal + size  limit);
  
  and after some debugging, it seems that
  
  goal = 0x700
  limit = 0x800
  
  and sparse_early_usemaps_alloc_node -
  sparse_early_usemaps_alloc_pgdat_section - alloc_bootmem_section calls
  
  return alloc_bootmem_section(usemap_size() * count, section_nr);
  
  This is on a system with 8TB available via the AMS pool, and as a quirk
  of AMS in firmware, all of that memory shows up in node 0. So, we end up
  with an allocation that will fail the goal/limit constraints. In theory,
  we could fall-back to alloc_bootmem_node() in
  sparse_early_usemaps_alloc_node(), but since we actually have HOTREMOVE
  defined, we'll BUG_ON() instead. A simple solution appears to be to
  disable the limit check if the size of the allocation in
  alloc_bootmem_secition exceeds the section size.
 
 It makes sense to allow the usemaps to spill over to subsequent
 sections instead of panicking, so FWIW:
 
 Acked-by: Johannes Weiner han...@cmpxchg.org
 
 That being said, it would be good if check_usemap_section_nr() printed
 the cross-dependencies between pgdats and sections when the usemaps of
 a node spilled over to other sections than the ones holding the pgdat.
 
 How about this?
 
 ---
 From: Johannes Weiner han...@cmpxchg.org
 Subject: sparsemem/bootmem: catch greater than section size allocations fix
 
 If alloc_bootmem_section() no longer guarantees section-locality, we
 need check_usemap_section_nr() to print possible cross-dependencies
 between node descriptors and the usemaps allocated through it.
 
 Signed-off-by: Johannes Weiner han...@cmpxchg.org
 ---
 
 diff --git a/mm/sparse.c b/mm/sparse.c
 index 61d7cde..9e032dc 100644
 --- a/mm/sparse.c
 +++ b/mm/sparse.c
 @@ -359,6 +359,7 @@ static void __init 
 sparse_early_usemaps_alloc_node(unsigned long**usemap_map,
   continue;
   usemap_map[pnum] = usemap;
   usemap += size;
 + check_usemap_section_nr(nodeid, usemap_map[pnum]);
   }
   return;
   }

This makes sense to me -- ok if I fold it into the re-worked patch
(based upon Mel's comments)?

 ---
 
 Furthermore, I wonder if we can remove the sparse-specific stuff from
 bootmem.c as well, as now even more so than before, calculating the
 desired area is really none of bootmem's business.
 
 Would something like this be okay?
 
 ---
 From: Johannes Weiner han...@cmpxchg.org
 Subject: [patch] mm: remove sparsemem allocation details from the bootmem 
 allocator
 
 alloc_bootmem_section() derives allocation area constraints from the
 specified sparsemem section.  This is a bit specific for a generic
 memory allocator like bootmem, though, so move it over to sparsemem.
 
 Since __alloc_bootmem_node() already retries failed allocations with
 relaxed area constraints, the fallback code in sparsemem.c can be
 removed and the code becomes a bit more compact overall.
 
 Signed-off-by: Johannes Weiner han...@cmpxchg.org

I've not tested it, but the intention seems sensible. I think it should
remain a separate change.

Thanks,
Nish

-- 
Nishanth Aravamudan n...@us.ibm.com
IBM Linux Technology Center

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH] sparsemem/bootmem: catch greater than section size allocations

2012-02-24 Thread Nishanth Aravamudan
While testing AMS (Active Memory Sharing) / CMO (Cooperative Memory
Overcommit) on powerpc, we tripped the following:

kernel BUG at mm/bootmem.c:483!
cpu 0x0: Vector: 700 (Program Check) at [c0c03940]
pc: c0a62bd8: .alloc_bootmem_core+0x90/0x39c
lr: c0a64bcc: .sparse_early_usemaps_alloc_node+0x84/0x29c
sp: c0c03bc0
   msr: 80021032
  current = 0xc0b0cce0
  paca= 0xc1d8
pid   = 0, comm = swapper
kernel BUG at mm/bootmem.c:483!
enter ? for help
[c0c03c80] c0a64bcc
.sparse_early_usemaps_alloc_node+0x84/0x29c
[c0c03d50] c0a64f10 .sparse_init+0x12c/0x28c
[c0c03e20] c0a474f4 .setup_arch+0x20c/0x294
[c0c03ee0] c0a4079c .start_kernel+0xb4/0x460
[c0c03f90] c0009670 .start_here_common+0x1c/0x2c

This is

BUG_ON(limit  goal + size  limit);

and after some debugging, it seems that

goal = 0x700
limit = 0x800

and sparse_early_usemaps_alloc_node -
sparse_early_usemaps_alloc_pgdat_section - alloc_bootmem_section calls

return alloc_bootmem_section(usemap_size() * count, section_nr);

This is on a system with 8TB available via the AMS pool, and as a quirk
of AMS in firmware, all of that memory shows up in node 0. So, we end up
with an allocation that will fail the goal/limit constraints. In theory,
we could fall-back to alloc_bootmem_node() in
sparse_early_usemaps_alloc_node(), but since we actually have HOTREMOVE
defined, we'll BUG_ON() instead. A simple solution appears to be to
disable the limit check if the size of the allocation in
alloc_bootmem_secition exceeds the section size.

Signed-off-by: Nishanth Aravamudan n...@us.ibm.com
Cc: Dave Hansen haveb...@us.ibm.com
Cc: Anton Blanchard an...@au1.ibm.com
Cc: Paul Mackerras pau...@samba.org
Cc: Ben Herrenschmidt b...@kernel.crashing.org
Cc: Robert Jennings r...@linux.vnet.ibm.com
Cc: linux...@kvack.org
Cc: linuxppc-dev@lists.ozlabs.org
---
 include/linux/mmzone.h |2 ++
 mm/bootmem.c   |5 -
 2 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 650ba2f..4176834 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -967,6 +967,8 @@ static inline unsigned long early_pfn_to_nid(unsigned long 
pfn)
  * PA_SECTION_SHIFTphysical address to/from section number
  * PFN_SECTION_SHIFT   pfn to/from section number
  */
+#define BYTES_PER_SECTION  (1UL  SECTION_SIZE_BITS)
+
 #define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)
 
 #define PA_SECTION_SHIFT   (SECTION_SIZE_BITS)
diff --git a/mm/bootmem.c b/mm/bootmem.c
index 668e94d..5cbbc76 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -770,7 +770,10 @@ void * __init alloc_bootmem_section(unsigned long size,
 
pfn = section_nr_to_pfn(section_nr);
goal = pfn  PAGE_SHIFT;
-   limit = section_nr_to_pfn(section_nr + 1)  PAGE_SHIFT;
+   if (size  BYTES_PER_SECTION)
+   limit = 0;
+   else
+   limit = section_nr_to_pfn(section_nr + 1)  PAGE_SHIFT;
bdata = bootmem_node_data[early_pfn_to_nid(pfn)];
 
return alloc_bootmem_core(bdata, size, SMP_CACHE_BYTES, goal, limit);
-- 
1.7.5.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev