g the buddy order to
> > something > MAX_ORDER -1 on that path?
>
> Agreed.
We would need to return the supersized block to the huge page pool and not
to the buddy allocator. There is a special callback in the compound page
so that you can call an alternate free function that is not t
On Thu, 1 Jun 2017, Hugh Dickins wrote:
> SLUB versus SLAB, cpu versus memory? Since someone has taken the
> trouble to write it with ctors in the past, I didn't feel on firm
> enough ground to recommend such a change. But it may be obvious
> to someone else that your suggestion would be better
On Thu, 1 Jun 2017, Hugh Dickins wrote:
> Thanks a lot for working that out. Makes sense, fully understood now,
> nothing to worry about (though makes one wonder whether it's efficient
> to use ctors on high-alignment caches; or whether an internal "zero-me"
> ctor would be useful).
Use kzalloc
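To illustrate the suggestion, here is a minimal userspace sketch of the semantic difference: the real kzalloc() is kmalloc() plus __GFP_ZERO and zeroes on every allocation, while a slab constructor runs only when a slab page is populated. The stub below stands in for the kernel API with libc calls and is only an illustration, not kernel code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for kzalloc(): allocate and zero on every call,
 * which is what "use kzalloc" buys over a zeroing constructor. */
static void *kzalloc_stub(size_t size)
{
	void *p = malloc(size);

	if (p)
		memset(p, 0, size);	/* zeroed on each allocation */
	return p;
}
```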
On Thu, 1 Jun 2017, Hugh Dickins wrote:
> CONFIG_SLUB_DEBUG_ON=y. My SLAB|SLUB config options are
>
> CONFIG_SLUB_DEBUG=y
> # CONFIG_SLUB_MEMCG_SYSFS_ON is not set
> # CONFIG_SLAB is not set
> CONFIG_SLUB=y
> # CONFIG_SLAB_FREELIST_RANDOM is not set
> CONFIG_SLUB_CPU_PARTIAL=y
>
> > I am curious as to what is going on there. Do you have the output from
> > these failed allocations?
>
> I thought the relevant output was in my mail. I did skip the Mem-Info
> dump, since that just seemed noise in this case: we know memory can get
> fragmented. What more output are you
On Wed, 31 May 2017, Michael Ellerman wrote:
> > SLUB: Unable to allocate memory on node -1, gfp=0x14000c0(GFP_KERNEL)
> > cache: pgtable-2^12, object size: 32768, buffer size: 65536, default order: 4, min order: 4
> > pgtable-2^12 debugging increased min order, use slub_debug=O to
On Tue, 30 May 2017, Hugh Dickins wrote:
> I wanted to try removing CONFIG_SLUB_DEBUG, but didn't succeed in that:
> it seemed to be a hard requirement for something, but I didn't find what.
CONFIG_SLUB_DEBUG does not enable debugging. It only includes the code to
be able to enable it at
On Wed, 21 Sep 2016, Tejun Heo wrote:
> Hello, Nick.
>
> How have you been? :)
>
He is baack. Are we getting SL!B? ;-)
On Fri, 8 Jul 2016, Kees Cook wrote:
> Is check_valid_pointer() making sure the pointer is within the usable
> size? It seemed like it was checking that it was within the slub
> object (checks against s->size, wants it above base after moving
> pointer to include redzone, etc).
On Fri, 8 Jul 2016, Michael Ellerman wrote:
> > I wonder if this code should be using size_from_object() instead of s->size?
>
> Hmm, not sure. Who's SLUB maintainer? :)
Me.
s->size is the size of the whole object including debugging info etc.
ksize() gives you the actual usable size of an
in slub.c that is similar.
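The distinction being made can be sketched with a stub of the SLUB metadata layout: object_size is what the caller can use, while size also covers red zones and other debug info. The struct and numbers below are made up for illustration (they echo the pgtable-2^12 report above) and only mirror the field names in mm/slub.c.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for the two sizes under discussion. */
struct kmem_cache_stub {
	size_t object_size;	/* payload usable by the caller */
	size_t size;		/* whole object incl. debugging info */
};

/* ksize()-like accessor: report the usable size, not s->size. */
static size_t ksize_stub(const struct kmem_cache_stub *s)
{
	return s->object_size;
}
```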
Doh, somehow I convinced myself that there's #else and alloc_pages() is only
used for !CONFIG_NUMA so it doesn't matter. Here's a fixed version.
Acked-by: Christoph Lameter c...@linux.com
On Thu, 30 Jul 2015, Vlastimil Babka wrote:
numa_mem_id() is able to handle allocation from CPUs on memory-less nodes,
so it's a more robust fallback than the currently used numa_node_id().
Suggested-by: Christoph Lameter c...@linux.com
Signed-off-by: Vlastimil Babka vba...@suse.cz
Acked
Acked-by: Christoph Lameter c...@linux.com
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
On Thu, 30 Jul 2015, Vlastimil Babka wrote:
--- a/mm/slob.c
+++ b/mm/slob.c
	void *page;

-#ifdef CONFIG_NUMA
-	if (node != NUMA_NO_NODE)
-		page = alloc_pages_exact_node(node, gfp, order);
-	else
-#endif
-		page = alloc_pages(gfp, order);
+	page =
On Wed, 22 Jul 2015, David Rientjes wrote:
Eek, yeah, that does look bad. I'm not even sure the
	if (nid < 0)
		nid = numa_node_id();
is correct; I think this should be comparing to NUMA_NO_NODE rather than
all negative numbers, otherwise we silently ignore overflow and
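The point about NUMA_NO_NODE versus all negative numbers can be sketched as follows; the helper names are hypothetical stand-ins (numa_node_id_stub() plays the kernel's numa_node_id(), with the local node assumed to be 0):

```c
#include <assert.h>
#include <stdbool.h>

#define NUMA_NO_NODE (-1)

static int numa_node_id_stub(void) { return 0; }

/* Only the NUMA_NO_NODE sentinel should fall back to the local node;
 * any other negative nid is likely an overflowed or corrupted node id
 * and should be rejected rather than silently remapped. */
static bool resolve_nid(int nid, int *out)
{
	if (nid == NUMA_NO_NODE) {
		*out = numa_node_id_stub();
		return true;
	}
	if (nid < 0)
		return false;	/* overflow: don't ignore it */
	*out = nid;
	return true;
}
```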
On Tue, 21 Jul 2015, Vlastimil Babka wrote:
The function alloc_pages_exact_node() was introduced in 6484eb3e2a81 (page
allocator: do not check NUMA node ID when the caller knows the node is valid)
as an optimized variant of alloc_pages_node(), that doesn't allow the node id
to be -1.
On Wed, 29 Oct 2014, Michael Ellerman wrote:
#define __ARCH_IRQ_STAT
-#define local_softirq_pending()	__get_cpu_var(irq_stat).__softirq_pending
+#define local_softirq_pending()	__this_cpu_read(irq_stat.__softirq_pending)
+#define set_softirq_pending(x)
Ping? We are planning to remove support for __get_cpu_var in the
3.19 merge period. I can move the definition for __get_cpu_var into the
powerpc per cpu definition instead if we cannot get this merged?
On Tue, 21 Oct 2014, Christoph Lameter wrote:
This still has not been merged and now powerpc
On Tue, 28 Oct 2014, Michael Ellerman wrote:
I'm happy to put it in a topic branch for 3.19, or move the definition or
whatever, your choice Christoph.
Get the patch merged please.
of a per cpu variable
DEFINE_PER_CPU(int, y);
__get_cpu_var(y)++
Converts to
__this_cpu_inc(y)
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
CC: Paul Mackerras pau...@samba.org
Signed-off-by: Christoph Lameter c...@linux.com
---
arch/powerpc/include/asm
On Wed, 13 Aug 2014, Nishanth Aravamudan wrote:
+++ b/include/linux/topology.h
@@ -119,11 +119,20 @@ static inline int numa_node_id(void)
* Use the accessor functions set_numa_mem(), numa_mem_id() and cpu_to_mem().
*/
DECLARE_PER_CPU(int, _numa_mem_);
+extern int
Gobbledygook due to missing Mime header.
On Thu, 12 Jun 2014, David Laight wrote:
This is under Ubuntu Utopic Unicorn on a Power 8 system while simply
trying to build with the Ubuntu standard kernel config. It could be that
these issues come about because we do not have an rc1 yet but I wanted to
give some early notice. Also this is a new arch to me so I may not be
aware of how
Looking at arch/powerpc/include/asm/percpu.h I see that the per cpu offset
comes from a local_paca field and local_paca is in r13. That means that
for all percpu operations we first have to determine the address through a
memory access.
Would it be possible to put the paca at the beginning of
On Mon, 19 May 2014, Nishanth Aravamudan wrote:
I'm seeing a panic at boot with this change on an LPAR which actually
has no Node 0. Here's what I think is happening:
start_kernel
...
-> setup_per_cpu_areas
-> pcpu_embed_first_chunk
-> pcpu_fc_alloc
On Mon, 31 Mar 2014, Nishanth Aravamudan wrote:
Yep. The node exists, it's just fully exhausted at boot (due to the
presence of 16GB pages reserved at boot-time).
Well if you want us to support that then I guess you need to propose
patches to address this issue.
I'd appreciate a bit more
On Thu, 27 Mar 2014, Nishanth Aravamudan wrote:
That looks to be the correct way to handle things. Maybe mark the node as
offline or somehow not present so that the kernel ignores it.
This is a SLUB condition:
mm/slub.c::early_kmem_cache_node_alloc():
...
page =
On Mon, 24 Mar 2014, Nishanth Aravamudan wrote:
Anyone have any ideas here?
Don't do that? Check on boot to not allow exhausting a node with huge
pages?
On Tue, 25 Mar 2014, Nishanth Aravamudan wrote:
On 25.03.2014 [11:17:57 -0500], Christoph Lameter wrote:
On Mon, 24 Mar 2014, Nishanth Aravamudan wrote:
Anyone have any ideas here?
Don't do that? Check on boot to not allow exhausting a node with huge
pages?
Gigantic hugepages
On Tue, 25 Mar 2014, Nishanth Aravamudan wrote:
On power, very early, we find the 16G pages (gpages in the powerpc arch
code) in the device-tree:
early_setup ->
early_init_mmu ->
htab_initialize ->
htab_init_page_sizes ->
On Tue, 11 Mar 2014, Nishanth Aravamudan wrote:
I have a P7 system that has no node0, but a node0 shows up in numactl
--hardware, which has no cpus and no memory (and no PCI devices):
Well as you see from the code there has been so far the assumption that
node 0 has memory. I have never run a
On Fri, 21 Feb 2014, Nishanth Aravamudan wrote:
I added two calls to local_memory_node(), I *think* both are necessary,
but am willing to be corrected.
One is in map_cpu_to_node() and one is in start_secondary(). The
start_secondary() path is fine, AFAICT, as we are up running at that
On Mon, 24 Feb 2014, Joonsoo Kim wrote:
It will not commonly get there because of the tracking. Instead a per cpu
object will be used.
get_partial_node() always fails even if there are some partial slab on
memoryless node's neareast node.
Correct and that leads to a page allocator
On Wed, 19 Feb 2014, David Rientjes wrote:
On Tue, 18 Feb 2014, Christoph Lameter wrote:
It's an optimization to avoid calling the page allocator to figure out if
there is memory available on a particular node.
Thus this patch breaks with memory hot-add for a memoryless node.
As soon
On Wed, 19 Feb 2014, Nishanth Aravamudan wrote:
We can call local_memory_node() before the zonelists are setup. In that
case, first_zones_zonelist() will not set zone and the reference to
zone->node will Oops. Catch this case, and, since we are presumably running
very early, just return that any
On Tue, 18 Feb 2014, Nishanth Aravamudan wrote:
the performance impact of the underlying NUMA configuration. I guess we
could special-case memoryless/cpuless configurations somewhat, but I
don't think there's any reason to do that if we can make memoryless-node
support work in-kernel?
Well
On Mon, 17 Feb 2014, Joonsoo Kim wrote:
On Wed, Feb 12, 2014 at 04:16:11PM -0600, Christoph Lameter wrote:
Here is another patch with some fixes. The additional logic is only
compiled in if CONFIG_HAVE_MEMORYLESS_NODES is set.
Subject: slub: Memoryless node support
Support memoryless
On Mon, 17 Feb 2014, Joonsoo Kim wrote:
On Wed, Feb 12, 2014 at 10:51:37PM -0800, Nishanth Aravamudan wrote:
Hi Joonsoo,
Also, given that only ia64 and (hopefully soon) ppc64 can set
CONFIG_HAVE_MEMORYLESS_NODES, does that mean x86_64 can't have
memoryless nodes present? Even with
On Tue, 18 Feb 2014, Nishanth Aravamudan wrote:
Well, on powerpc, with the hypervisor providing the resources and the
topology, you can have cpuless and memoryless nodes. I'm not sure how
fake the NUMA is -- as I think since the resources are virtualized to
be one system, it's logically
On Tue, 18 Feb 2014, Nishanth Aravamudan wrote:
We use the topology provided by the hypervisor, it does actually reflect
where CPUs and memory are, and their corresponding performance/NUMA
characteristics.
And so there are actually nodes without memory that have processors?
Can the hypervisor
to the
current available per cpu objects and if that is not available will
create a new slab using the page allocator to fallback from the
memoryless node to some other node.
Signed-off-by: Christoph Lameter c...@linux.com
Index: linux/mm/slub.c
On Mon, 10 Feb 2014, Joonsoo Kim wrote:
On Fri, Feb 07, 2014 at 12:51:07PM -0600, Christoph Lameter wrote:
Here is a draft of a patch to make this work with memoryless nodes.
The first thing is that we modify node_match to also match if we hit an
empty node. In that case we simply take
On Fri, 7 Feb 2014, Joonsoo Kim wrote:
This check would need to be something that checks for other contingencies
in the page allocator as well. A simple solution would be to actually run
a GFP_THISNODE alloc to see if you can grab a page from the proper node.
If that fails then fall back.
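The probe-then-fallback idea can be sketched like this; node_has_pages[] and both helpers are made-up stand-ins for the page allocator (nodes 0 and 3 are assumed memoryless), and the linear scan is a crude substitute for a real zonelist walk:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 4

static bool node_has_pages[MAX_NODES] = { false, true, true, false };

/* Stand-in for a GFP_THISNODE allocation: succeed only if the node
 * itself has pages; return the node the page came from, or -1. */
static int alloc_page_thisnode(int node)
{
	return node_has_pages[node] ? node : -1;
}

/* Probe the preferred node first; only widen the search if the node
 * genuinely has nothing. */
static int alloc_page_with_fallback(int node)
{
	int got = alloc_page_thisnode(node);
	int n;

	if (got >= 0)
		return got;
	for (n = 0; n < MAX_NODES; n++) {
		got = alloc_page_thisnode(n);
		if (got >= 0)
			return got;
	}
	return -1;
}
```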
Here is a draft of a patch to make this work with memoryless nodes.
The first thing is that we modify node_match to also match if we hit an
empty node. In that case we simply take the current slab if its there.
If there is no current slab then a regular allocation occurs with the
memoryless
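The modified node_match() described in the draft can be sketched as below; node_present_pages_stub() is a made-up stand-in (node 0 is assumed memoryless), and the real function operates on a struct page, not plain ints:

```c
#include <assert.h>
#include <stdbool.h>

#define NUMA_NO_NODE (-1)

static bool node_present_pages_stub(int node)
{
	return node != 0;	/* pretend node 0 is memoryless */
}

/* A request for a memoryless node is allowed to match the current cpu
 * slab, whatever node that slab actually came from. */
static bool node_match_stub(int slab_node, int node)
{
	if (node == NUMA_NO_NODE)
		return true;
	if (!node_present_pages_stub(node))
		return true;	/* empty node: take the current slab */
	return slab_node == node;
}
```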
On Fri, 7 Feb 2014, Joonsoo Kim wrote:
It seems like a better approach would be to do this when a node is brought
online and determine the fallback node based not on the zonelists as you
do here but rather on locality (such as through a SLIT if provided, see
node_distance()).
Hmm...
on that node.
On that node, page allocation always falls back to numa_mem_id() first. So
searching for a partial slab on numa_node_id() in that case is the proper solution
for the memoryless node case.
Acked-by: Christoph Lameter c...@linux.com
On Wed, 5 Feb 2014, Nishanth Aravamudan wrote:
Right so if we are ignoring the node then the simplest thing to do is to
not deactivate the current cpu slab but to take an object from it.
Ok, that's what Anton's patch does, I believe. Are you ok with that
patch as it is?
No. Again his
On Thu, 6 Feb 2014, David Rientjes wrote:
I think you'll need to send these to Andrew since he appears to be picking
up slub patches these days.
I can start managing merges again if Pekka no longer has the time.
On Thu, 6 Feb 2014, Joonsoo Kim wrote:
diff --git a/mm/slub.c b/mm/slub.c
index cc1f995..c851f82 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1700,6 +1700,14 @@ static void *get_partial(struct kmem_cache *s, gfp_t flags, int node,
	void *object;
	int searchnode = (node ==
On Tue, 4 Feb 2014, Nishanth Aravamudan wrote:
If the target node allocation fails (for whatever reason) then I would
recommend for simplicities sake to change the target node to
NUMA_NO_NODE and just take whatever is in the current cpu slab. A more
complex solution would be to look
On Mon, 3 Feb 2014, Nishanth Aravamudan wrote:
Yes, sorry for my lack of clarity. I meant Joonsoo's latest patch for
the $SUBJECT issue.
Hmmm... I am not sure that this is a general solution. The fallback to
other nodes can not only occur because a node has no memory as his patch
assumes.
If
On Mon, 3 Feb 2014, Nishanth Aravamudan wrote:
So what's the status of this patch? Christoph, do you think this is fine
as it is?
Certainly enabling CONFIG_HAVE_MEMORYLESS_NODES is the right thing to do and I
already acked the patch.
On Wed, 29 Jan 2014, Nishanth Aravamudan wrote:
exactly what the caller intends.
int searchnode = node;
if (node == NUMA_NO_NODE)
	searchnode = numa_mem_id();
if (!node_present_pages(node))
	searchnode = local_memory_node(node);
The difference in semantics from the previous is
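A self-contained sketch of that searchnode selection, with hypothetical stubs for the kernel helpers (node 0 is assumed memoryless with node 1 as its nearest node with memory); note the NUMA_NO_NODE case must be tested before node_present_pages(), since the sentinel is not a valid index:

```c
#include <assert.h>

#define NUMA_NO_NODE (-1)

static int numa_mem_id_stub(void)         { return 1; }
static int local_memory_node_stub(int n)  { return n == 0 ? 1 : n; }
static int node_present_pages_stub(int n) { return n != 0; }

/* Pick the node to search for partial slabs: honor the caller's node
 * when it has memory, otherwise redirect to the nearest memory node. */
static int pick_searchnode(int node)
{
	int searchnode = node;

	if (node == NUMA_NO_NODE)
		searchnode = numa_mem_id_stub();
	else if (!node_present_pages_stub(node))
		searchnode = local_memory_node_stub(node);
	return searchnode;
}
```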
On Tue, 28 Jan 2014, Nishanth Aravamudan wrote:
This helps about the same as David's patch -- but I found the reason
why! ppc64 doesn't set CONFIG_HAVE_MEMORYLESS_NODES :) Expect a patch
shortly for that and one other case I found.
Oww...
On Tue, 28 Jan 2014, Nishanth Aravamudan wrote:
Anton Blanchard found an issue with an LPAR that had no memory in Node
0. Christoph Lameter recommended, as one possible solution, to use
numa_mem_id() for locality of the nearest memory node-wise. However,
numa_mem_id() [and the other related
On Fri, 24 Jan 2014, David Rientjes wrote:
kmalloc_node(nid) and kmem_cache_alloc_node(nid) should fallback to nodes
other than nid when memory can't be allocated, these functions only
indicate a preference.
The nid passed indicates a preference unless __GFP_THISNODE is specified.
Then the
On Fri, 24 Jan 2014, Nishanth Aravamudan wrote:
As to cpu_to_node() being passed to kmalloc_node(), I think an
appropriate fix is to change that to cpu_to_mem()?
Yup.
Yeah, the default policy should be to fallback to local memory if the node
passed is memoryless.
Thanks!
I would
On Fri, 24 Jan 2014, Wanpeng Li wrote:
diff --git a/mm/slub.c b/mm/slub.c
index 545a170..a1c6040 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1700,6 +1700,9 @@ static void *get_partial(struct kmem_cache *s, gfp_t flags, int node,
	void *object;
	int searchnode = (node ==
On Mon, 20 Jan 2014, Wanpeng Li wrote:
+	enum zone_type high_zoneidx = gfp_zone(flags);
+	if (!node_present_pages(searchnode)) {
+		zonelist = node_zonelist(searchnode, flags);
+		for_each_zone_zonelist(zone, z, zonelist, high_zoneidx) {
+
On Tue, 6 Aug 2013, Wladislav Wiebe wrote:
ok, just saw in the slab/for-linus branch that this stuff is reverted again..
No that was only for the 3.11 merge by Linus. The 3.12 patches have not
been put into pekkas tree.
On Wed, 31 Jul 2013, Wladislav Wiebe wrote:
on a PPC 32-Bit board with a Linux Kernel v3.10.0 I see trouble with
kmalloc_slab.
Basically at system startup, something requests a size of 8388608 b,
but KMALLOC_MAX_SIZE has 4194304 b in our case. It points a WARNING at:
..
NIP [c0099fec]
buffers for proc fs?
Signed-off-by: Christoph Lameter c...@linux.com
Index: linux/fs/seq_file.c
===
--- linux.orig/fs/seq_file.c2013-07-10 14:03:15.367134544 -0500
+++ linux/fs/seq_file.c 2013-07-31 10:11:42.671736131 -0500
@@ -96,7
allocation. Use kmalloc_large().
This fixes the warning about large allocs but it will still cause
large contiguous allocs that could fail because of memory fragmentation.
Signed-off-by: Christoph Lameter c...@linux.com
Index: linux/fs/seq_file.c
On Wed, 31 Jul 2013, Wladislav Wiebe wrote:
Thanks for the point, do you plan to make kmalloc_large available for extern
access in a separate mainline patch?
Since kmalloc_large is statically defined in slub_def.h and when including it
to seq_file.c
we have a lot of conflicting types:
You
On Thu, 27 Dec 2012, Tang Chen wrote:
On 12/26/2012 11:30 AM, Kamezawa Hiroyuki wrote:
@@ -41,6 +42,7 @@ struct firmware_map_entry {
const char *type; /* type of the memory range */
struct list_headlist; /* entry for the linked list */
I was pointed by Glauber to the slab common code patches. I need some
more time to read the patches. Now I think the slab/slub changes in this
v3 are not needed, and can be ignored.
That may take some kernel cycles. You have a current issue here that needs
to be fixed.
On Mon, 9 Jul 2012, Yasuaki Ishimatsu wrote:
Even if you apply these patches, you cannot remove the physical memory
completely since these patches are still under development. I want you to
cooperate to improve the physical memory hot-remove. So please review these
patches and give your
itself.
Signed-off-by: Christoph Lameter c...@linux.com
---
mm/slub.c | 29 ++---
1 file changed, 14 insertions(+), 15 deletions(-)
Index: linux-2.6/mm/slub.c
===
--- linux-2.6.orig/mm/slub.c2012-06-11 08
On Mon, 25 Jun 2012, Li Zhong wrote:
This patch tries to kfree the cache name of pgtables cache if SLUB is
used, as SLUB duplicates the cache name, and the original one is leaked.
SLAB also does not free the name. Why would you have an #ifdef in there?
that alias
processing is done using the copy of the string and not
the string itself.
Signed-off-by: Christoph Lameter c...@linux.com
---
mm/slub.c | 29 ++---
1 file changed, 14 insertions(+), 15 deletions(-)
Index: linux-2.6/mm/slub.c
On Sun, 12 Jun 2011, Hugh Dickins wrote:
3.0-rc won't boot with SLUB on my PowerPC G5: kernel BUG at mm/slub.c:1950!
Bisected to 1759415e630e slub: Remove CONFIG_CMPXCHG_LOCAL ifdeffery.
After giving myself a medal for finding the BUG on line 1950 of mm/slub.c
(it's actually the
On Mon, 13 Jun 2011, Pekka Enberg wrote:
Hmmm.. The allocpercpu in alloc_kmem_cache_cpus should take care of the
alignment. Uhh.. I see that a patch that removes the #ifdef CMPXCHG_LOCAL
was not applied? Pekka?
This patch?
On Thu, 23 Sep 2010, Christian Riesch wrote:
It implies clock tuning in userspace for a potential sub microsecond
accurate clock. The clock accuracy will be limited by user space
latencies and noise. You won't be able to discipline the system clock
accurately.
Noise matters,
On Thu, 23 Sep 2010, john stultz wrote:
3) Further, the PTP hardware counter can be simply set to a new offset
to put it in line with the network time. This could cause trouble with
timekeeping much like unsynced TSCs do.
You can do the same for system time.
Settimeofday does allow
On Fri, 24 Sep 2010, Alan Cox wrote:
Whether you add new syscalls or do the fd passing using flags and hide
the ugly bits in glibc is another question.
Use device specific ioctls instead of syscalls?
On Thu, 23 Sep 2010, Richard Cochran wrote:
Support for obtaining timestamps from a PHC already exists via the
SO_TIMESTAMPING socket option, integrated in kernel version 2.6.30.
This patch set completes the picture by allow user space programs to
adjust the PHC and to control its
On Thu, 23 Sep 2010, Jacob Keller wrote:
There is a reason for not being able to shift posix clocks: The system has
one time base. The various clocks are contributing to maintaining that
system wide time.
Adjusting clocks is absolutely essential for proper functioning of the PTP
On Thu, 23 Sep 2010, john stultz wrote:
This was my initial gut reaction as well, but in the end, I agree with
Richard that in the case of one or multiple PTP hardware clocks, we
really can't abstract over the different time domains.
My (arguably still superficial) review of the source does
On Thu, 23 Sep 2010, Richard Cochran wrote:
+* Gianfar PTP clock nodes
+
+General Properties:
+
+ - compatible Should be fsl,etsec-ptp
+ - reg Offset and length of the register set for the device
+ - interrupts There should be at least two interrupts. Some devices
+
On Thu, 23 Sep 2010, Alan Cox wrote:
Please do not introduce useless additional layers for clock sync. Load
these ptp clocks like the other regular clock modules and make them sync
system time like any other clock.
I don't think you understand PTP. PTP has masters, a system can need to
On Thu, 23 Sep 2010, john stultz wrote:
The HPET or pit timesource are also quite slow these days. You only need
access periodically to essentially tune the TSC ratio.
If we're using the TSC, then we're not using the PTP clock as you
suggest. Further the HPET and PIT aren't used to steer
On Mon, 1 Mar 2010, Mel Gorman wrote:
Christoph, how feasible would it be to allow parallel reclaimers in
__zone_reclaim() that back off at a rate depending on the number of
reclaimers?
Not too hard. Zone locking is there but there may be a lot of bouncing
cachelines if you run it
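The parallel-reclaimer backoff being discussed could be sketched as below; this is only an assumed shape of the idea, with a plain int where the kernel would use an atomic_t per zone, and an arbitrary limit:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PARALLEL_RECLAIMERS 2

struct zone_stub {
	int reclaimers;	/* would be an atomic_t in the kernel */
};

/* Late arrivals back off to the next zone in the zonelist instead of
 * piling on and dumping out far more data than needed. */
static bool try_enter_zone_reclaim(struct zone_stub *z)
{
	if (z->reclaimers >= MAX_PARALLEL_RECLAIMERS)
		return false;	/* back off: try the next zone */
	z->reclaimers++;
	return true;
}

static void leave_zone_reclaim(struct zone_stub *z)
{
	z->reclaimers--;
}
```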
On Fri, 19 Feb 2010, Mel Gorman wrote:
The patch below sets a smaller value for RECLAIM_DISTANCE and thus enables
zone reclaim.
I've no problem with the patch anyway.
Nor do I.
- We seem to end up racing between zone_watermark_ok, zone_reclaim and
buffered_rmqueue. Since
On Fri, 19 Feb 2010, Balbir Singh wrote:
zone_reclaim. The others back off and try the next zone in the zonelist
instead. I'm not sure what the original intention was but most likely it
was to prevent too many parallel reclaimers in the same zone potentially
dumping out way more data than
On Thu, 15 Oct 2009, Gerald Schaefer wrote:
The pages allocated as __GFP_MOVABLE are used to store the list of pages
allocated by the balloon. They reference virtual addresses and it would
be fine for the kernel to migrate the physical pages for those, the
balloon would not notice this.
On Fri, 1 May 2009, Sam Ravnborg wrote:
Are there any specific reason why we do not support read_mostly on all
architectures?
Not that I know of.
read_mostly is about grouping rarely written data together
so what is needed is to introduce this section in the remaining
architectures.
Mel Gorman wrote:
With Erics patch and libhugetlbfs, we can automatically back text/data[1],
malloc[2] and stacks without source modification. Fairly soon, libhugetlbfs
will also be able to override shmget() to add SHM_HUGETLB. That should cover
a lot of the memory-intensive apps without
On Tue, 4 Mar 2008, Pekka Enberg wrote:
I suspect the WARN_ON() is bogus although I really don't know that part
of the code all too well. Mel?
The warn-on is valid. A situation should not exist that allows both flags
to
be set. I suspect if remove-set_migrateflags.patch
On Tue, 4 Mar 2008, Pekka Enberg wrote:
[c9edf5f0] [c00b56e4] .__alloc_pages_internal+0xf8/0x470
[c9edf6e0] [c00e0458] .kmem_getpages+0x8c/0x194
[c9edf770] [c00e1050] .fallback_alloc+0x194/0x254
[c9edf820]
On Tue, 4 Mar 2008, Pekka J Enberg wrote:
On Tue, 4 Mar 2008, Christoph Lameter wrote:
Slab allocations should never be passed these flags since the slabs do
their own thing there.
The following patch would clear these in slub:
Here's the same fix for SLAB:
That is an immediate fix
I think this is the correct fix.
The NUMA fallback logic should be passing local_flags to kmem_get_pages()
and not simply the flags.
Maybe a stable candidate since we are now simply
passing on flags to the page allocator on the fallback path.
Signed-off-by: Christoph Lameter [EMAIL PROTECTED
On Tue, 4 Mar 2008, Pekka Enberg wrote:
Looking at the code, it's triggerable in 2.6.24.3 at least. Why we don't have
a report yet, probably because (1) the default allocator is SLUB which doesn't
suffer from this and (2) you need a big honkin' NUMA box that causes fallback
allocations to
On Wed, 23 Jan 2008, Pekka J Enberg wrote:
I still think Christoph's kmem_getpages() patch is correct (to fix
cache_grow() oops) but I overlooked the fact that none the callers of
cache_alloc_node() deal with bootstrapping (with the exception of
__cache_alloc_node() that even has a
On Wed, 23 Jan 2008, Pekka J Enberg wrote:
Furthermore, don't let kmem_getpages() call alloc_pages_node() if the nodeid
passed to it is -1, as the latter will always translate that to numa_node_id(),
which might not have ->nodelists, which caused the invocation of fallback_alloc()
in the first
On Wed, 23 Jan 2008, Mel Gorman wrote:
This patch adds the necessary checks to make sure a kmem_list3 exists for
the preferred node used when growing the cache. If the preferred node has
no nodelist then the currently running node is used instead. This
problem only affects the SLAB allocator,
On Wed, 23 Jan 2008, Pekka J Enberg wrote:
Fine. But, why are we hitting fallback_alloc() in the first place? It's
definitely not because of missing ->nodelists as we do:
cache_cache.nodelists[node] = initkmem_list3[CACHE_CACHE];
before attempting to set up kmalloc caches. Now, if
On Wed, 23 Jan 2008, Pekka Enberg wrote:
I think Mel said that their configuration did work with 2.6.23
although I also wonder how that's possible. AFAIK there has been some
changes in the page allocator that might explain this. That is, if
kmem_getpages() returned pages for memoryless node
On Wed, 23 Jan 2008, Nishanth Aravamudan wrote:
Right, so it might have functioned before, but the correctness was
wobbly at best... Certainly the memoryless patch series has tightened
that up, but we missed these SLAB issues.
I see that your patch fixed Olaf's machine, Pekka. Nice work on
On Tue, 22 Jan 2008, Mel Gorman wrote:
After you reverted the slab memoryless node patch there should be per node
structures created for node 0 unless the node is marked offline. Is it? If
so then you are booting a cpu that is associated with an offline node.
I'll roll a patch that
On Tue, 22 Jan 2008, Olaf Hering wrote:
It crashes now in a different way if the patch below is applied:
Yup no l3 structure for the current node. We are early in boostrap. You
could just check if the l3 is there and if not just skip starting the
reaper? This will be redone later anyways. Not