Re: non-NUMA cache_free_alien() (was Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;)

2007-04-02 Thread Siddha, Suresh B
On Mon, Apr 02, 2007 at 05:23:20PM -0700, Christoph Lameter wrote:
> On Mon, 2 Apr 2007, Siddha, Suresh B wrote:
>
> > Set the node_possible_map at runtime. On a non NUMA system,
> > num_possible_nodes() will now say '1'
>
> How does this relate to nr_node_ids?
With this patch, nr_node_ids on

Re: non-NUMA cache_free_alien() (was Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;)

2007-04-02 Thread Christoph Lameter
On Mon, 2 Apr 2007, Siddha, Suresh B wrote:
> Set the node_possible_map at runtime. On a non NUMA system,
> num_possible_nodes() will now say '1'
How does this relate to nr_node_ids?

Re: non-NUMA cache_free_alien() (was Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;)

2007-04-02 Thread Siddha, Suresh B
On Fri, Mar 23, 2007 at 03:12:10PM +0100, Andi Kleen wrote:
> > But that is based on compile time option, isn't it? Perhaps I need
> > to use some other mechanism to find out the platform is not NUMA capable..
>
> We can probably make it runtime on x86. That will be needed sooner or
> later for

Re: non-NUMA cache_free_alien() (was Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;)

2007-03-23 Thread Andi Kleen
On Thu, Mar 22, 2007 at 06:25:16PM -0700, Christoph Lameter wrote:
> On Thu, 22 Mar 2007, Siddha, Suresh B wrote:
>
> > > You should check num_possible_nodes(), or nr_node_ids (this one is
> > > cheaper, it's a variable instead of a function call)
> >
> > But that is based on compile time

Re: non-NUMA cache_free_alien() (was Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;)

2007-03-23 Thread Andi Kleen
> But that is based on compile time option, isn't it? Perhaps I need
> to use some other mechanism to find out the platform is not NUMA capable..
We can probably make it runtime on x86. That will be needed sooner or later for correct NUMA hotplug support anyways.
-Andi

Re: non-NUMA cache_free_alien() (was Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;)

2007-03-22 Thread Christoph Lameter
On Thu, 22 Mar 2007, Siddha, Suresh B wrote:
> > You should check num_possible_nodes(), or nr_node_ids (this one is cheaper,
> > it's a variable instead of a function call)
>
> But that is based on compile time option, isn't it? Perhaps I need
> to use some other mechanism to find out the

Re: non-NUMA cache_free_alien() (was Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;)

2007-03-22 Thread Eric Dumazet
Siddha, Suresh B wrote: On Thu, Mar 22, 2007 at 11:12:39PM +0100, Eric Dumazet wrote: Siddha, Suresh B wrote: + if (num_online_nodes() == 1) + use_alien_caches = 0; + Unfortunately this part is wrong. oops. You should check num_possible_nodes(), or nr_node_ids

Re: non-NUMA cache_free_alien() (was Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;)

2007-03-22 Thread Siddha, Suresh B
On Thu, Mar 22, 2007 at 11:12:39PM +0100, Eric Dumazet wrote:
> Siddha, Suresh B wrote:
> >+    if (num_online_nodes() == 1)
> >+        use_alien_caches = 0;
> >+
>
> Unfortunately this part is wrong.
oops.
> You should check num_possible_nodes(), or nr_node_ids (this one is cheaper,

Re: non-NUMA cache_free_alien() (was Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;)

2007-03-22 Thread Eric Dumazet
Siddha, Suresh B wrote: Christoph, While we are at this topic, recently I had reports that cache_free_alien() is costly on non NUMA platforms too (similar to the cache miss issues that Eric was referring to on NUMA) and the appended patch seems to fix it for non NUMA at least. Appended patch

Re: non-NUMA cache_free_alien() (was Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;)

2007-03-22 Thread Christoph Lameter
On Thu, 22 Mar 2007, Siddha, Suresh B wrote:
> @@ -1394,6 +1394,9 @@ void __init kmem_cache_init(void)
>     int order;
>     int node;
>
> +   if (num_online_nodes() == 1)
> +       use_alien_caches = 0;
> +
What happens if you bring up a second node?
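[Editor's sketch] The objection to the hunk above can be illustrated with a small user-space model. This is not kernel code: `struct node_masks`, `MAX_NODES` and the helper names are hypothetical stand-ins for the kernel's node maps, and the only assumption taken from the thread is that the online set can grow after boot while the possible set cannot.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8 /* hypothetical stand-in for the kernel's node limit */

/* Toy model of two node masks: bit n set means node n is in the set. */
struct node_masks {
    unsigned possible; /* nodes that may ever exist on this machine */
    unsigned online;   /* nodes that are up right now */
};

/* Count the nodes present in a mask. */
static int count_nodes(unsigned mask)
{
    int n = 0;
    for (int i = 0; i < MAX_NODES; i++)
        if (mask & (1u << i))
            n++;
    return n;
}

/* Decision as in the posted hunk: based on online nodes only. */
static bool use_alien_by_online(const struct node_masks *m)
{
    return count_nodes(m->online) != 1;
}

/* Decision as suggested in the replies: based on possible nodes. */
static bool use_alien_by_possible(const struct node_masks *m)
{
    return count_nodes(m->possible) != 1;
}

/* Simulate bringing up a node; it must already be marked possible. */
static void bring_node_online(struct node_masks *m, int node)
{
    assert(node >= 0 && node < MAX_NODES);
    assert(m->possible & (1u << node));
    m->online |= 1u << node;
}
```

With `possible = 0x3` and `online = 0x1` at init time, the online-based check turns alien caches off, and a later `bring_node_online(&m, 1)` leaves that decision stale. The possible-based check keeps them on from the start and only disables them when a single node can ever exist.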

non-NUMA cache_free_alien() (was Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;)

2007-03-22 Thread Siddha, Suresh B
Christoph, While we are at this topic, recently I had reports that cache_free_alien() is costly on non NUMA platforms too (similar to the cache miss issues that Eric was referring to on NUMA) and the appended patch seems to fix it for non NUMA at least. Appended patch gives a nice 1% perf

Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;

2007-03-21 Thread Christoph Lameter
On Wed, 21 Mar 2007, Eric Dumazet wrote:
> If numa_node_id() is equal to the node of the page containing the first byte
> of the object, then object is on the local node. Or what ?
No. The slab (the page you are referring to) may have been allocated for another node and been tracked via the

Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;

2007-03-21 Thread Eric Dumazet
Christoph Lameter wrote: On Wed, 21 Mar 2007, Eric Dumazet wrote: The fast path is to put the pointer into the cpu array cache. This object might be given back some cycles later, because of a kmem_cache_alloc() : No need to access the two cache lines (struct page, struct slab) If you do

Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;

2007-03-21 Thread Christoph Lameter
On Wed, 21 Mar 2007, Eric Dumazet wrote:
> The fast path is to put the pointer, into the cpu array cache. This object
> might be given back some cycles later, because of a kmem_cache_alloc() : No
> need to access the two cache lines (struct page, struct slab)
If you do that then the slab will no

Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;

2007-03-20 Thread Eric Dumazet
Christoph Lameter wrote: On Tue, 20 Mar 2007, Eric Dumazet wrote: I understand we want to do special things (fallback and such tricks) at allocation time, but I believe that we can just trust the real nid of memory at free time. Sorry no. The node at allocation time determines which node

Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;

2007-03-20 Thread Christoph Lameter
On Wed, 21 Mar 2007, Andi Kleen wrote:
> > We usually use page_to_nid(). Sure this will determine the node the object
> > resides on. But this may not be the node on which the slab is tracked
> > since there may have been a fallback at alloc time.
>
> How about your slab rewrite? I assume it

Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;

2007-03-20 Thread Andi Kleen
> We usually use page_to_nid(). Sure this will determine the node the object
> resides on. But this may not be the node on which the slab is tracked
> since there may have been a fallback at alloc time.
How about your slab rewrite? I assume it would make more sense to fix such problems in that

Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;

2007-03-20 Thread Christoph Lameter
On Tue, 20 Mar 2007, Eric Dumazet wrote:
> I understand we want to do special things (fallback and such tricks) at
> allocation time, but I believe that we can just trust the real nid of memory
> at free time.
Sorry no. The node at allocation time determines which node specific structure tracks
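[Editor's sketch] The fallback case described in this reply can be modelled in a few lines of user-space C. Everything here is a hypothetical toy (`toy_slab`, `toy_alloc_page`, `toy_cache_grow`, the two-node `free_pages` table). It assumes only what the thread states: the page allocator may fall back to another node, and the slab stays tracked under the node that was requested at allocation time.

```c
#include <assert.h>

#define NR_NODES 2 /* hypothetical two-node machine */

/* Toy free-page counters per node; node 0 starts out exhausted. */
static int free_pages[NR_NODES] = { 0, 4 };

struct toy_slab {
    int nodeid;    /* node whose per-node lists track this slab */
    int page_node; /* node the backing page physically lives on */
};

/* Allocate one page, preferring `node` but falling back to other nodes. */
static int toy_alloc_page(int node)
{
    assert(node >= 0 && node < NR_NODES);
    for (int i = 0; i < NR_NODES; i++) {
        int candidate = (node + i) % NR_NODES;
        if (free_pages[candidate] > 0) {
            free_pages[candidate]--;
            return candidate;
        }
    }
    return -1; /* no memory on any node */
}

/* Grow a cache on behalf of `node`. The slab records the requested node,
 * even when the page itself came from somewhere else. */
static int toy_cache_grow(struct toy_slab *slab, int node)
{
    int got = toy_alloc_page(node);
    if (got < 0)
        return -1;
    slab->nodeid = node;
    slab->page_node = got;
    return 0;
}
```

Growing a cache for node 0 in this model yields `nodeid == 0` but `page_node == 1`, so a free path that trusted only the page's node would hand the object to the wrong node's bookkeeping.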

Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;

2007-03-20 Thread Christoph Lameter
On Tue, 20 Mar 2007, Andi Kleen wrote:
> > > Is it possible virt_to_slab(objp)->nodeid being different from
> > > pfn_to_nid(objp) ?
> >
> > It is possible the page allocator falls back to another node than
> > requested. We would need to check that this never occurs.
>
> The only way to

Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;

2007-03-20 Thread Eric Dumazet
Andi Kleen wrote: Is it possible virt_to_slab(objp)->nodeid being different from pfn_to_nid(objp) ? It is possible the page allocator falls back to another node than requested. We would need to check that this never occurs. The only way to ensure that would be to set a strict mempolicy.

Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;

2007-03-20 Thread Andi Kleen
> > Is it possible virt_to_slab(objp)->nodeid being different from
> > pfn_to_nid(objp) ?
>
> It is possible the page allocator falls back to another node than
> requested. We would need to check that this never occurs.
The only way to ensure that would be to set a strict mempolicy. But I'm

Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;

2007-03-20 Thread Christoph Lameter
On Tue, 20 Mar 2007, Eric Dumazet wrote:
>
> I noticed on a small x86_64 NUMA setup (2 nodes) that cache_free_alien() is
> very expensive.
> This is because of a cache miss on struct slab.
> At the time an object is freed (call to kmem_cache_free() for example), the
> underlying 'struct slab'

[RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;

2007-03-20 Thread Eric Dumazet
Hi, I noticed on a small x86_64 NUMA setup (2 nodes) that cache_free_alien() is very expensive. This is because of a cache miss on struct slab. At the time an object is freed (call to kmem_cache_free() for example), the underlying 'struct slab' is no longer cache-hot. struct slab *slabp =
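[Editor's sketch] The cost being reported comes from a chain of dependent lookups on the free path, from the object to its page descriptor to its slab descriptor, just to read one node id. The user-space model below shows only that chain. All names (`toy_mem`, `toy_mem_map`, `toy_virt_to_slab`, `toy_is_alien`) are hypothetical, and the kernel's real layout differs.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_NR_PAGES   4

struct toy_slab { int nodeid; };            /* per-slab descriptor */
struct toy_page { struct toy_slab *slab; }; /* per-page descriptor */

/* Toy "physical memory" and its page descriptor table. */
static unsigned char toy_mem[TOY_NR_PAGES << TOY_PAGE_SHIFT];
static struct toy_page toy_mem_map[TOY_NR_PAGES];

/* Object address -> page descriptor (pure address arithmetic). */
static struct toy_page *toy_virt_to_page(const void *objp)
{
    uintptr_t off = (uintptr_t)objp - (uintptr_t)toy_mem;
    assert(off < sizeof(toy_mem));
    return &toy_mem_map[off >> TOY_PAGE_SHIFT];
}

/* Page descriptor -> slab descriptor: reads the page descriptor,
 * which is the first cache line touched on the free path. */
static struct toy_slab *toy_virt_to_slab(const void *objp)
{
    return toy_virt_to_page(objp)->slab;
}

/* Reads slabp->nodeid, the second cache line touched, and compares it
 * with the node the caller is running on. */
static int toy_is_alien(const void *objp, int this_node)
{
    struct toy_slab *slabp = toy_virt_to_slab(objp);
    return slabp->nodeid != this_node;
}
```

Both descriptors are likely cold by the time a long-lived object is freed, which is why the thread looks for a way to decide "local or alien" without dereferencing them.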
