Re: SLUB: The unqueued Slab allocator

2007-02-25 Thread Jörn Engel
On Sat, 24 February 2007 16:14:48 -0800, Christoph Lameter wrote:
> 
> It eliminates 50% of the slab caches. Thus it reduces the management 
> overhead by half.

How much management overhead is there left with SLUB?  Is it just the
one per-node slab?  Is there runtime overhead as well?

In a slightly different approach, can we possibly get rid of some slab
caches, instead of merging them at boot time?  On my system I have 97
slab caches right now, ignoring the generic kmalloc() ones.  Of those,
28 are completely empty, 23 contain <=10 objects, 23 contain <=100, and
23 contain >100 objects.
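
A small user-space reader of /proc/slabinfo is enough to reproduce these
buckets.  The following is a sketch; it assumes the 2.6-era column layout
of "name active_objs num_objs ..." and treats the size-* / kmalloc caches
as the generic ones to skip:

#include <stdio.h>
#include <string.h>

int main(void)
{
	char line[256], name[64];
	unsigned long active, total;
	int empty = 0, le10 = 0, le100 = 0, gt100 = 0;
	FILE *f = fopen("/proc/slabinfo", "r");

	if (!f)
		return 1;
	while (fgets(line, sizeof(line), f)) {
		/* skip the "slabinfo - version" and "#" header lines */
		if (line[0] == '#' || !strncmp(line, "slabinfo", 8))
			continue;
		if (sscanf(line, "%63s %lu %lu", name, &active, &total) != 3)
			continue;
		/* ignore the generic kmalloc() caches */
		if (!strncmp(name, "size-", 5) || !strncmp(name, "kmalloc", 7))
			continue;
		if (active == 0)
			empty++;
		else if (active <= 10)
			le10++;
		else if (active <= 100)
			le100++;
		else
			gt100++;
	}
	fclose(f);
	printf("empty %d, <=10 %d, <=100 %d, >100 %d\n",
	       empty, le10, le100, gt100);
	return 0;
}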

It is fairly obvious to me that the highly populated slab caches are a
big win.  But is it worth it to have slab caches with a single object
inside?  Maybe some of these caches are populated for some systems.
But there could also be candidates for removal among them.

# active_objs num_objs name
0 0 dm-crypt_io
0 0 dm_io
0 0 dm_tio
0 0 ext3_xattr
0 0 fat_cache
0 0 fat_inode_cache
0 0 flow_cache
0 0 inet_peer_cache
0 0 ip_conntrack_expect
0 0 ip_mrt_cache
0 0 isofs_inode_cache
0 0 jbd_1k
0 0 jbd_4k
0 0 kiocb
0 0 kioctx
0 0 nfs_inode_cache
0 0 nfs_page
0 0 posix_timers_cache
0 0 request_sock_TCP
0 0 revoke_record
0 0 rpc_inode_cache
0 0 scsi_io_context
0 0 secpath_cache
0 0 skbuff_fclone_cache
0 0 tw_sock_TCP
0 0 udf_inode_cache
0 0 uhci_urb_priv
0 0 xfrm_dst_cache
1 169 dnotify_cache
1 30 arp_cache
1 7 mqueue_inode_cache
2 101 eventpoll_pwq
2 203 fasync_cache
2 254 revoke_table
2 30 eventpoll_epi
2 9 RAW
4 17 ip_conntrack
7 10 biovec-128
7 10 biovec-64
7 20 biovec-16
7 42 file_lock_cache
7 59 biovec-4
7 59 uid_cache
7 8 biovec-256
7 9 bdev_cache
8 127 inotify_event_cache
8 20 rpc_tasks
8 8 rpc_buffers
10 113 ip_fib_alias
10 113 ip_fib_hash
10 12 blkdev_queue
11 203 biovec-1
11 22 blkdev_requests
13 92 inotify_watch_cache
16 169 journal_handle
16 203 tcp_bind_bucket
16 72 journal_head
18 18 UDP
19 19 names_cache
19 28 TCP
22 30 mnt_cache
27 27 sigqueue
27 60 ip_dst_cache
32 32 sgpool-128
32 32 sgpool-32
32 32 sgpool-64
32 36 nfs_read_data
32 45 sgpool-16
32 60 sgpool-8
36 42 nfs_write_data
72 80 cfq_pool
74 127 blkdev_ioc
78 92 cfq_ioc_pool
94 94 pgd
107 113 fs_cache
108 108 mm_struct
108 140 files_cache
123 123 sighand_cache
125 140 UNIX
130 130 signal_cache
147 147 task_struct
154 174 idr_layer_cache
158 404 pid
190 190 sock_inode_cache
260 295 bio
273 273 proc_inode_cache
840 920 skbuff_head_cache
1234 1326 inode_cache
1507 1510 shmem_inode_cache
2871 3051 anon_vma
2910 3360 filp
5161 5292 sysfs_dir_cache
5762 6164 vm_area_struct
12056 19446 radix_tree_node
65776 151272 buffer_head
578304 578304 ext3_inode_cache
677490 677490 dentry_cache

Jörn

-- 
And spam is a useful source of entropy for /dev/random too!
-- Jasmine Strong


Re: SLUB: The unqueued Slab allocator

2007-02-24 Thread David Miller
From: Christoph Lameter <[EMAIL PROTECTED]>
Date: Sat, 24 Feb 2007 09:32:49 -0800 (PST)

> On Fri, 23 Feb 2007, David Miller wrote:
> 
> > I also agree with Andi in that merging could mess up how object type
> > local lifetimes help reduce fragmentation in object pools.
> 
> If that is a problem for particular object pools then we may be able to 
> except those from the merging.

If it is a problem, it's going to be a problem "in general"
and not for specific SLAB caches.

I think this is really a very unwise idea.  We have enough
fragmentation problems as it is.


Re: SLUB: The unqueued Slab allocator

2007-02-24 Thread Christoph Lameter
On Sat, 24 Feb 2007, Jörn Engel wrote:

> How much of a gain is the merging anyway?  Once you start having
> explicit whitelists or blacklists of pools that can be merged, one can
> start to wonder if the result is worth the effort.

It eliminates 50% of the slab caches. Thus it reduces the management 
overhead by half.



Re: SLUB: The unqueued Slab allocator

2007-02-24 Thread Jörn Engel
On Sat, 24 February 2007 09:32:49 -0800, Christoph Lameter wrote:
> 
> If that is a problem for particular object pools then we may be able to 
> except those from the merging.

How much of a gain is the merging anyway?  Once you start having
explicit whitelists or blacklists of pools that can be merged, one can
start to wonder if the result is worth the effort.

Jörn

-- 
Joern's library part 6:
http://www.gzip.org/zlib/feldspar.html


Re: SLUB: The unqueued Slab allocator

2007-02-24 Thread Christoph Lameter
On Fri, 23 Feb 2007, David Miller wrote:

> > The general caches already merge lots of users depending on their sizes. 
> > So we already have the situation and we have tools to deal with it.
> 
> But this doesn't happen for things like biovecs, and that will
> make debugging painful.
> 
> If a crash happens because of a corrupted biovec-256 I want to know
> it was a biovec not some anonymous clone of kmalloc256.
> 
> Please provide at a minimum a way to turn the merging off.

Ok. It's currently a compile-time option. Will make it possible to specify 
a boot option.
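
Something like the kernel's standard __setup() hook would do for the boot
option; a sketch, where the flag name "slub_nomerge" is an assumption and
not taken from the posted patch:

/* Boot option to disable slab cache merging (hypothetical name). */
static int slub_nomerge;

static int __init setup_slub_nomerge(char *str)
{
	slub_nomerge = 1;	/* checked before merging any new cache */
	return 1;
}
__setup("slub_nomerge", setup_slub_nomerge);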
 
> I also agree with Andi in that merging could mess up how object type
> local lifetimes help reduce fragmentation in object pools.

If that is a problem for particular object pools then we may be able to 
except those from the merging.



Re: SLUB: The unqueued Slab allocator

2007-02-23 Thread David Miller
From: Christoph Lameter <[EMAIL PROTECTED]>
Date: Fri, 23 Feb 2007 21:47:36 -0800 (PST)

> On Sat, 24 Feb 2007, KAMEZAWA Hiroyuki wrote:
> 
> > From a viewpoint of a crash dump user, this merging will make crash dump
> > investigation very very very difficult.
> 
> The general caches already merge lots of users depending on their sizes. 
> So we already have the situation and we have tools to deal with it.

But this doesn't happen for things like biovecs, and that will
make debugging painful.

If a crash happens because of a corrupted biovec-256 I want to know
it was a biovec not some anonymous clone of kmalloc256.

Please provide at a minimum a way to turn the merging off.

I also agree with Andi in that merging could mess up how object type
local lifetimes help reduce fragmentation in object pools.



Re: SLUB: The unqueued Slab allocator

2007-02-23 Thread Christoph Lameter
On Sat, 24 Feb 2007, KAMEZAWA Hiroyuki wrote:

> From a viewpoint of a crash dump user, this merging will make crash dump
> investigation very very very difficult.

The general caches already merge lots of users depending on their sizes. 
So we already have the situation and we have tools to deal with it.
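
As an illustration, two unrelated structures of similar size already come
out of the same general cache today; the types here are made up:

#include <linux/slab.h>

struct foo { int a; char buf[40]; };	/* 44 bytes: rounds up to size-64 */
struct bar { long x[7]; };		/* 56 bytes on 64-bit: same cache */

void example(void)
{
	struct foo *f = kmalloc(sizeof(*f), GFP_KERNEL);
	struct bar *b = kmalloc(sizeof(*b), GFP_KERNEL);

	/* f and b live in the same size-64 slab, so a corruption there is
	 * already anonymous with respect to the object's type. */
	kfree(f);
	kfree(b);
}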


Re: SLUB: The unqueued Slab allocator

2007-02-23 Thread KAMEZAWA Hiroyuki
On Thu, 22 Feb 2007 10:42:23 -0800 (PST)
Christoph Lameter <[EMAIL PROTECTED]> wrote:

> > > G. Slab merging
> > > 
> > >We often have slab caches with similar parameters. SLUB detects those
> > >on bootup and merges them into the corresponding general caches. This
> > >leads to more effective memory use.
> > 
> > Did you do any tests on what that does to long term memory fragmentation?
> > It is against the "objects of the same type have similar lifetimes and
> > should be clustered together" theory at least.
> 
> I have done no tests in that regard and we would have to assess the impact 
> that the merging has on overall system behavior.
> 
From a viewpoint of a crash dump user, this merging will make crash dump
investigation very very very difficult.
So please avoid this merging if the benefit is not big.

-Kame



Re: SLUB: The unqueued Slab allocator

2007-02-22 Thread Christoph Lameter
On Fri, 23 Feb 2007, Andi Kleen wrote:

> If you don't cache constructed but free objects then there is no cache
> advantage of constructors/destructors and they would be useless.

SLUB caches those objects as long as they are part of a partially 
allocated slab. If all objects in the slab are freed then the whole slab 
will be freed. SLUB does not keep queues of freed slabs.
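
In pseudo-C, the free path described here looks roughly as follows; the
helper and field names are illustrative, not the posted patch:

static void slab_free_sketch(struct kmem_cache *s, struct page *page,
				void *object)
{
	slab_lock(page);
	push_free_object(page, object);	/* object keeps its constructed state */
	page->inuse--;
	if (page->inuse == 0) {
		/* Last object freed: the whole slab goes back to the page
		 * allocator.  No queue of empty slabs is kept. */
		remove_partial(s, page);
		slab_unlock(page);
		discard_slab(s, page);
	} else {
		if (page->inuse == s->objects - 1)
			add_partial(s, page);	/* was full, has room again */
		slab_unlock(page);
	}
}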



Re: SLUB: The unqueued Slab allocator

2007-02-22 Thread Andi Kleen
On Thu, Feb 22, 2007 at 10:42:23AM -0800, Christoph Lameter wrote:
> On Thu, 22 Feb 2007, Andi Kleen wrote:
> 
> > >SLUB does not need a cache reaper for UP systems.
> > 
> > This means constructors/destructors are becoming worthless? 
> > Can you describe your rationale why you think they don't make
> > sense on UP?
> 
> Cache reaping has nothing to do with constructors and destructors. SLUB 
> fully supports constructors and destructors.

If you don't cache constructed but free objects then there is no cache
advantage of constructors/destructors and they would be useless.

-Andi


Re: SLUB: The unqueued Slab allocator

2007-02-22 Thread Christoph Lameter
On Thu, 22 Feb 2007, Andi Kleen wrote:

> >SLUB does not need a cache reaper for UP systems.
> 
> This means constructors/destructors are becoming worthless? 
> Can you describe your rationale why you think they don't make
> sense on UP?

Cache reaping has nothing to do with constructors and destructors. SLUB 
fully supports constructors and destructors.
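
For reference, a typical constructor user under the 2.6.20-era API; the
struct and cache name are made up for illustration:

#include <linux/init.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/list.h>

struct my_obj {
	spinlock_t lock;
	struct list_head list;
};

/* Runs when a slab page is populated with objects, not on every
 * kmem_cache_alloc(); objects sitting free in a partial slab keep
 * their constructed state. */
static void my_obj_ctor(void *p, struct kmem_cache *cachep,
			unsigned long flags)
{
	struct my_obj *obj = p;

	spin_lock_init(&obj->lock);
	INIT_LIST_HEAD(&obj->list);
}

static struct kmem_cache *my_cache;

static int __init my_init(void)
{
	my_cache = kmem_cache_create("my_obj_cache", sizeof(struct my_obj),
					0, 0, my_obj_ctor, NULL);
	return my_cache ? 0 : -ENOMEM;
}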

> > G. Slab merging
> > 
> >We often have slab caches with similar parameters. SLUB detects those
> >on bootup and merges them into the corresponding general caches. This
> >leads to more effective memory use.
> 
> Did you do any tests on what that does to long term memory fragmentation?
> It is against the "objects of the same type have similar lifetimes and
> should be clustered together" theory at least.

I have done no tests in that regard and we would have to assess the impact 
that the merging has on overall system behavior.



Re: SLUB: The unqueued Slab allocator

2007-02-22 Thread Andi Kleen
Christoph Lameter <[EMAIL PROTECTED]> writes:

> This is a new slab allocator which was motivated by the complexity of the
> existing code in mm/slab.c. It attempts to address a variety of concerns
> with the existing implementation.

Thanks for doing that work. It certainly was long overdue.

> D. SLAB has a complex cache reaper
> 
>SLUB does not need a cache reaper for UP systems.

This means constructors/destructors are becoming worthless? 
Can you describe your rationale why you think they don't make
sense on UP?

> G. Slab merging
> 
>We often have slab caches with similar parameters. SLUB detects those
>on bootup and merges them into the corresponding general caches. This
>leads to more effective memory use.

Did you do any tests on what that does to long term memory fragmentation?
It is against the "objects of the same type have similar lifetimes and
should be clustered together" theory at least.

-Andi


Re: SLUB: The unqueued Slab allocator

2007-02-22 Thread Christoph Lameter
On Thu, 22 Feb 2007, David Miller wrote:

> All of that logic needs to be protected by CONFIG_ZONE_DMA too.

Right. Will fix that in the next release.



Re: SLUB: The unqueued Slab allocator

2007-02-22 Thread Christoph Lameter
On Thu, 22 Feb 2007, Peter Zijlstra wrote:

> On Wed, 2007-02-21 at 23:00 -0800, Christoph Lameter wrote:
> 
> > +/*
> > + * Lock order:
> > + *   1. slab_lock(page)
> > + *   2. slab->list_lock
> > + *
> 
> That seems to contradict this:

This is a trylock. If it fails then we can compensate by allocating
a new slab.
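
Compressed, the pattern in the quoted get_partial() is (a paraphrase of
the code above, not new code):

	struct page *page, *found = NULL;

	spin_lock(&s->list_lock);		/* "2." is taken first here... */
	list_for_each_entry(page, &s->partial, lru) {
		if (slab_trylock(page)) {	/* ..."1." is only tried, never
						 * blocked on, so the documented
						 * order cannot deadlock */
			list_del(&page->lru);
			s->nr_partial--;
			found = page;
			break;
		}
	}
	spin_unlock(&s->list_lock);
	/* found == NULL: all partial slabs were busy or the list was empty;
	 * the caller falls back to allocating a new slab */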


Re: SLUB: The unqueued Slab allocator

2007-02-22 Thread Christoph Lameter
On Thu, 22 Feb 2007, Pekka Enberg wrote:

> On 2/22/07, Christoph Lameter <[EMAIL PROTECTED]> wrote:
> > This is a new slab allocator which was motivated by the complexity of the
> > existing code in mm/slab.c. It attempts to address a variety of concerns
> > with the existing implementation.
> 
> So do you want to add a new allocator or replace slab?

Add. The performance and quality are not comparable to SLAB at this point.

> On 2/22/07, Christoph Lameter <[EMAIL PROTECTED]> wrote:
> > B. Storage overhead of object queues
> 
> Does this make sense for non-NUMA too? If not, can we disable the
> queues for NUMA in current slab?

Given the locking scheme in the current slab you cannot do that. Otherwise
there will be a single lock taken for every operation, limiting performance.

> On 2/22/07, Christoph Lameter <[EMAIL PROTECTED]> wrote:
> > C. SLAB metadata overhead
> 
> Can be done for the current slab code too, no?

The per slab metadata of the SLAB does not fit into the page_struct. 
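
Conceptually the SLUB side fits because the per-slab state is overlaid on
fields struct page already has; an abridged view (the real patch uses
unions with existing struct page members):

struct page_slab_view {
	void *freelist;			/* first free object in this slab */
	unsigned short inuse;		/* allocated objects in this slab */
	unsigned short offset;		/* free-pointer offset inside an object */
	struct kmem_cache *slab;	/* owning cache */
};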


Re: SLUB: The unqueued Slab allocator

2007-02-22 Thread Pekka Enberg

Hi Christoph,

On 2/22/07, Christoph Lameter <[EMAIL PROTECTED]> wrote:
> This is a new slab allocator which was motivated by the complexity of the
> existing code in mm/slab.c. It attempts to address a variety of concerns
> with the existing implementation.

So do you want to add a new allocator or replace slab?

On 2/22/07, Christoph Lameter <[EMAIL PROTECTED]> wrote:
> B. Storage overhead of object queues

Does this make sense for non-NUMA too? If not, can we disable the
queues for NUMA in current slab?

On 2/22/07, Christoph Lameter <[EMAIL PROTECTED]> wrote:
> C. SLAB metadata overhead

Can be done for the current slab code too, no?

Pekka


Re: SLUB: The unqueued Slab allocator

2007-02-22 Thread David Miller
From: Christoph Lameter <[EMAIL PROTECTED]>
Date: Wed, 21 Feb 2007 23:00:30 -0800 (PST)

> +#ifdef CONFIG_ZONE_DMA
> +static struct kmem_cache *kmalloc_caches_dma[KMALLOC_NR_CACHES];
> +#endif

Therefore.

> +static struct kmem_cache *get_slab(size_t size, gfp_t flags)
> +{
 ...
> + s = kmalloc_caches_dma[index];
> + if (s)
> + return s;
> +
> + /* Dynamically create dma cache */
> + x = kmalloc(sizeof(struct kmem_cache), flags & ~(__GFP_DMA));
> +
> + if (!x)
> + panic("Unable to allocate memory for dma cache\n");
> +
> +#ifdef KMALLOC_EXTRA
> + if (index <= KMALLOC_SHIFT_HIGH - KMALLOC_SHIFT_LOW)
> +#endif
> + realsize = 1 << index;
> +#ifdef KMALLOC_EXTRA
> + else if (index == KMALLOC_EXTRAS)
> + realsize = 96;
> + else
> + realsize = 192;
> +#endif
> +
> + s = create_kmalloc_cache(x, "kmalloc_dma", realsize);
> + kmalloc_caches_dma[index] = s;
> + return s;
> +}

All of that logic needs to be protected by CONFIG_ZONE_DMA too.

I noticed this due to a build failure on sparc64 with this patch.
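
A sketch of the requested guard; kmalloc_index() and dma_kmalloc_cache()
stand in for the index computation and the creation logic quoted above:

static struct kmem_cache *get_slab(size_t size, gfp_t flags)
{
	int index = kmalloc_index(size);

	if (!(flags & __GFP_DMA))
		return kmalloc_caches[index];
#ifdef CONFIG_ZONE_DMA
	return dma_kmalloc_cache(index, flags);
#else
	return NULL;	/* no DMA zone: __GFP_DMA cannot be honored */
#endif
}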


Re: SLUB: The unqueued Slab allocator

2007-02-22 Thread Peter Zijlstra
On Wed, 2007-02-21 at 23:00 -0800, Christoph Lameter wrote:

> +/*
> + * Lock order:
> + *   1. slab_lock(page)
> + *   2. slab->list_lock
> + *

That seems to contradict this:

> +/*
> + * Lock page and remove it from the partial list
> + *
> + * Must hold list_lock
> + */
> +static __always_inline int lock_and_del_slab(struct kmem_cache *s,
> + struct page *page)
> +{
> + if (slab_trylock(page)) {
> + list_del(&page->lru);
> + s->nr_partial--;
> + return 1;
> + }
> + return 0;
> +}
> +
> +/*
> + * Get a partial page, lock it and return it.
> + */
> +#ifdef CONFIG_NUMA
> +static struct page *get_partial(struct kmem_cache *s, gfp_t flags, int node)
> +{
> + struct page *page;
> + int searchnode = (node == -1) ? numa_node_id() : node;
> +
> + if (!s->nr_partial)
> + return NULL;
> +
> + spin_lock(&s->list_lock);
> + /*
> +  * Search for slab on the right node
> +  */
> + list_for_each_entry(page, &s->partial, lru)
> + if (likely(page_to_nid(page) == searchnode) &&
> + lock_and_del_slab(s, page))
> + goto out;
> +
> + if (likely(!(flags & __GFP_THISNODE))) {
> + /*
> +  * We can fall back to any other node in order to
> +  * reduce the size of the partial list.
> +  */
> + list_for_each_entry(page, &s->partial, lru)
> + if (likely(lock_and_del_slab(s, page)))
> + goto out;
> + }
> +
> + /* Nothing found */
> + page = NULL;
> +out:
> + spin_unlock(&s->list_lock);
> + return page;
> +}
> +#else
> +static struct page *get_partial(struct kmem_cache *s, gfp_t flags, int node)
> +{
> + struct page *page;
> +
> + /*
> +  * Racy check. If we mistakenly see no partial slabs then we
> +  * just allocate an empty slab.
> +  */
> + if (!s->nr_partial)
> + return NULL;
> +
> + spin_lock(&s->list_lock);
> + list_for_each_entry(page, &s->partial, lru)
> + if (likely(lock_and_del_slab(s, page)))
> + goto out;
> +
> + /* No slab or all slabs busy */
> + page = NULL;
> +out:
> + spin_unlock(&s->list_lock);
> + return page;
> +}
> +#endif
