Re: SLUB: The unqueued Slab allocator
On Sat, 24 February 2007 16:14:48 -0800, Christoph Lameter wrote:
>
> It eliminates 50% of the slab caches. Thus it reduces the management
> overhead by half.

How much management overhead is there left with SLUB? Is it just the
one per-node slab? Is there runtime overhead as well?

In a slightly different approach, can we possibly get rid of some slab
caches, instead of merging them at boot time? On my system I have 97
slab caches right now, ignoring the generic kmalloc() ones. Of those,
28 are completely empty, 23 contain <=10 objects, 23 contain <=100,
and 23 contain >100 objects.

It is fairly obvious to me that the highly populated slab caches are a
big win. But is it worth it to have slab caches with a single object
inside? Maybe some of these caches are populated for some systems, but
there could also be candidates for removal among them.

# <active_objs> <num_objs> name
     0      0  dm-crypt_io
     0      0  dm_io
     0      0  dm_tio
     0      0  ext3_xattr
     0      0  fat_cache
     0      0  fat_inode_cache
     0      0  flow_cache
     0      0  inet_peer_cache
     0      0  ip_conntrack_expect
     0      0  ip_mrt_cache
     0      0  isofs_inode_cache
     0      0  jbd_1k
     0      0  jbd_4k
     0      0  kiocb
     0      0  kioctx
     0      0  nfs_inode_cache
     0      0  nfs_page
     0      0  posix_timers_cache
     0      0  request_sock_TCP
     0      0  revoke_record
     0      0  rpc_inode_cache
     0      0  scsi_io_context
     0      0  secpath_cache
     0      0  skbuff_fclone_cache
     0      0  tw_sock_TCP
     0      0  udf_inode_cache
     0      0  uhci_urb_priv
     0      0  xfrm_dst_cache
     1    169  dnotify_cache
     1     30  arp_cache
     1      7  mqueue_inode_cache
     2    101  eventpoll_pwq
     2    203  fasync_cache
     2    254  revoke_table
     2     30  eventpoll_epi
     2      9  RAW
     4     17  ip_conntrack
     7     10  biovec-128
     7     10  biovec-64
     7     20  biovec-16
     7     42  file_lock_cache
     7     59  biovec-4
     7     59  uid_cache
     7      8  biovec-256
     7      9  bdev_cache
     8    127  inotify_event_cache
     8     20  rpc_tasks
     8      8  rpc_buffers
    10    113  ip_fib_alias
    10    113  ip_fib_hash
    10     12  blkdev_queue
    11    203  biovec-1
    11     22  blkdev_requests
    13     92  inotify_watch_cache
    16    169  journal_handle
    16    203  tcp_bind_bucket
    16     72  journal_head
    18     18  UDP
    19     19  names_cache
    19     28  TCP
    22     30  mnt_cache
    27     27  sigqueue
    27     60  ip_dst_cache
    32     32  sgpool-128
    32     32  sgpool-32
    32     32  sgpool-64
    32     36  nfs_read_data
    32     45  sgpool-16
    32     60  sgpool-8
    36     42  nfs_write_data
    72     80  cfq_pool
    74    127  blkdev_ioc
    78     92  cfq_ioc_pool
    94     94  pgd
   107    113  fs_cache
   108    108  mm_struct
   108    140  files_cache
   123    123  sighand_cache
   125    140  UNIX
   130    130  signal_cache
   147    147  task_struct
   154    174  idr_layer_cache
   158    404  pid
   190    190  sock_inode_cache
   260    295  bio
   273    273  proc_inode_cache
   840    920  skbuff_head_cache
  1234   1326  inode_cache
  1507   1510  shmem_inode_cache
  2871   3051  anon_vma
  2910   3360  filp
  5161   5292  sysfs_dir_cache
  5762   6164  vm_area_struct
 12056  19446  radix_tree_node
 65776 151272  buffer_head
578304 578304  ext3_inode_cache
677490 677490  dentry_cache

Jörn

-- 
And spam is a useful source of entropy for /dev/random too!
-- Jasmine Strong
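For reference, a breakdown like the one above can be produced from
/proc/slabinfo with a short user-space program. A minimal sketch, assuming
the 2.6-era slabinfo layout (cache name, active_objs and num_objs as the
first three fields) and skipping the generic "size-*" kmalloc caches:

#include <stdio.h>
#include <string.h>

int main(void)
{
	FILE *f = fopen("/proc/slabinfo", "r");
	char line[512], name[64];
	unsigned long active, num;
	int empty = 0, upto10 = 0, upto100 = 0, over100 = 0;

	if (!f) {
		perror("/proc/slabinfo");
		return 1;
	}
	while (fgets(line, sizeof(line), f)) {
		/* Skip the version line and the column header */
		if (line[0] == '#' || !strncmp(line, "slabinfo", 8))
			continue;
		if (sscanf(line, "%63s %lu %lu", name, &active, &num) != 3)
			continue;
		/* Ignore the generic kmalloc() caches */
		if (!strncmp(name, "size-", 5))
			continue;
		if (active == 0)
			empty++;
		else if (active <= 10)
			upto10++;
		else if (active <= 100)
			upto100++;
		else
			over100++;
	}
	fclose(f);
	printf("empty: %d  <=10: %d  <=100: %d  >100: %d\n",
		empty, upto10, upto100, over100);
	return 0;
}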
Re: SLUB: The unqueued Slab allocator
From: Christoph Lameter <[EMAIL PROTECTED]>
Date: Sat, 24 Feb 2007 09:32:49 -0800 (PST)

> On Fri, 23 Feb 2007, David Miller wrote:
>
> > I also agree with Andi in that merging could mess up how object type
> > local lifetimes help reduce fragmentation in object pools.
>
> If that is a problem for particular object pools then we may be able to
> except those from the merging.

If it is a problem, it's going to be a problem "in general" and not
for specific SLAB caches.

I think this is really a very unwise idea. We have enough
fragmentation problems as it is.
Re: SLUB: The unqueued Slab allocator
On Sat, 24 Feb 2007, Jörn Engel wrote:

> How much of a gain is the merging anyway? Once you start having
> explicit whitelists or blacklists of pools that can be merged, one can
> start to wonder if the result is worth the effort.

It eliminates 50% of the slab caches. Thus it reduces the management
overhead by half.
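A merge decision of this kind comes down to a handful of parameter
comparisons. The sketch below is illustrative only (the helper name is
invented and the field names roughly follow the struct kmem_cache of
this era); the posted patch's exact criteria may differ:

/*
 * Two caches can share slabs if neither needs per-object construction,
 * neither differs in placement or debugging constraints, and their
 * aligned object sizes fall into the same size class.
 */
static int caches_mergeable(struct kmem_cache *a, struct kmem_cache *b)
{
	if (a->ctor || b->ctor)
		return 0;	/* constructed objects must stay separate */

	if ((a->flags ^ b->flags) &
			(SLAB_CACHE_DMA | SLAB_POISON | SLAB_RED_ZONE))
		return 0;	/* incompatible DMA/debug requirements */

	return ALIGN(a->objsize, a->align) == ALIGN(b->objsize, b->align);
}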
Re: SLUB: The unqueued Slab allocator
On Sat, 24 February 2007 09:32:49 -0800, Christoph Lameter wrote:
>
> If that is a problem for particular object pools then we may be able to
> except those from the merging.

How much of a gain is the merging anyway? Once you start having
explicit whitelists or blacklists of pools that can be merged, one can
start to wonder if the result is worth the effort.

Jörn

-- 
Joern's library part 6:
http://www.gzip.org/zlib/feldspar.html
Re: SLUB: The unqueued Slab allocator
On Fri, 23 Feb 2007, David Miller wrote:

> > The general caches already merge lots of users depending on their sizes.
> > So we already have the situation and we have tools to deal with it.
>
> But this doesn't happen for things like biovecs, and that will
> make debugging painful.
>
> If a crash happens because of a corrupted biovec-256 I want to know
> it was a biovec, not some anonymous clone of kmalloc-256.
>
> Please provide at a minimum a way to turn the merging off.

Ok. It is currently a compile-time option. I will make it possible to
specify a boot option.

> I also agree with Andi in that merging could mess up how object type
> local lifetimes help reduce fragmentation in object pools.

If that is a problem for particular object pools then we may be able to
except those from the merging.
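Such a boot option can be wired up with the kernel's __setup() mechanism.
A minimal sketch; the option name slub_nomerge and the helper it gates are
chosen here for illustration, not taken from a posted patch:

static int slub_nomerge;

static int __init setup_slub_nomerge(char *str)
{
	slub_nomerge = 1;
	return 1;
}
__setup("slub_nomerge", setup_slub_nomerge);

/*
 * The cache-creation path would then consult the flag, e.g.:
 *
 *	if (!slub_nomerge)
 *		s = find_mergeable_cache(size, align, flags, ctor);
 */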
Re: SLUB: The unqueued Slab allocator
From: Christoph Lameter <[EMAIL PROTECTED]>
Date: Fri, 23 Feb 2007 21:47:36 -0800 (PST)

> On Sat, 24 Feb 2007, KAMEZAWA Hiroyuki wrote:
>
> > From a viewpoint of a crash dump user, this merging will make crash dump
> > investigation very, very difficult.
>
> The general caches already merge lots of users depending on their sizes.
> So we already have the situation and we have tools to deal with it.

But this doesn't happen for things like biovecs, and that will
make debugging painful.

If a crash happens because of a corrupted biovec-256 I want to know
it was a biovec, not some anonymous clone of kmalloc-256.

Please provide at a minimum a way to turn the merging off.

I also agree with Andi in that merging could mess up how object type
local lifetimes help reduce fragmentation in object pools.
Re: SLUB: The unqueued Slab allocator
On Sat, 24 Feb 2007, KAMEZAWA Hiroyuki wrote:

> From a viewpoint of a crash dump user, this merging will make crash dump
> investigation very, very difficult.

The general caches already merge lots of users depending on their sizes.
So we already have the situation and we have tools to deal with it.
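For context, the general caches merge users purely by size: a kmalloc()
request is rounded up to the next size class, so objects from unrelated
subsystems already share slabs today. A rough sketch of that rounding,
ignoring the extra 96- and 192-byte caches:

/*
 * Map an allocation size to a power-of-two general cache. A 200-byte
 * struct from one subsystem and a 240-byte buffer from another both
 * land in the same 256-byte cache.
 */
static int kmalloc_index(size_t size)
{
	int i;

	for (i = 5; i <= 17; i++)	/* 32 bytes up to 128KB */
		if (size <= (1UL << i))
			return i;
	return -1;	/* too large: handled by the page allocator */
}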
Re: SLUB: The unqueued Slab allocator
On Thu, 22 Feb 2007 10:42:23 -0800 (PST)
Christoph Lameter <[EMAIL PROTECTED]> wrote:

> > > G. Slab merging
> > >
> > >    We often have slab caches with similar parameters. SLUB detects those
> > >    on bootup and merges them into the corresponding general caches. This
> > >    leads to more effective memory use.
> >
> > Did you do any tests on what that does to long term memory fragmentation?
> > It is against the "objects of the same type have similar lifetimes and
> > should be clustered together" theory at least.
>
> I have done no tests in that regard and we would have to assess the
> impact that the merging has on overall system behavior.

From a viewpoint of a crash dump user, this merging will make crash dump
investigation very, very difficult. So please avoid this merging if the
benefit is not big.

-Kame
Re: SLUB: The unqueued Slab allocator
On Fri, 23 Feb 2007, Andi Kleen wrote:

> If you don't cache constructed but free objects then there is no cache
> advantage of constructors/destructors and they would be useless.

SLUB caches those objects as long as they are part of a partially
allocated slab. If all objects in the slab are freed then the whole
slab will be freed. SLUB does not keep queues of freed slabs.
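The free path that follows from this is short. A sketch of the logic as
described above, with invented helper names (the patch's own functions
differ in detail, and the sketch assumes more than one object per slab):

static void slab_free_sketch(struct kmem_cache *s, struct page *page,
				void *object)
{
	slab_lock(page);
	put_object_on_freelist(page, object);	/* invented helper */
	page->inuse--;

	if (page->inuse == 0) {
		/* Last object gone: free the whole slab immediately */
		remove_partial(s, page);	/* invented helper */
		slab_unlock(page);
		discard_slab(s, page);		/* hand the pages back */
		return;
	}
	if (page->inuse == s->objects - 1) {
		/* Slab was full, now has one free object: make it partial */
		add_partial(s, page);		/* invented helper */
	}
	slab_unlock(page);
}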
Re: SLUB: The unqueued Slab allocator
On Thu, Feb 22, 2007 at 10:42:23AM -0800, Christoph Lameter wrote:
> On Thu, 22 Feb 2007, Andi Kleen wrote:
>
> > > SLUB does not need a cache reaper for UP systems.
> >
> > This means constructors/destructors are becoming worthless?
> > Can you describe your rationale why you think they don't make
> > sense on UP?
>
> Cache reaping has nothing to do with constructors and destructors. SLUB
> fully supports constructors and destructors.

If you don't cache constructed but free objects then there is no cache
advantage of constructors/destructors and they would be useless.

-Andi
Re: SLUB: The unqueued Slab allocator
On Thu, 22 Feb 2007, Andi Kleen wrote:

> > SLUB does not need a cache reaper for UP systems.
>
> This means constructors/destructors are becoming worthless?
> Can you describe your rationale why you think they don't make
> sense on UP?

Cache reaping has nothing to do with constructors and destructors. SLUB
fully supports constructors and destructors.

> > G. Slab merging
> >
> >    We often have slab caches with similar parameters. SLUB detects those
> >    on bootup and merges them into the corresponding general caches. This
> >    leads to more effective memory use.
>
> Did you do any tests on what that does to long term memory fragmentation?
> It is against the "objects of the same type have similar lifetimes and
> should be clustered together" theory at least.

I have done no tests in that regard and we would have to assess the
impact that the merging has on overall system behavior.
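For reference, a constructor is attached when a cache is created. A small
example against the kmem_cache_create() signature of this era (ctor and
dtor each take the object, the cache, and a flags word); struct my_obj is
made up for illustration:

#include <linux/init.h>
#include <linux/list.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

struct my_obj {
	spinlock_t lock;
	struct list_head list;
};

/* Runs when a slab page is populated, not on every allocation */
static void my_obj_ctor(void *obj, struct kmem_cache *cache,
			unsigned long flags)
{
	struct my_obj *o = obj;

	spin_lock_init(&o->lock);
	INIT_LIST_HEAD(&o->list);
}

static struct kmem_cache *my_cache;

static int __init my_init(void)
{
	my_cache = kmem_cache_create("my_obj", sizeof(struct my_obj),
					0, 0, my_obj_ctor, NULL);
	return my_cache ? 0 : -ENOMEM;
}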
Re: SLUB: The unqueued Slab allocator
Christoph Lameter <[EMAIL PROTECTED]> writes:

> This is a new slab allocator which was motivated by the complexity of
> the existing code in mm/slab.c. It attempts to address a variety of
> concerns with the existing implementation.

Thanks for doing that work. It certainly was long overdue.

> D. SLAB has a complex cache reaper
>
>    SLUB does not need a cache reaper for UP systems.

This means constructors/destructors are becoming worthless?
Can you describe your rationale why you think they don't make
sense on UP?

> G. Slab merging
>
>    We often have slab caches with similar parameters. SLUB detects those
>    on bootup and merges them into the corresponding general caches. This
>    leads to more effective memory use.

Did you do any tests on what that does to long term memory fragmentation?
It is against the "objects of the same type have similar lifetimes and
should be clustered together" theory at least.

-Andi
Re: SLUB: The unqueued Slab allocator
On Thu, 22 Feb 2007, David Miller wrote:

> All of that logic needs to be protected by CONFIG_ZONE_DMA too.

Right. Will fix that in the next release.
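The shape of that fix is mechanical: compile the DMA-cache path out
entirely when no DMA zone exists. A sketch, not the actual follow-up
patch; create_dma_cache() stands in for the creation logic quoted
upthread, and kmalloc_caches[] is assumed here to be an array of
pointers:

static struct kmem_cache *get_slab(size_t size, gfp_t flags)
{
	int index = kmalloc_index(size);	/* size class lookup */

#ifdef CONFIG_ZONE_DMA
	if (flags & __GFP_DMA) {
		struct kmem_cache *s = kmalloc_caches_dma[index];

		if (!s) {
			/* Dynamically create the dma cache, as in the patch */
			s = create_dma_cache(index);
			kmalloc_caches_dma[index] = s;
		}
		return s;
	}
#endif
	return kmalloc_caches[index];
}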
Re: SLUB: The unqueued Slab allocator
On Thu, 22 Feb 2007, Peter Zijlstra wrote:

> On Wed, 2007-02-21 at 23:00 -0800, Christoph Lameter wrote:
>
> > +/*
> > + * Lock order:
> > + *   1. slab_lock(page)
> > + *   2. slab->list_lock
> > + */
>
> That seems to contradict this:

This is a trylock. If it fails then we can compensate by allocating a
new slab.
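In other words, the inner slab_lock() is only ever attempted with a
trylock while list_lock is held, so the inverted order cannot block; if
every partial slab is locked by someone else, the allocator falls back
to a fresh slab. The slow path then has roughly this shape (a sketch
following the quoted code; get_partial() and new_slab() are the patch's
names, the wrapper name is invented):

static struct page *get_slab_page(struct kmem_cache *s, gfp_t flags,
					int node)
{
	struct page *page;

	/* Takes list_lock and uses slab_trylock() on each candidate */
	page = get_partial(s, flags, node);
	if (page)
		return page;	/* returned locked */

	/* All partial slabs busy (or none): compensate with a new slab */
	return new_slab(s, flags, node);
}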
Re: SLUB: The unqueued Slab allocator
On Thu, 22 Feb 2007, Pekka Enberg wrote:

> On 2/22/07, Christoph Lameter <[EMAIL PROTECTED]> wrote:
> > This is a new slab allocator which was motivated by the complexity of the
> > existing code in mm/slab.c. It attempts to address a variety of concerns
> > with the existing implementation.
>
> So do you want to add a new allocator or replace slab?

Add. The performance and quality is not comparable to SLAB at this
point.

> On 2/22/07, Christoph Lameter <[EMAIL PROTECTED]> wrote:
> > B. Storage overhead of object queues
>
> Does this make sense for non-NUMA too? If not, can we disable the
> queues for NUMA in current slab?

Given the locking scheme in the current slab you cannot do that.
Otherwise a single lock would be taken for every operation, limiting
performance.

> On 2/22/07, Christoph Lameter <[EMAIL PROTECTED]> wrote:
> > C. SLAB metadata overhead
>
> Can be done for the current slab code too, no?

The per-slab metadata of SLAB does not fit into the page struct.
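That last point is the crux of SLUB's metadata savings: the per-slab
state fits in a few words, which the patch overlays onto existing
struct page members via unions instead of allocating a separate
management structure per slab. Schematically (shown as a standalone
struct for illustration; the field names are a reading of the posted
patch, not a quote from it):

#include <linux/list.h>

struct slub_page_meta {
	void		*freelist;	/* first free object in the slab */
	int		inuse;		/* number of allocated objects */
	int		offset;		/* free-pointer offset inside objects */
	struct list_head lru;		/* linkage on the partial list */
};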
Re: SLUB: The unqueued Slab allocator
Hi Christoph,

On 2/22/07, Christoph Lameter <[EMAIL PROTECTED]> wrote:
> This is a new slab allocator which was motivated by the complexity of the
> existing code in mm/slab.c. It attempts to address a variety of concerns
> with the existing implementation.

So do you want to add a new allocator or replace slab?

> B. Storage overhead of object queues

Does this make sense for non-NUMA too? If not, can we disable the
queues for NUMA in current slab?

> C. SLAB metadata overhead

Can be done for the current slab code too, no?

			Pekka
Re: SLUB: The unqueued Slab allocator
From: Christoph Lameter <[EMAIL PROTECTED]>
Date: Wed, 21 Feb 2007 23:00:30 -0800 (PST)

> +#ifdef CONFIG_ZONE_DMA
> +static struct kmem_cache *kmalloc_caches_dma[KMALLOC_NR_CACHES];
> +#endif

Therefore:

> +static struct kmem_cache *get_slab(size_t size, gfp_t flags)
> +{
 ...
> +	s = kmalloc_caches_dma[index];
> +	if (s)
> +		return s;
> +
> +	/* Dynamically create dma cache */
> +	x = kmalloc(sizeof(struct kmem_cache), flags & ~(__GFP_DMA));
> +
> +	if (!x)
> +		panic("Unable to allocate memory for dma cache\n");
> +
> +#ifdef KMALLOC_EXTRA
> +	if (index <= KMALLOC_SHIFT_HIGH - KMALLOC_SHIFT_LOW)
> +#endif
> +		realsize = 1 << index;
> +#ifdef KMALLOC_EXTRA
> +	else if (index == KMALLOC_EXTRAS)
> +		realsize = 96;
> +	else
> +		realsize = 192;
> +#endif
> +
> +	s = create_kmalloc_cache(x, "kmalloc_dma", realsize);
> +	kmalloc_caches_dma[index] = s;
> +	return s;
> +}

All of that logic needs to be protected by CONFIG_ZONE_DMA too.

I noticed this due to a build failure on sparc64 with this patch.
Re: SLUB: The unqueued Slab allocator
On Wed, 2007-02-21 at 23:00 -0800, Christoph Lameter wrote:

> +/*
> + * Lock order:
> + *   1. slab_lock(page)
> + *   2. slab->list_lock
> + */

That seems to contradict this:

> +/*
> + * Lock page and remove it from the partial list
> + *
> + * Must hold list_lock
> + */
> +static __always_inline int lock_and_del_slab(struct kmem_cache *s,
> +					struct page *page)
> +{
> +	if (slab_trylock(page)) {
> +		list_del(&page->lru);
> +		s->nr_partial--;
> +		return 1;
> +	}
> +	return 0;
> +}
> +
> +/*
> + * Get a partial page, lock it and return it.
> + */
> +#ifdef CONFIG_NUMA
> +static struct page *get_partial(struct kmem_cache *s, gfp_t flags, int node)
> +{
> +	struct page *page;
> +	int searchnode = (node == -1) ? numa_node_id() : node;
> +
> +	if (!s->nr_partial)
> +		return NULL;
> +
> +	spin_lock(&s->list_lock);
> +	/*
> +	 * Search for slab on the right node
> +	 */
> +	list_for_each_entry(page, &s->partial, lru)
> +		if (likely(page_to_nid(page) == searchnode) &&
> +				lock_and_del_slab(s, page))
> +			goto out;
> +
> +	if (likely(!(flags & __GFP_THISNODE))) {
> +		/*
> +		 * We can fall back to any other node in order to
> +		 * reduce the size of the partial list.
> +		 */
> +		list_for_each_entry(page, &s->partial, lru)
> +			if (likely(lock_and_del_slab(s, page)))
> +				goto out;
> +	}
> +
> +	/* Nothing found */
> +	page = NULL;
> +out:
> +	spin_unlock(&s->list_lock);
> +	return page;
> +}
> +#else
> +static struct page *get_partial(struct kmem_cache *s, gfp_t flags, int node)
> +{
> +	struct page *page;
> +
> +	/*
> +	 * Racy check. If we mistakenly see no partial slabs then we
> +	 * just allocate an empty slab.
> +	 */
> +	if (!s->nr_partial)
> +		return NULL;
> +
> +	spin_lock(&s->list_lock);
> +	list_for_each_entry(page, &s->partial, lru)
> +		if (likely(lock_and_del_slab(s, page)))
> +			goto out;
> +
> +	/* No slab or all slabs busy */
> +	page = NULL;
> +out:
> +	spin_unlock(&s->list_lock);
> +	return page;
> +}
> +#endif