Re: [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab
On Wed, Sep 11, 2013 at 02:22:25PM +, Christoph Lameter wrote:
> On Wed, 11 Sep 2013, Joonsoo Kim wrote:
>
> > Anyway, could you review my previous patchset, that is, 'overload struct
> > slab over struct page to reduce memory usage'? I'm not sure whether your
> > answer is ack or not.
>
> I scanned over it before but I was not able to see if it was correct on
> first glance. I will look at it again.

Sounds good! Thanks :)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab
On Wed, 11 Sep 2013, Joonsoo Kim wrote:

> Anyway, could you review my previous patchset, that is, 'overload struct
> slab over struct page to reduce memory usage'? I'm not sure whether your
> answer is ack or not.

I scanned over it before but I was not able to see if it was correct on
first glance. I will look at it again.
Re: [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab
On Tue, Sep 10, 2013 at 09:25:05PM +, Christoph Lameter wrote:
> On Tue, 10 Sep 2013, Joonsoo Kim wrote:
>
> > On Mon, Sep 09, 2013 at 02:44:03PM +, Christoph Lameter wrote:
> > > On Mon, 9 Sep 2013, Joonsoo Kim wrote:
> > >
> > > > 32 bytes is not the minimum object size; it is the minimum *kmalloc*
> > > > object size in the default configuration. There are some slabs whose
> > > > object size is less than 32 bytes. If we have an 8 byte sized
> > > > kmem_cache, it has 512 objects in a 4K page.
> > >
> > > As far as I can recall only SLUB supports 8 byte objects. SLAB's
> > > minimum has always been 32 bytes.
> >
> > No. There are many slabs whose object size is less than 32 bytes,
> > and I can also create an 8 byte sized slab in my kernel with SLAB.
>
> Well, the minimum size for the kmalloc array is 32 bytes. These are custom
> slabs. KMALLOC_SHIFT_LOW is set to 5 in include/linux/slab.h.
>
> Ok, so there are some slabs like that. Hmmm.. We have sizes 16 and 24 in
> your list. 16*256 is still 4096. So this would still work fine if we would
> forbid a size of 8 or increase that by default to 16.
>
> > > On x86, for example, it would add useless branching. The branches are
> > > never taken. You only need these if you do bad things to the system,
> > > like requiring large contiguous allocs.
> >
> > As I said before, since there is a possibility that some runtime loaded
> > modules use an 8 byte sized slab, we can't determine the index size at
> > compile time. Otherwise we should always use a short int sized index,
> > and I think that is worse than adding a branch.
>
> We can enforce a minimum slab size and an order limit so that it fits. And
> then there would be no additional branching.

Okay. I will respin this patchset with your suggestion.

Anyway, could you review my previous patchset, that is, 'overload struct
slab over struct page to reduce memory usage'? I'm not sure whether your
answer is ack or not.

Thanks.
Re: [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab
On Tue, 10 Sep 2013, Joonsoo Kim wrote:

> On Mon, Sep 09, 2013 at 02:44:03PM +, Christoph Lameter wrote:
> > On Mon, 9 Sep 2013, Joonsoo Kim wrote:
> >
> > > 32 bytes is not the minimum object size; it is the minimum *kmalloc*
> > > object size in the default configuration. There are some slabs whose
> > > object size is less than 32 bytes. If we have an 8 byte sized
> > > kmem_cache, it has 512 objects in a 4K page.
> >
> > As far as I can recall only SLUB supports 8 byte objects. SLAB's minimum
> > has always been 32 bytes.
>
> No. There are many slabs whose object size is less than 32 bytes,
> and I can also create an 8 byte sized slab in my kernel with SLAB.

Well, the minimum size for the kmalloc array is 32 bytes. These are custom
slabs. KMALLOC_SHIFT_LOW is set to 5 in include/linux/slab.h.

Ok, so there are some slabs like that. Hmmm.. We have sizes 16 and 24 in
your list. 16*256 is still 4096. So this would still work fine if we would
forbid a size of 8 or increase that by default to 16.

> > On x86, for example, it would add useless branching. The branches are
> > never taken. You only need these if you do bad things to the system,
> > like requiring large contiguous allocs.
>
> As I said before, since there is a possibility that some runtime loaded
> modules use an 8 byte sized slab, we can't determine the index size at
> compile time. Otherwise we should always use a short int sized index,
> and I think that is worse than adding a branch.

We can enforce a minimum slab size and an order limit so that it fits. And
then there would be no additional branching.
Re: [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab
On Mon, Sep 09, 2013 at 02:44:03PM +, Christoph Lameter wrote:
> On Mon, 9 Sep 2013, Joonsoo Kim wrote:
>
> > 32 bytes is not the minimum object size; it is the minimum *kmalloc*
> > object size in the default configuration. There are some slabs whose
> > object size is less than 32 bytes. If we have an 8 byte sized
> > kmem_cache, it has 512 objects in a 4K page.
>
> As far as I can recall only SLUB supports 8 byte objects. SLAB's minimum
> has always been 32 bytes.

No. There are many slabs whose object size is less than 32 bytes,
and I can also create an 8 byte sized slab in my kernel with SLAB.

js1304@js1304-P5Q-DELUXE:~/Projects/remote_git/linux$ sudo cat /proc/slabinfo | awk '{if($4 < 32) print $0}'
slabinfo - version: 2.1
ecryptfs_file_cache        0      0     16  240    1 : tunables  120   60    8 : slabdata      0      0      0
jbd2_revoke_table_s        2    240     16  240    1 : tunables  120   60    8 : slabdata      1      1      0
journal_handle             0      0     24  163    1 : tunables  120   60    8 : slabdata      0      0      0
revoke_table               0      0     16  240    1 : tunables  120   60    8 : slabdata      0      0      0
scsi_data_buffer           0      0     24  163    1 : tunables  120   60    8 : slabdata      0      0      0
fsnotify_event_holder      0      0     24  163    1 : tunables  120   60    8 : slabdata      0      0      0
numa_policy                3    163     24  163    1 : tunables  120   60    8 : slabdata      1      1      0

> > Moreover, we can configure slab_max_order at boot time, so we can't know
> > how many objects are in a certain slab at compile time. Therefore we
> > can't decide the size of the index at compile time.
>
> You can ignore the slab_max_order if necessary.
>
> > I think that byte and short int sized index support would be enough, but
> > it should be determined at runtime.
>
> On x86, for example, it would add useless branching. The branches are
> never taken. You only need these if you do bad things to the system, like
> requiring large contiguous allocs.

As I said before, since there is a possibility that some runtime loaded
modules use an 8 byte sized slab, we can't determine the index size at
compile time. Otherwise we should always use a short int sized index,
and I think that is worse than adding a branch.
Thanks.
Re: [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab
On Mon, 9 Sep 2013, Joonsoo Kim wrote:

> 32 bytes is not the minimum object size; it is the minimum *kmalloc*
> object size in the default configuration. There are some slabs whose
> object size is less than 32 bytes. If we have an 8 byte sized kmem_cache,
> it has 512 objects in a 4K page.

As far as I can recall only SLUB supports 8 byte objects. SLAB's minimum
has always been 32 bytes.

> Moreover, we can configure slab_max_order at boot time, so we can't know
> how many objects are in a certain slab at compile time. Therefore we
> can't decide the size of the index at compile time.

You can ignore the slab_max_order if necessary.

> I think that byte and short int sized index support would be enough, but
> it should be determined at runtime.

On x86, for example, it would add useless branching. The branches are never
taken. You only need these if you do bad things to the system, like
requiring large contiguous allocs.
Re: [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab
On Fri, Sep 06, 2013 at 03:58:18PM +, Christoph Lameter wrote:
> On Fri, 6 Sep 2013, Joonsoo Kim wrote:
>
> > Currently, the freelist of a slab consists of unsigned int sized
> > indexes. Most slabs have fewer than 256 objects, since the page order is
> > restricted to at most 1 in the default configuration. For example,
> > consider a slab consisting of 32 byte sized objects on two contiguous
> > pages. In this case, 256 objects are possible, and this number fits byte
> > sized indexes. 256 objects is the maximum possible value in the default
> > configuration, since 32 bytes is the minimum object size in SLAB
> > (8192 / 32 = 256). Therefore, if we use a byte sized index, we can save
> > 3 bytes per object.
>
> Ok then why is the patch making slab do either byte sized or int sized
> indexes? Seems that you could do a clean cutover?
>
> As you said: The minimum object size is 32 bytes for slab. 32 * 256 =
> 8k. So we are fine unless the page size is > 8k. This is true for IA64 and
> powerpc only, I believe. The page size can be determined at compile time,
> and depending on that page size you could then choose a different size for
> the indexes. Or the alternative is to increase the minimum slab object
> size. A 16k page size would require a 64 byte minimum allocation. But
> that's no good, I guess. Byte sized or short int sized index support would
> be enough.

Sorry for the misleading commit message. 32 bytes is not the minimum object
size; it is the minimum *kmalloc* object size in the default configuration.
There are some slabs whose object size is less than 32 bytes. If we have an
8 byte sized kmem_cache, it has 512 objects in a 4K page.

Moreover, we can configure slab_max_order at boot time, so we can't know
how many objects are in a certain slab at compile time. Therefore we can't
decide the size of the index at compile time.

I think that byte and short int sized index support would be enough, but
it should be determined at runtime.
> > This introduces one likely branch to the functions used for
> > setting/getting objects to/from the freelist, but we may get more
> > benefits from this change.
>
> Lets not do that.

IMHO, this is as good as we can do. Do you have any better idea?

Thanks.
Re: [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab
On Fri, 6 Sep 2013, Joonsoo Kim wrote:

> Currently, the freelist of a slab consists of unsigned int sized indexes.
> Most slabs have fewer than 256 objects, since the page order is restricted
> to at most 1 in the default configuration. For example, consider a slab
> consisting of 32 byte sized objects on two contiguous pages. In this case,
> 256 objects are possible, and this number fits byte sized indexes. 256
> objects is the maximum possible value in the default configuration, since
> 32 bytes is the minimum object size in SLAB (8192 / 32 = 256). Therefore,
> if we use a byte sized index, we can save 3 bytes per object.

Ok then why is the patch making slab do either byte sized or int sized
indexes? Seems that you could do a clean cutover?

As you said: The minimum object size is 32 bytes for slab. 32 * 256 =
8k. So we are fine unless the page size is > 8k. This is true for IA64 and
powerpc only, I believe. The page size can be determined at compile time,
and depending on that page size you could then choose a different size for
the indexes. Or the alternative is to increase the minimum slab object
size. A 16k page size would require a 64 byte minimum allocation. But
that's no good, I guess. Byte sized or short int sized index support would
be enough.

> This introduces one likely branch to the functions used for
> setting/getting objects to/from the freelist, but we may get more
> benefits from this change.

Lets not do that.
[REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab
Currently, the freelist of a slab consists of unsigned int sized indexes.
Most slabs have fewer than 256 objects, since the page order is restricted
to at most 1 in the default configuration. For example, consider a slab
consisting of 32 byte sized objects on two contiguous pages. In this case,
256 objects are possible, and this number fits byte sized indexes. 256
objects is the maximum possible value in the default configuration, since
32 bytes is the minimum object size in SLAB (8192 / 32 = 256). Therefore,
if we use a byte sized index, we can save 3 bytes per object.

This introduces one likely branch to the functions used for setting/getting
objects to/from the freelist, but we may get more benefits from this
change.

Signed-off-by: Joonsoo Kim

diff --git a/mm/slab.c b/mm/slab.c
index a0e49bb..bd366e5 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -565,8 +565,16 @@ static inline struct array_cache *cpu_cache_get(struct kmem_cache *cachep)
 	return cachep->array[smp_processor_id()];
 }
 
-static int calculate_nr_objs(size_t slab_size, size_t buffer_size,
-				size_t idx_size, size_t align)
+static inline bool can_byte_index(int nr_objs)
+{
+	if (likely(nr_objs <= (sizeof(unsigned char) << 8)))
+		return true;
+
+	return false;
+}
+
+static int __calculate_nr_objs(size_t slab_size, size_t buffer_size,
+				unsigned int idx_size, size_t align)
 {
 	int nr_objs;
 	size_t freelist_size;
@@ -592,6 +600,29 @@ static int calculate_nr_objs(size_t slab_size, size_t buffer_size,
 	return nr_objs;
 }
 
+static int calculate_nr_objs(size_t slab_size, size_t buffer_size,
+				size_t align)
+{
+	int nr_objs;
+	int byte_nr_objs;
+
+	nr_objs = __calculate_nr_objs(slab_size, buffer_size,
+					sizeof(unsigned int), align);
+	if (!can_byte_index(nr_objs))
+		return nr_objs;
+
+	byte_nr_objs = __calculate_nr_objs(slab_size, buffer_size,
+					sizeof(unsigned char), align);
+	/*
+	 * nr_objs can be larger when using byte index,
+	 * so that it cannot be indexed by byte index.
+	 */
+	if (can_byte_index(byte_nr_objs))
+		return byte_nr_objs;
+	else
+		return nr_objs;
+}
+
 /*
  * Calculate the number of objects and left-over bytes for a given buffer size.
  */
@@ -618,13 +649,18 @@ static void cache_estimate(unsigned long gfporder, size_t buffer_size,
 	 * correct alignment when allocated.
 	 */
 	if (flags & CFLGS_OFF_SLAB) {
-		mgmt_size = 0;
 		nr_objs = slab_size / buffer_size;
+		mgmt_size = 0;
 	} else {
-		nr_objs = calculate_nr_objs(slab_size, buffer_size,
-					sizeof(unsigned int), align);
-		mgmt_size = ALIGN(nr_objs * sizeof(unsigned int), align);
+		nr_objs = calculate_nr_objs(slab_size, buffer_size, align);
+		if (can_byte_index(nr_objs)) {
+			mgmt_size =
+				ALIGN(nr_objs * sizeof(unsigned char), align);
+		} else {
+			mgmt_size =
+				ALIGN(nr_objs * sizeof(unsigned int), align);
+		}
 	}
 	*num = nr_objs;
 	*left_over = slab_size - (nr_objs * buffer_size) - mgmt_size;
@@ -2012,7 +2048,10 @@ static size_t calculate_slab_order(struct kmem_cache *cachep,
 		 * looping condition in cache_grow().
 		 */
 		offslab_limit = size;
-		offslab_limit /= sizeof(unsigned int);
+		if (can_byte_index(num))
+			offslab_limit /= sizeof(unsigned char);
+		else
+			offslab_limit /= sizeof(unsigned int);
 		if (num > offslab_limit)
 			break;
@@ -2253,8 +2292,13 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
 	if (!cachep->num)
 		return -E2BIG;
 
-	freelist_size =
-		ALIGN(cachep->num * sizeof(unsigned int), cachep->align);
+	if (can_byte_index(cachep->num)) {
+		freelist_size = ALIGN(cachep->num * sizeof(unsigned char),
+					cachep->align);
+	} else {
+		freelist_size = ALIGN(cachep->num * sizeof(unsigned int),
+					cachep->align);
+	}
 
 	/*
 	 * If the slab has been placed off-slab, and we have enough space then
@@ -2267,7 +2311,10 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
 	if (flags & CFLGS_OFF_SLAB) {