Re: [PATCH v2] mm/slub: don't wait for high-order page allocation
On Fri, 7 Aug 2015, Joonsoo Kim wrote:

> Most of this description is copied from commit fb05e7a89f50
> ("net: don't wait for order-3 page allocation").
>
> I saw excessive direct memory reclaim/compaction triggered by slub.
> This causes performance issues and adds latency. Slub uses high-order
> allocations to reduce internal fragmentation and management overhead. But
> direct memory reclaim/compaction has high overhead, and the benefit of
> high-order allocation can't compensate for the overhead of both.
>
> This patch makes the auxiliary high-order allocation atomic. If there is
> no memory pressure and memory isn't fragmented, the allocation will still
> succeed, so we don't sacrifice the benefit of high-order allocation here.
> If the atomic allocation fails, direct memory reclaim/compaction is not
> triggered and the allocation falls back to low order immediately, hence
> the direct memory reclaim/compaction overhead is avoided. In the
> allocation failure case, kswapd is woken up and tries to make high-order
> free pages, so the allocation could succeed next time.
>
> Following is the test to measure the effect of this patch.
>
> System: QEMU, CPU 8, 512 MB
> Mem: 25% of memory is allocated at random positions to create fragmentation.
>      Memory-hogger occupies 150 MB memory.
> Workload: hackbench -g 20 -l 1000
>
> Average result of 10 runs (Base vs Patched)
>
> elapsed_time(s): 4.3468 vs 2.9838
> compact_stall: 461.7 vs 73.6
> pgmigrate_success: 28315.9 vs 7256.1
>
> Signed-off-by: Joonsoo Kim <iamjoonsoo@lge.com>

Acked-by: David Rientjes <rient...@google.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
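For reference, the averages quoted in the patch description can be turned into relative reductions with plain arithmetic. The following is only a minimal sketch in C; the helper name percent_reduction and the array names are made up for this illustration and are not part of the patch or of hackbench.

```c
/* Relative reduction from the base value to the patched value, in percent.
 * Hypothetical helper for illustration only. */
static double percent_reduction(double base, double patched)
{
    return 100.0 * (base - patched) / base;
}

/* 10-run averages copied from the patch description: { Base, Patched }. */
static const double elapsed_time_s[2]    = { 4.3468, 2.9838 };
static const double compact_stall[2]     = { 461.7, 73.6 };
static const double pgmigrate_success[2] = { 28315.9, 7256.1 };
```

Calling percent_reduction(elapsed_time_s[0], elapsed_time_s[1]) gives roughly 31%; the same call on compact_stall and pgmigrate_success gives roughly 84% and 74%.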
Re: [PATCH v2] mm/slub: don't wait for high-order page allocation
On Mon 10-08-15 09:40:22, Joonsoo Kim wrote:
> On Fri, Aug 07, 2015 at 05:05:01PM +0200, Michal Hocko wrote:
> > On Fri 07-08-15 11:10:03, Joonsoo Kim wrote:
> > [...]
> > > diff --git a/mm/slub.c b/mm/slub.c
> > > index 257283f..52b9025 100644
> > > --- a/mm/slub.c
> > > +++ b/mm/slub.c
> > > @@ -1364,6 +1364,8 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
> > >          * so we fall-back to the minimum order allocation.
> > >          */
> > >         alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
> > > +       if ((alloc_gfp & __GFP_WAIT) && oo_order(oo) > oo_order(s->min))
> > > +               alloc_gfp = (alloc_gfp | __GFP_NOMEMALLOC) & ~__GFP_WAIT;
> >
> > Wouldn't it be preferable to "fix" the __GFP_WAIT behavior than spilling
> > __GFP_NOMEMALLOC around the kernel? GFP flags are getting harder and
> > harder to use right and that is a signal we should think about it and
> > unclutter the current state.
>
> Maybe, it is preferable. Could you try that?

I will try to cook up something during the week.

> Anyway, it is a separate issue, so I don't want to hold this patch
> until that change.

OK, fair enough, at least this one is in mm proper...
--
Michal Hocko
SUSE Labs
Re: [PATCH v2] mm/slub: don't wait for high-order page allocation
On Fri, Aug 07, 2015 at 05:05:01PM +0200, Michal Hocko wrote:
> On Fri 07-08-15 11:10:03, Joonsoo Kim wrote:
> [...]
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 257283f..52b9025 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -1364,6 +1364,8 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
> >          * so we fall-back to the minimum order allocation.
> >          */
> >         alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
> > +       if ((alloc_gfp & __GFP_WAIT) && oo_order(oo) > oo_order(s->min))
> > +               alloc_gfp = (alloc_gfp | __GFP_NOMEMALLOC) & ~__GFP_WAIT;
>
> Wouldn't it be preferable to "fix" the __GFP_WAIT behavior than spilling
> __GFP_NOMEMALLOC around the kernel? GFP flags are getting harder and
> harder to use right and that is a signal we should think about it and
> unclutter the current state.

Maybe, it is preferable. Could you try that?

Anyway, it is a separate issue, so I don't want to hold this patch
until that change.

Thanks.
Re: [PATCH v2] mm/slub: don't wait for high-order page allocation
On Fri 07-08-15 11:10:03, Joonsoo Kim wrote:
[...]
> diff --git a/mm/slub.c b/mm/slub.c
> index 257283f..52b9025 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1364,6 +1364,8 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
>          * so we fall-back to the minimum order allocation.
>          */
>         alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
> +       if ((alloc_gfp & __GFP_WAIT) && oo_order(oo) > oo_order(s->min))
> +               alloc_gfp = (alloc_gfp | __GFP_NOMEMALLOC) & ~__GFP_WAIT;

Wouldn't it be preferable to "fix" the __GFP_WAIT behavior than spilling
__GFP_NOMEMALLOC around the kernel? GFP flags are getting harder and
harder to use right and that is a signal we should think about it and
unclutter the current state.

>
>         page = alloc_slab_page(s, alloc_gfp, node, oo);
>         if (unlikely(!page)) {
> --
> 1.9.1

--
Michal Hocko
SUSE Labs
[PATCH v2] mm/slub: don't wait for high-order page allocation
Most of this description is copied from commit fb05e7a89f50
("net: don't wait for order-3 page allocation").

I saw excessive direct memory reclaim/compaction triggered by slub.
This causes performance issues and adds latency. Slub uses high-order
allocations to reduce internal fragmentation and management overhead. But
direct memory reclaim/compaction has high overhead, and the benefit of
high-order allocation can't compensate for the overhead of both.

This patch makes the auxiliary high-order allocation atomic. If there is
no memory pressure and memory isn't fragmented, the allocation will still
succeed, so we don't sacrifice the benefit of high-order allocation here.
If the atomic allocation fails, direct memory reclaim/compaction is not
triggered and the allocation falls back to low order immediately, hence
the direct memory reclaim/compaction overhead is avoided. In the
allocation failure case, kswapd is woken up and tries to make high-order
free pages, so the allocation could succeed next time.

Following is the test to measure the effect of this patch.

System: QEMU, CPU 8, 512 MB
Mem: 25% of memory is allocated at random positions to create fragmentation.
     Memory-hogger occupies 150 MB memory.
Workload: hackbench -g 20 -l 1000

Average result of 10 runs (Base vs Patched)

elapsed_time(s): 4.3468 vs 2.9838
compact_stall: 461.7 vs 73.6
pgmigrate_success: 28315.9 vs 7256.1

Signed-off-by: Joonsoo Kim <iamjoonsoo@lge.com>
---
 mm/slub.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/slub.c b/mm/slub.c
index 257283f..52b9025 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1364,6 +1364,8 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
         * so we fall-back to the minimum order allocation.
         */
        alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
+       if ((alloc_gfp & __GFP_WAIT) && oo_order(oo) > oo_order(s->min))
+               alloc_gfp = (alloc_gfp | __GFP_NOMEMALLOC) & ~__GFP_WAIT;

        page = alloc_slab_page(s, alloc_gfp, node, oo);
        if (unlikely(!page)) {
--
1.9.1
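The flag handling in the hunk can be tried outside the kernel. What follows is only a userspace sketch in C, not the kernel code: the SK_ flag values are arbitrary bits chosen for illustration (the real definitions live in include/linux/gfp.h and differ), and first_attempt_gfp, allocate_with_fallback and only_order0_succeeds are hypothetical names. The sketch also assumes that the minimum-order retry reuses the caller's original flags, which the hunk itself does not show.

```c
#include <stdbool.h>

/* Illustrative flag bits only; NOT the kernel's real values. */
typedef unsigned int sk_gfp_t;
#define SK_GFP_WAIT       0x01u  /* caller may sleep, reclaim, compact */
#define SK_GFP_NOWARN     0x02u
#define SK_GFP_NORETRY    0x04u
#define SK_GFP_NOFAIL     0x08u
#define SK_GFP_NOMEMALLOC 0x10u  /* do not dip into emergency reserves */

/* Flags for the first attempt at the preferred order, mirroring the patch:
 * if the caller may wait and the preferred order is above the cache's
 * minimum, make the attempt atomic and keep it away from the reserves. */
static sk_gfp_t first_attempt_gfp(sk_gfp_t flags, unsigned int order,
                                  unsigned int min_order)
{
    sk_gfp_t alloc_gfp = (flags | SK_GFP_NOWARN | SK_GFP_NORETRY) & ~SK_GFP_NOFAIL;

    if ((alloc_gfp & SK_GFP_WAIT) && order > min_order)
        alloc_gfp = (alloc_gfp | SK_GFP_NOMEMALLOC) & ~SK_GFP_WAIT;
    return alloc_gfp;
}

/* Hypothetical allocator callback: returns true if the request succeeds. */
typedef bool (*sk_try_alloc_fn)(sk_gfp_t gfp, unsigned int order);

/* Try the preferred order cheaply, then fall back to the minimum order.
 * Assumption: the fallback uses the caller's unmodified flags.
 * Returns the order that was allocated, or -1 if both attempts failed. */
static int allocate_with_fallback(sk_gfp_t flags, unsigned int order,
                                  unsigned int min_order, sk_try_alloc_fn try_alloc)
{
    if (try_alloc(first_attempt_gfp(flags, order, min_order), order))
        return (int)order;
    if (try_alloc(flags, min_order))
        return (int)min_order;
    return -1;
}

/* Hypothetical stand-in for a fragmented system: only order 0 succeeds. */
static bool only_order0_succeeds(sk_gfp_t gfp, unsigned int order)
{
    (void)gfp;
    return order == 0;
}
```

With these definitions, first_attempt_gfp(SK_GFP_WAIT, 3, 0) clears SK_GFP_WAIT and sets SK_GFP_NOMEMALLOC, while the same call with order 0 leaves SK_GFP_WAIT in place. A caller that was already atomic (no SK_GFP_WAIT) is not given SK_GFP_NOMEMALLOC, matching the condition in the hunk.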