Re: [PATCH] mm/memcontrol: fix NULL pointer dereference when use_hierarchy is 0
On Tue, Feb 17, 2015 at 09:33:27AM +0100, Michal Hocko wrote: > On Tue 17-02-15 14:24:59, Joonsoo Kim wrote: > > It can be possible to return NULL in parent_mem_cgroup() > > if use_hierarchy is 0. > > This alone is not sufficient because the low limit is present only in > the unified hierarchy API and there is no use_hierarchy there. The > primary issue here is that the memcg has 0 usage so the previous > check for usage will not stop us. And that is bug IMO. > > I think that the following patch would be more correct from semantic > POV: > --- > >From f5d74671d30e44c50b45b4464c92f536f1dbdff6 Mon Sep 17 00:00:00 2001 > From: Michal Hocko > Date: Tue, 17 Feb 2015 08:02:12 +0100 > Subject: [PATCH] memcg: fix low limit calculation > > A memcg is considered low limited even when the current usage is equal > to the low limit. This leads to interesting side effects e.g. > groups/hierarchies with no memory accounted are considered protected and > so the reclaim will emit MEMCG_LOW event when encountering them. > > Another and much bigger issue was reported by Joonsoo Kim. He has hit a > NULL ptr dereference with the legacy cgroup API which even doesn't have > low limit exposed. The limit is 0 by default but the initial check fails > for memcg with 0 consumption and parent_mem_cgroup() would return NULL > if use_hierarchy is 0 and so page_counter_read would try to dereference > NULL. > > I suppose that the current implementation is just an overlook because > the documentation in Documentation/cgroups/unified-hierarchy.txt says: > " > The memory.low boundary on the other hand is a top-down allocated > reserve. A cgroup enjoys reclaim protection when it and all its > ancestors are below their low boundaries > " > > Fix the usage and the low limit comparision in mem_cgroup_low accordingly. > > Fixes: 241994ed8649 (mm: memcontrol: default hierarchy interface for memory) > Reported-by: Joonsoo Kim > Signed-off-by: Michal Hocko Good! This fixes my issue. Thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm/memcontrol: fix NULL pointer dereference when use_hierarchy is 0
On Tue, Feb 17, 2015 at 09:33:27AM +0100, Michal Hocko wrote: On Tue 17-02-15 14:24:59, Joonsoo Kim wrote: It can be possible to return NULL in parent_mem_cgroup() if use_hierarchy is 0. This alone is not sufficient because the low limit is present only in the unified hierarchy API and there is no use_hierarchy there. The primary issue here is that the memcg has 0 usage so the previous check for usage will not stop us. And that is bug IMO. I think that the following patch would be more correct from semantic POV: --- From f5d74671d30e44c50b45b4464c92f536f1dbdff6 Mon Sep 17 00:00:00 2001 From: Michal Hocko mho...@suse.cz Date: Tue, 17 Feb 2015 08:02:12 +0100 Subject: [PATCH] memcg: fix low limit calculation A memcg is considered low limited even when the current usage is equal to the low limit. This leads to interesting side effects e.g. groups/hierarchies with no memory accounted are considered protected and so the reclaim will emit MEMCG_LOW event when encountering them. Another and much bigger issue was reported by Joonsoo Kim. He has hit a NULL ptr dereference with the legacy cgroup API which even doesn't have low limit exposed. The limit is 0 by default but the initial check fails for memcg with 0 consumption and parent_mem_cgroup() would return NULL if use_hierarchy is 0 and so page_counter_read would try to dereference NULL. I suppose that the current implementation is just an overlook because the documentation in Documentation/cgroups/unified-hierarchy.txt says: The memory.low boundary on the other hand is a top-down allocated reserve. A cgroup enjoys reclaim protection when it and all its ancestors are below their low boundaries Fix the usage and the low limit comparision in mem_cgroup_low accordingly. Fixes: 241994ed8649 (mm: memcontrol: default hierarchy interface for memory) Reported-by: Joonsoo Kim js1...@gmail.com Signed-off-by: Michal Hocko mho...@suse.cz Good! This fixes my issue. Thanks. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm/memcontrol: fix NULL pointer dereference when use_hierarchy is 0
On Tue, Feb 17, 2015 at 09:33:27AM +0100, Michal Hocko wrote: > On Tue 17-02-15 14:24:59, Joonsoo Kim wrote: > > It can be possible to return NULL in parent_mem_cgroup() > > if use_hierarchy is 0. > > This alone is not sufficient because the low limit is present only in > the unified hierarchy API and there is no use_hierarchy there. The > primary issue here is that the memcg has 0 usage so the previous > check for usage will not stop us. And that is bug IMO. Yes, empty groups shouldn't be considered low. > From f5d74671d30e44c50b45b4464c92f536f1dbdff6 Mon Sep 17 00:00:00 2001 > From: Michal Hocko > Date: Tue, 17 Feb 2015 08:02:12 +0100 > Subject: [PATCH] memcg: fix low limit calculation > > A memcg is considered low limited even when the current usage is equal > to the low limit. This leads to interesting side effects e.g. > groups/hierarchies with no memory accounted are considered protected and > so the reclaim will emit MEMCG_LOW event when encountering them. > > Another and much bigger issue was reported by Joonsoo Kim. He has hit a > NULL ptr dereference with the legacy cgroup API which even doesn't have > low limit exposed. The limit is 0 by default but the initial check fails > for memcg with 0 consumption and parent_mem_cgroup() would return NULL > if use_hierarchy is 0 and so page_counter_read would try to dereference > NULL. > > I suppose that the current implementation is just an overlook because > the documentation in Documentation/cgroups/unified-hierarchy.txt says: > " > The memory.low boundary on the other hand is a top-down allocated > reserve. A cgroup enjoys reclaim protection when it and all its > ancestors are below their low boundaries > " > > Fix the usage and the low limit comparision in mem_cgroup_low accordingly. > > Fixes: 241994ed8649 (mm: memcontrol: default hierarchy interface for memory) > Reported-by: Joonsoo Kim > Signed-off-by: Michal Hocko Acked-by: Johannes Weiner -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm/memcontrol: fix NULL pointer dereference when use_hierarchy is 0
On Tue 17-02-15 14:24:59, Joonsoo Kim wrote: > It can be possible to return NULL in parent_mem_cgroup() > if use_hierarchy is 0. This alone is not sufficient because the low limit is present only in the unified hierarchy API and there is no use_hierarchy there. The primary issue here is that the memcg has 0 usage so the previous check for usage will not stop us. And that is bug IMO. I think that the following patch would be more correct from semantic POV: --- >From f5d74671d30e44c50b45b4464c92f536f1dbdff6 Mon Sep 17 00:00:00 2001 From: Michal Hocko Date: Tue, 17 Feb 2015 08:02:12 +0100 Subject: [PATCH] memcg: fix low limit calculation A memcg is considered low limited even when the current usage is equal to the low limit. This leads to interesting side effects e.g. groups/hierarchies with no memory accounted are considered protected and so the reclaim will emit MEMCG_LOW event when encountering them. Another and much bigger issue was reported by Joonsoo Kim. He has hit a NULL ptr dereference with the legacy cgroup API which even doesn't have low limit exposed. The limit is 0 by default but the initial check fails for memcg with 0 consumption and parent_mem_cgroup() would return NULL if use_hierarchy is 0 and so page_counter_read would try to dereference NULL. I suppose that the current implementation is just an overlook because the documentation in Documentation/cgroups/unified-hierarchy.txt says: " The memory.low boundary on the other hand is a top-down allocated reserve. A cgroup enjoys reclaim protection when it and all its ancestors are below their low boundaries " Fix the usage and the low limit comparision in mem_cgroup_low accordingly. Fixes: 241994ed8649 (mm: memcontrol: default hierarchy interface for memory) Reported-by: Joonsoo Kim Signed-off-by: Michal Hocko --- mm/memcontrol.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 0b436bc02ba4..079b5c02e245 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -5426,7 +5426,7 @@ bool mem_cgroup_low(struct mem_cgroup *root, struct mem_cgroup *memcg) if (memcg == root_mem_cgroup) return false; - if (page_counter_read(>memory) > memcg->low) + if (page_counter_read(>memory) >= memcg->low) return false; while (memcg != root) { @@ -5435,7 +5435,7 @@ bool mem_cgroup_low(struct mem_cgroup *root, struct mem_cgroup *memcg) if (memcg == root_mem_cgroup) break; - if (page_counter_read(>memory) > memcg->low) + if (page_counter_read(>memory) >= memcg->low) return false; } return true; -- 2.1.4 -- Michal Hocko SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm/memcontrol: fix NULL pointer dereference when use_hierarchy is 0
On Tue 17-02-15 14:24:59, Joonsoo Kim wrote: It can be possible to return NULL in parent_mem_cgroup() if use_hierarchy is 0. This alone is not sufficient because the low limit is present only in the unified hierarchy API and there is no use_hierarchy there. The primary issue here is that the memcg has 0 usage so the previous check for usage will not stop us. And that is bug IMO. I think that the following patch would be more correct from semantic POV: --- From f5d74671d30e44c50b45b4464c92f536f1dbdff6 Mon Sep 17 00:00:00 2001 From: Michal Hocko mho...@suse.cz Date: Tue, 17 Feb 2015 08:02:12 +0100 Subject: [PATCH] memcg: fix low limit calculation A memcg is considered low limited even when the current usage is equal to the low limit. This leads to interesting side effects e.g. groups/hierarchies with no memory accounted are considered protected and so the reclaim will emit MEMCG_LOW event when encountering them. Another and much bigger issue was reported by Joonsoo Kim. He has hit a NULL ptr dereference with the legacy cgroup API which even doesn't have low limit exposed. The limit is 0 by default but the initial check fails for memcg with 0 consumption and parent_mem_cgroup() would return NULL if use_hierarchy is 0 and so page_counter_read would try to dereference NULL. I suppose that the current implementation is just an overlook because the documentation in Documentation/cgroups/unified-hierarchy.txt says: The memory.low boundary on the other hand is a top-down allocated reserve. A cgroup enjoys reclaim protection when it and all its ancestors are below their low boundaries Fix the usage and the low limit comparision in mem_cgroup_low accordingly. Fixes: 241994ed8649 (mm: memcontrol: default hierarchy interface for memory) Reported-by: Joonsoo Kim js1...@gmail.com Signed-off-by: Michal Hocko mho...@suse.cz --- mm/memcontrol.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 0b436bc02ba4..079b5c02e245 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -5426,7 +5426,7 @@ bool mem_cgroup_low(struct mem_cgroup *root, struct mem_cgroup *memcg) if (memcg == root_mem_cgroup) return false; - if (page_counter_read(memcg-memory) memcg-low) + if (page_counter_read(memcg-memory) = memcg-low) return false; while (memcg != root) { @@ -5435,7 +5435,7 @@ bool mem_cgroup_low(struct mem_cgroup *root, struct mem_cgroup *memcg) if (memcg == root_mem_cgroup) break; - if (page_counter_read(memcg-memory) memcg-low) + if (page_counter_read(memcg-memory) = memcg-low) return false; } return true; -- 2.1.4 -- Michal Hocko SUSE Labs -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm/memcontrol: fix NULL pointer dereference when use_hierarchy is 0
On Tue, Feb 17, 2015 at 09:33:27AM +0100, Michal Hocko wrote: On Tue 17-02-15 14:24:59, Joonsoo Kim wrote: It can be possible to return NULL in parent_mem_cgroup() if use_hierarchy is 0. This alone is not sufficient because the low limit is present only in the unified hierarchy API and there is no use_hierarchy there. The primary issue here is that the memcg has 0 usage so the previous check for usage will not stop us. And that is bug IMO. Yes, empty groups shouldn't be considered low. From f5d74671d30e44c50b45b4464c92f536f1dbdff6 Mon Sep 17 00:00:00 2001 From: Michal Hocko mho...@suse.cz Date: Tue, 17 Feb 2015 08:02:12 +0100 Subject: [PATCH] memcg: fix low limit calculation A memcg is considered low limited even when the current usage is equal to the low limit. This leads to interesting side effects e.g. groups/hierarchies with no memory accounted are considered protected and so the reclaim will emit MEMCG_LOW event when encountering them. Another and much bigger issue was reported by Joonsoo Kim. He has hit a NULL ptr dereference with the legacy cgroup API which even doesn't have low limit exposed. The limit is 0 by default but the initial check fails for memcg with 0 consumption and parent_mem_cgroup() would return NULL if use_hierarchy is 0 and so page_counter_read would try to dereference NULL. I suppose that the current implementation is just an overlook because the documentation in Documentation/cgroups/unified-hierarchy.txt says: The memory.low boundary on the other hand is a top-down allocated reserve. A cgroup enjoys reclaim protection when it and all its ancestors are below their low boundaries Fix the usage and the low limit comparision in mem_cgroup_low accordingly. Fixes: 241994ed8649 (mm: memcontrol: default hierarchy interface for memory) Reported-by: Joonsoo Kim js1...@gmail.com Signed-off-by: Michal Hocko mho...@suse.cz Acked-by: Johannes Weiner han...@cmpxchg.org -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] mm/memcontrol: fix NULL pointer dereference when use_hierarchy is 0
It can be possible to return NULL in parent_mem_cgroup() if use_hierarchy is 0. So, we need to check NULL in the loop on mem_cgroup_low(). Without it, following NULL pointer dereference happens. [ 33.607531] BUG: unable to handle kernel NULL pointer dereference at 00b0 [ 33.608008] IP: [] mem_cgroup_low+0x40/0x90 [ 33.608008] PGD 1d893067 PUD 1cf41067 PMD 0 [ 33.608008] Oops: [#12] SMP [ 33.608008] Modules linked in: [ 33.608008] CPU: 1 PID: 3936 Comm: as Tainted: G D 3.19.0-next-20150216 #156 [ 33.608008] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 [ 33.608008] task: 88001d9c8000 ti: 88000cb14000 task.ti: 88000cb14000 [ 33.608008] RIP: 0010:[] [] mem_cgroup_low+0x40/0x90 [ 33.608008] RSP: :88000cb17a88 EFLAGS: 00010286 [ 33.608008] RAX: RBX: 88000cb17bc0 RCX: [ 33.608008] RDX: 88001f491400 RSI: RDI: [ 33.608008] RBP: 88000cb17a88 R08: 0160 R09: [ 33.608008] R10: R11: 02b8c101 R12: [ 33.608008] R13: R14: 88001fff9e08 R15: 88001da95800 [ 33.608008] FS: 2b7a12715380() GS:88001fa4() knlGS: [ 33.608008] CS: 0010 DS: ES: CR0: 80050033 [ 33.608008] CR2: 00b0 CR3: 0762f000 CR4: 07e0 [ 33.608008] Stack: [ 33.608008] 88000cb17b18 811838ec 88000cb17cd8 [ 33.608008] 0001 000280da 88001fff8780 [ 33.608008] 88000cb17af8 810e1d7e 88001fff8780 0003000c [ 33.608008] Call Trace: [ 33.608008] [] shrink_zone+0xac/0x2d0 [ 33.608008] [] ? ktime_get+0x3e/0xa0 [ 33.608008] [] do_try_to_free_pages+0x174/0x440 [ 33.608008] [] ? throttle_direct_reclaim+0x98/0x250 [ 33.608008] [] try_to_free_pages+0xba/0x150 [ 33.608008] [] __alloc_pages_nodemask+0x5a0/0x950 [ 33.608008] [] alloc_pages_vma+0xaf/0x200 [ 33.608008] [] handle_mm_fault+0x1287/0x17e0 [ 33.608008] [] ? kvm_clock_read+0x1e/0x20 [ 33.608008] [] ? kvm_clock_read+0x1e/0x20 [ 33.608008] [] ? sched_clock+0x9/0x10 [ 33.608008] [] __do_page_fault+0x191/0x440 [ 33.608008] [] trace_do_page_fault+0x45/0x100 [ 33.608008] [] do_async_page_fault+0x1e/0xd0 [ 33.608008] [] async_page_fault+0x28/0x30 [ 33.608008] Code: 48 8b 15 cc 21 b4 00 48 39 d6 74 53 48 8b 8e b0 00 00 00 48 39 8e 28 01 00 00 72 43 31 c9 48 39 fe 75 1d eb 35 66 0f 1f 44 00 00 <48> 8b 86 b0 00 00 00 48 39 86 28 01 00 00 72 30 48 39 f7 74 1a [ 33.608008] RIP [] mem_cgroup_low+0x40/0x90 [ 33.608008] RSP [ 33.608008] CR2: 00b0 [ 33.608008] BUG: unable to handle kernel [ 33.653499] ---[ end trace e264a32717ffda51 ]--- Signed-off-by: Joonsoo Kim --- mm/memcontrol.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index d18d3a6..507cfea 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -5431,6 +5431,8 @@ bool mem_cgroup_low(struct mem_cgroup *root, struct mem_cgroup *memcg) while (memcg != root) { memcg = parent_mem_cgroup(memcg); + if (!memcg) + break; if (memcg == root_mem_cgroup) break; -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] mm/memcontrol: fix NULL pointer dereference when use_hierarchy is 0
It can be possible to return NULL in parent_mem_cgroup() if use_hierarchy is 0. So, we need to check NULL in the loop on mem_cgroup_low(). Without it, following NULL pointer dereference happens. [ 33.607531] BUG: unable to handle kernel NULL pointer dereference at 00b0 [ 33.608008] IP: [811dcf60] mem_cgroup_low+0x40/0x90 [ 33.608008] PGD 1d893067 PUD 1cf41067 PMD 0 [ 33.608008] Oops: [#12] SMP [ 33.608008] Modules linked in: [ 33.608008] CPU: 1 PID: 3936 Comm: as Tainted: G D 3.19.0-next-20150216 #156 [ 33.608008] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 [ 33.608008] task: 88001d9c8000 ti: 88000cb14000 task.ti: 88000cb14000 [ 33.608008] RIP: 0010:[811dcf60] [811dcf60] mem_cgroup_low+0x40/0x90 [ 33.608008] RSP: :88000cb17a88 EFLAGS: 00010286 [ 33.608008] RAX: RBX: 88000cb17bc0 RCX: [ 33.608008] RDX: 88001f491400 RSI: RDI: [ 33.608008] RBP: 88000cb17a88 R08: 0160 R09: [ 33.608008] R10: R11: 02b8c101 R12: [ 33.608008] R13: R14: 88001fff9e08 R15: 88001da95800 [ 33.608008] FS: 2b7a12715380() GS:88001fa4() knlGS: [ 33.608008] CS: 0010 DS: ES: CR0: 80050033 [ 33.608008] CR2: 00b0 CR3: 0762f000 CR4: 07e0 [ 33.608008] Stack: [ 33.608008] 88000cb17b18 811838ec 88000cb17cd8 [ 33.608008] 0001 000280da 88001fff8780 [ 33.608008] 88000cb17af8 810e1d7e 88001fff8780 0003000c [ 33.608008] Call Trace: [ 33.608008] [811838ec] shrink_zone+0xac/0x2d0 [ 33.608008] [810e1d7e] ? ktime_get+0x3e/0xa0 [ 33.608008] [81183e94] do_try_to_free_pages+0x174/0x440 [ 33.608008] [8117f1a8] ? throttle_direct_reclaim+0x98/0x250 [ 33.608008] [8118421a] try_to_free_pages+0xba/0x150 [ 33.608008] [81176d10] __alloc_pages_nodemask+0x5a0/0x950 [ 33.608008] [811c09ff] alloc_pages_vma+0xaf/0x200 [ 33.608008] [811a0717] handle_mm_fault+0x1287/0x17e0 [ 33.608008] [81059e9e] ? kvm_clock_read+0x1e/0x20 [ 33.608008] [81059e9e] ? kvm_clock_read+0x1e/0x20 [ 33.608008] [8101e6a9] ? sched_clock+0x9/0x10 [ 33.608008] [810605f1] __do_page_fault+0x191/0x440 [ 33.608008] [81060955] trace_do_page_fault+0x45/0x100 [ 33.608008] [8105968e] do_async_page_fault+0x1e/0xd0 [ 33.608008] [8176f628] async_page_fault+0x28/0x30 [ 33.608008] Code: 48 8b 15 cc 21 b4 00 48 39 d6 74 53 48 8b 8e b0 00 00 00 48 39 8e 28 01 00 00 72 43 31 c9 48 39 fe 75 1d eb 35 66 0f 1f 44 00 00 48 8b 86 b0 00 00 00 48 39 86 28 01 00 00 72 30 48 39 f7 74 1a [ 33.608008] RIP [811dcf60] mem_cgroup_low+0x40/0x90 [ 33.608008] RSP 88000cb17a88 [ 33.608008] CR2: 00b0 [ 33.608008] BUG: unable to handle kernel [ 33.653499] ---[ end trace e264a32717ffda51 ]--- Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com --- mm/memcontrol.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index d18d3a6..507cfea 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -5431,6 +5431,8 @@ bool mem_cgroup_low(struct mem_cgroup *root, struct mem_cgroup *memcg) while (memcg != root) { memcg = parent_mem_cgroup(memcg); + if (!memcg) + break; if (memcg == root_mem_cgroup) break; -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/