Re: [v3.9-rc8]: kernel BUG at mm/memcontrol.c:3994! (was: Re: [BUG][s390x] mm: system crashed)
On Wed, May 01, 2013 at 08:28:30AM -0700, Hugh Dickins wrote: > On Tue, 30 Apr 2013, Johannes Weiner wrote: > > On Wed, Apr 24, 2013 at 08:50:01PM -0700, Hugh Dickins wrote: > > > On Wed, 24 Apr 2013, Johannes Weiner wrote: > > > > On Wed, Apr 24, 2013 at 03:18:51PM +0200, Michal Hocko wrote: > > > > > On Wed 24-04-13 12:42:55, Heiko Carstens wrote: > > > > > > On Thu, Apr 18, 2013 at 09:13:03AM +0200, Heiko Carstens wrote: > > > > > > > > > > > > [ 48.347963] [ cut here ] > > > > > > [ 48.347972] kernel BUG at mm/memcontrol.c:3994! > > > > > > __mem_cgroup_uncharge_common() triggers: > > > > > > > > > > > > [...] > > > > > > if (mem_cgroup_disabled()) > > > > > > return NULL; > > > > > > > > > > > > VM_BUG_ON(PageSwapCache(page)); > > > > > > [...] > > > > > > I agree that the actual memcg uncharging should be okay, but the memsw > > > swap stats will go wrong (doesn't matter t much), and mem_cgroup_put > > > get missed (leaking a struct mem_cgroup). > > > > Ok, so I just went over this again. For the swapout path the memsw > > uncharge is deferred, but if we "steal" this uncharge from the swap > > code, we actually do uncharge memsw in mem_cgroup_do_uncharge(), so we > > may prematurely unaccount the swap page, but we never leak a charge. > > Good. > > > > Because of this stealing, we also don't do the following: > > > > if (do_swap_account && ctype == MEM_CGROUP_CHARGE_TYPE_SWAPOUT) { > > mem_cgroup_swap_statistics(memcg, true); > > mem_cgroup_get(memcg); > > } > > > > I.e. it does not matter that mem_cgroup_uncharge_swap() doesn't do the > > put, we are also not doing the get. We should not leak references. > > > > So the only thing that I can see go wrong is that we may have a > > swapped out page that is not charged to memsw and not accounted as > > MEM_CGROUP_STAT_SWAP. But I don't know how likely that is, because we > > check for PG_swapcache in this uncharge path after the last pte is > > torn down, so even though the page is put on swap cache, it probably > > won't be swapped. It would require that the PG_swapcache setting > > would become visible only after the page has been added to the swap > > cache AND rmap has established at least one swap pte for us to > > uncharge a page that actually continues to be used. And that's a bit > > of a stretch, I think. > > Sorry, our minds seem to work in different ways, > I understood very little of what you wrote above :-( > > But once I try to disprove you with a counter-example, I seem to > arrive at the same conclusion as you have (well, I haven't quite > arrived there yet, but cannot give it any more time). I might be losing my mind. But since you are reaching the same conclusion, and I see the same mental milestones in your thought process described below, it's more likely that I suck at describing my train of thought coherently. Or the third possibility: we're both losing it! > Looking at it from my point of view, I concentrate on the racy > if (PageSwapCache(page)) > return; > __mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_ANON, false); > in mem_cgroup_uncharge_page(). > > Now, that may or may not catch the case where last reference to page > is unmapped at the same time as the page is added to swap: but being > a MEM_CGROUP_CHARGE_TYPE_ANON call, it does not interfere with the > memsw stats and get/put at all, those remain in balance. Yes, exactly. > And mem_cgroup_uncharge_swap() has all along been prepared to get > a zero id from swap_cgroup_record(), if a SwapCache page should be > uncharged when it was never quite charged as such. > > Yes, we may occasionally fail to charge a SwapCache page as such > if its final unmap from userspace races with its being added to swap; > but it's heading towards swap_writepage()'s try_to_free_swap() anyway, > so I don't think that's anything to worry about. Agreed as well. If there are no pte references to the swap slot, it will be freed either way. I didn't even think of the try_to_free_swap() in the writeout call, but was looking at the __remove_mapping later on in reclaim that will do a swapcache_free(). The only case I was worried about is the following: #0 #1 page_remove_rmap() shrink_page_list() if --page->mapcount == 0: add_to_swap() mem_cgroup_uncharge_page() __add_to_swap_cache() if PageSwapCache: SetPageSwapCache() returntry_to_unmap() __mem_cgroup_uncharge_common()for each pte: install swp_entry_t page->mapcount-- Looking at #1, I don't see anything that would force concurrent threads to observe SetSwapCache ordered against the page->mapcount--. My concern was that if those get
Re: [v3.9-rc8]: kernel BUG at mm/memcontrol.c:3994! (was: Re: [BUG][s390x] mm: system crashed)
On Tue, 30 Apr 2013, Johannes Weiner wrote: > On Wed, Apr 24, 2013 at 08:50:01PM -0700, Hugh Dickins wrote: > > On Wed, 24 Apr 2013, Johannes Weiner wrote: > > > On Wed, Apr 24, 2013 at 03:18:51PM +0200, Michal Hocko wrote: > > > > On Wed 24-04-13 12:42:55, Heiko Carstens wrote: > > > > > On Thu, Apr 18, 2013 at 09:13:03AM +0200, Heiko Carstens wrote: > > > > > > > > > > [ 48.347963] [ cut here ] > > > > > [ 48.347972] kernel BUG at mm/memcontrol.c:3994! > > > > > __mem_cgroup_uncharge_common() triggers: > > > > > > > > > > [...] > > > > > if (mem_cgroup_disabled()) > > > > > return NULL; > > > > > > > > > > VM_BUG_ON(PageSwapCache(page)); > > > > > [...] > > > > I agree that the actual memcg uncharging should be okay, but the memsw > > swap stats will go wrong (doesn't matter t much), and mem_cgroup_put > > get missed (leaking a struct mem_cgroup). > > Ok, so I just went over this again. For the swapout path the memsw > uncharge is deferred, but if we "steal" this uncharge from the swap > code, we actually do uncharge memsw in mem_cgroup_do_uncharge(), so we > may prematurely unaccount the swap page, but we never leak a charge. > Good. > > Because of this stealing, we also don't do the following: > > if (do_swap_account && ctype == MEM_CGROUP_CHARGE_TYPE_SWAPOUT) { > mem_cgroup_swap_statistics(memcg, true); > mem_cgroup_get(memcg); > } > > I.e. it does not matter that mem_cgroup_uncharge_swap() doesn't do the > put, we are also not doing the get. We should not leak references. > > So the only thing that I can see go wrong is that we may have a > swapped out page that is not charged to memsw and not accounted as > MEM_CGROUP_STAT_SWAP. But I don't know how likely that is, because we > check for PG_swapcache in this uncharge path after the last pte is > torn down, so even though the page is put on swap cache, it probably > won't be swapped. It would require that the PG_swapcache setting > would become visible only after the page has been added to the swap > cache AND rmap has established at least one swap pte for us to > uncharge a page that actually continues to be used. And that's a bit > of a stretch, I think. Sorry, our minds seem to work in different ways, I understood very little of what you wrote above :-( But once I try to disprove you with a counter-example, I seem to arrive at the same conclusion as you have (well, I haven't quite arrived there yet, but cannot give it any more time). Looking at it from my point of view, I concentrate on the racy if (PageSwapCache(page)) return; __mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_ANON, false); in mem_cgroup_uncharge_page(). Now, that may or may not catch the case where last reference to page is unmapped at the same time as the page is added to swap: but being a MEM_CGROUP_CHARGE_TYPE_ANON call, it does not interfere with the memsw stats and get/put at all, those remain in balance. And mem_cgroup_uncharge_swap() has all along been prepared to get a zero id from swap_cgroup_record(), if a SwapCache page should be uncharged when it was never quite charged as such. Yes, we may occasionally fail to charge a SwapCache page as such if its final unmap from userspace races with its being added to swap; but it's heading towards swap_writepage()'s try_to_free_swap() anyway, so I don't think that's anything to worry about. (If I had time to stop and read through that, I'd probably find it just as hard to understand as what you wrote!) > > Did I miss something? If not, I'll just send a patch that removes the > VM_BUG_ON() and adds a comment describing the scenarios and a note > that we may want to fix this in the future. I don't think you missed something. Yes, please just send Linus and Andrew a patch to remove the VM_BUG_ON() (with Cc stable tag), I now agree that's all that's really needed - thanks. Hugh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [v3.9-rc8]: kernel BUG at mm/memcontrol.c:3994! (was: Re: [BUG][s390x] mm: system crashed)
On Tue, 30 Apr 2013, Johannes Weiner wrote: On Wed, Apr 24, 2013 at 08:50:01PM -0700, Hugh Dickins wrote: On Wed, 24 Apr 2013, Johannes Weiner wrote: On Wed, Apr 24, 2013 at 03:18:51PM +0200, Michal Hocko wrote: On Wed 24-04-13 12:42:55, Heiko Carstens wrote: On Thu, Apr 18, 2013 at 09:13:03AM +0200, Heiko Carstens wrote: [ 48.347963] [ cut here ] [ 48.347972] kernel BUG at mm/memcontrol.c:3994! __mem_cgroup_uncharge_common() triggers: [...] if (mem_cgroup_disabled()) return NULL; VM_BUG_ON(PageSwapCache(page)); [...] I agree that the actual memcg uncharging should be okay, but the memsw swap stats will go wrong (doesn't matter t much), and mem_cgroup_put get missed (leaking a struct mem_cgroup). Ok, so I just went over this again. For the swapout path the memsw uncharge is deferred, but if we steal this uncharge from the swap code, we actually do uncharge memsw in mem_cgroup_do_uncharge(), so we may prematurely unaccount the swap page, but we never leak a charge. Good. Because of this stealing, we also don't do the following: if (do_swap_account ctype == MEM_CGROUP_CHARGE_TYPE_SWAPOUT) { mem_cgroup_swap_statistics(memcg, true); mem_cgroup_get(memcg); } I.e. it does not matter that mem_cgroup_uncharge_swap() doesn't do the put, we are also not doing the get. We should not leak references. So the only thing that I can see go wrong is that we may have a swapped out page that is not charged to memsw and not accounted as MEM_CGROUP_STAT_SWAP. But I don't know how likely that is, because we check for PG_swapcache in this uncharge path after the last pte is torn down, so even though the page is put on swap cache, it probably won't be swapped. It would require that the PG_swapcache setting would become visible only after the page has been added to the swap cache AND rmap has established at least one swap pte for us to uncharge a page that actually continues to be used. And that's a bit of a stretch, I think. Sorry, our minds seem to work in different ways, I understood very little of what you wrote above :-( But once I try to disprove you with a counter-example, I seem to arrive at the same conclusion as you have (well, I haven't quite arrived there yet, but cannot give it any more time). Looking at it from my point of view, I concentrate on the racy if (PageSwapCache(page)) return; __mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_ANON, false); in mem_cgroup_uncharge_page(). Now, that may or may not catch the case where last reference to page is unmapped at the same time as the page is added to swap: but being a MEM_CGROUP_CHARGE_TYPE_ANON call, it does not interfere with the memsw stats and get/put at all, those remain in balance. And mem_cgroup_uncharge_swap() has all along been prepared to get a zero id from swap_cgroup_record(), if a SwapCache page should be uncharged when it was never quite charged as such. Yes, we may occasionally fail to charge a SwapCache page as such if its final unmap from userspace races with its being added to swap; but it's heading towards swap_writepage()'s try_to_free_swap() anyway, so I don't think that's anything to worry about. (If I had time to stop and read through that, I'd probably find it just as hard to understand as what you wrote!) Did I miss something? If not, I'll just send a patch that removes the VM_BUG_ON() and adds a comment describing the scenarios and a note that we may want to fix this in the future. I don't think you missed something. Yes, please just send Linus and Andrew a patch to remove the VM_BUG_ON() (with Cc stable tag), I now agree that's all that's really needed - thanks. Hugh -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [v3.9-rc8]: kernel BUG at mm/memcontrol.c:3994! (was: Re: [BUG][s390x] mm: system crashed)
On Wed, May 01, 2013 at 08:28:30AM -0700, Hugh Dickins wrote: On Tue, 30 Apr 2013, Johannes Weiner wrote: On Wed, Apr 24, 2013 at 08:50:01PM -0700, Hugh Dickins wrote: On Wed, 24 Apr 2013, Johannes Weiner wrote: On Wed, Apr 24, 2013 at 03:18:51PM +0200, Michal Hocko wrote: On Wed 24-04-13 12:42:55, Heiko Carstens wrote: On Thu, Apr 18, 2013 at 09:13:03AM +0200, Heiko Carstens wrote: [ 48.347963] [ cut here ] [ 48.347972] kernel BUG at mm/memcontrol.c:3994! __mem_cgroup_uncharge_common() triggers: [...] if (mem_cgroup_disabled()) return NULL; VM_BUG_ON(PageSwapCache(page)); [...] I agree that the actual memcg uncharging should be okay, but the memsw swap stats will go wrong (doesn't matter t much), and mem_cgroup_put get missed (leaking a struct mem_cgroup). Ok, so I just went over this again. For the swapout path the memsw uncharge is deferred, but if we steal this uncharge from the swap code, we actually do uncharge memsw in mem_cgroup_do_uncharge(), so we may prematurely unaccount the swap page, but we never leak a charge. Good. Because of this stealing, we also don't do the following: if (do_swap_account ctype == MEM_CGROUP_CHARGE_TYPE_SWAPOUT) { mem_cgroup_swap_statistics(memcg, true); mem_cgroup_get(memcg); } I.e. it does not matter that mem_cgroup_uncharge_swap() doesn't do the put, we are also not doing the get. We should not leak references. So the only thing that I can see go wrong is that we may have a swapped out page that is not charged to memsw and not accounted as MEM_CGROUP_STAT_SWAP. But I don't know how likely that is, because we check for PG_swapcache in this uncharge path after the last pte is torn down, so even though the page is put on swap cache, it probably won't be swapped. It would require that the PG_swapcache setting would become visible only after the page has been added to the swap cache AND rmap has established at least one swap pte for us to uncharge a page that actually continues to be used. And that's a bit of a stretch, I think. Sorry, our minds seem to work in different ways, I understood very little of what you wrote above :-( But once I try to disprove you with a counter-example, I seem to arrive at the same conclusion as you have (well, I haven't quite arrived there yet, but cannot give it any more time). I might be losing my mind. But since you are reaching the same conclusion, and I see the same mental milestones in your thought process described below, it's more likely that I suck at describing my train of thought coherently. Or the third possibility: we're both losing it! Looking at it from my point of view, I concentrate on the racy if (PageSwapCache(page)) return; __mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_ANON, false); in mem_cgroup_uncharge_page(). Now, that may or may not catch the case where last reference to page is unmapped at the same time as the page is added to swap: but being a MEM_CGROUP_CHARGE_TYPE_ANON call, it does not interfere with the memsw stats and get/put at all, those remain in balance. Yes, exactly. And mem_cgroup_uncharge_swap() has all along been prepared to get a zero id from swap_cgroup_record(), if a SwapCache page should be uncharged when it was never quite charged as such. Yes, we may occasionally fail to charge a SwapCache page as such if its final unmap from userspace races with its being added to swap; but it's heading towards swap_writepage()'s try_to_free_swap() anyway, so I don't think that's anything to worry about. Agreed as well. If there are no pte references to the swap slot, it will be freed either way. I didn't even think of the try_to_free_swap() in the writeout call, but was looking at the __remove_mapping later on in reclaim that will do a swapcache_free(). The only case I was worried about is the following: #0 #1 page_remove_rmap() shrink_page_list() if --page-mapcount == 0: add_to_swap() mem_cgroup_uncharge_page() __add_to_swap_cache() if PageSwapCache: SetPageSwapCache() returntry_to_unmap() __mem_cgroup_uncharge_common()for each pte: install swp_entry_t page-mapcount-- Looking at #1, I don't see anything that would force concurrent threads to observe SetSwapCache ordered against the page-mapcount--. My concern was that if those get reordered, #0 may see page-mapcount == 1 AND !PageSwapcache, and then go ahead and uncharge the page while there is actually a swp_entry_t pointing to it. The page will be a proper
Re: [v3.9-rc8]: kernel BUG at mm/memcontrol.c:3994! (was: Re: [BUG][s390x] mm: system crashed)
On Wed, Apr 24, 2013 at 08:50:01PM -0700, Hugh Dickins wrote: > On Wed, 24 Apr 2013, Johannes Weiner wrote: > > On Wed, Apr 24, 2013 at 03:18:51PM +0200, Michal Hocko wrote: > > > On Wed 24-04-13 12:42:55, Heiko Carstens wrote: > > > > On Thu, Apr 18, 2013 at 09:13:03AM +0200, Heiko Carstens wrote: > > > > > Ok, thanks for verifying! I'll look into it; hopefully I can > > > > > reproduce it > > > > > here as well. > > > > > > > > That seems to be a common code bug. I can easily trigger the VM_BUG_ON() > > > > below (when I force the system to swap): > > > > > > > > [ 48.347963] [ cut here ] > > > > [ 48.347972] kernel BUG at mm/memcontrol.c:3994! > > > > [ 48.348012] illegal operation: 0001 [#1] SMP > > > > [ 48.348015] Modules linked in: > > > > [ 48.348017] CPU: 1 Not tainted 3.9.0-rc8+ #38 > > > > [ 48.348020] Process mmap2 (pid: 635, task: 29476100, ksp: > > > > 2e91b938) > > > > [ 48.348022] Krnl PSW : 0704f0018000 0026552c > > > > (__mem_cgroup_uncharge_common+0x2c4/0x33c) > > > > [ 48.348032]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:3 > > > > PM:0 EA:3 > > > >Krnl GPRS: 0008 0009 > > > > 03d1002a9200 > > > > [ 48.348039] 006812d8 > > > > 03ffdf339000 321a6f98 > > > > [ 48.348043]03fffce11000 > > > > 0001 03d1002a9200 > > > > [ 48.348046]0001 00681b88 > > > > 2e91bc18 2e91bbd0 > > > > [ 48.348057] Krnl Code: 0026551e: c0e5fffaa2a1brasl > > > > %r14,1b9a60 > > > > 00265524: a7f4ff7dbrc > > > > 15,26541e > > > > #00265528: a7f40001brc > > > > 15,26552a > > > > >0026552c: e3c0b8200124stg > > > > %r12,6176(%r11) > > > > 00265532: a7f4ff57brc > > > > 15,2653e0 > > > > 00265536: e310b8280104lg > > > > %r1,6184(%r11) > > > > 0026553c: a71b0001aghi > > > > %r1,1 > > > > 00265540: e310b8280124stg > > > > %r1,6184(%r11) > > > > [ 48.348099] Call Trace: > > > > [ 48.348100] ([<03d1002a91c0>] 0x3d1002a91c0) > > > > [ 48.348102] [<002404aa>] page_remove_rmap+0xf2/0x16c > > > > [ 48.348106] [<00232dc8>] unmap_single_vma+0x494/0x7d8 > > > > [ 48.348107] [<00233ac0>] unmap_vmas+0x50/0x74 > > > > [ 48.348109] [<002396ec>] unmap_region+0x9c/0x110 > > > > [ 48.348110] [<0023bd18>] do_munmap+0x284/0x470 > > > > [ 48.348111] [<0023bf56>] vm_munmap+0x52/0x70 > > > > [ 48.348113] [<0023cf32>] SyS_munmap+0x3a/0x4c > > > > [ 48.348114] [<00665e14>] sysc_noemu+0x22/0x28 > > > > [ 48.348118] [<03fffcf187b2>] 0x3fffcf187b2 > > > > [ 48.348119] Last Breaking-Event-Address: > > > > [ 48.348120] [<00265528>] > > > > __mem_cgroup_uncharge_common+0x2c0/0x33c > > > > > > > > Looking at the code, the code flow is: > > > > > > > > page_remove_rmap() -> mem_cgroup_uncharge_page() -> > > > > __mem_cgroup_uncharge_common() > > > > > > > > Note that in mem_cgroup_uncharge_page() the page in question passed the > > > > check: > > > > > > > > [...] > > > > if (PageSwapCache(page)) > > > > return; > > > > [...] > > > > > > > > and just a couple of instructions later the VM_BUG_ON() within > > > > __mem_cgroup_uncharge_common() triggers: > > > > > > > > [...] > > > > if (mem_cgroup_disabled()) > > > > return NULL; > > > > > > > > VM_BUG_ON(PageSwapCache(page)); > > > > [...] > > > > > > > > Which means that another cpu changed the pageflags concurrently. In > > > > fact, > > > > looking at the dump a different cpu is indeed busy with running kswapd. > > > > > > Hmm, maybe I am missing something but it really looks like we can race > > > here. Reclaim path takes the page lock while zap_pte takes page table > > > lock so nothing prevents them from racing here: > > > shrink_page_list zap_pte_range > > > trylock_page pte_offset_map_lock > > > add_to_swap page_remove_rmap > > > /* Page can be still mapped */ > > > add_to_swap_cache atomic_add_negative(_mapcount) > > > __add_to_swap_cache mem_cgroup_uncharge_page > > > (PageSwapCache(page)) && > > > return > > > SetPageSwapCache > > > __mem_cgroup_uncharge_common > > > VM_BUG_ON(PageSwapCache(page)) > > > > > > Maybe not many people run
Re: [v3.9-rc8]: kernel BUG at mm/memcontrol.c:3994! (was: Re: [BUG][s390x] mm: system crashed)
On Wed, Apr 24, 2013 at 08:50:01PM -0700, Hugh Dickins wrote: On Wed, 24 Apr 2013, Johannes Weiner wrote: On Wed, Apr 24, 2013 at 03:18:51PM +0200, Michal Hocko wrote: On Wed 24-04-13 12:42:55, Heiko Carstens wrote: On Thu, Apr 18, 2013 at 09:13:03AM +0200, Heiko Carstens wrote: Ok, thanks for verifying! I'll look into it; hopefully I can reproduce it here as well. That seems to be a common code bug. I can easily trigger the VM_BUG_ON() below (when I force the system to swap): [ 48.347963] [ cut here ] [ 48.347972] kernel BUG at mm/memcontrol.c:3994! [ 48.348012] illegal operation: 0001 [#1] SMP [ 48.348015] Modules linked in: [ 48.348017] CPU: 1 Not tainted 3.9.0-rc8+ #38 [ 48.348020] Process mmap2 (pid: 635, task: 29476100, ksp: 2e91b938) [ 48.348022] Krnl PSW : 0704f0018000 0026552c (__mem_cgroup_uncharge_common+0x2c4/0x33c) [ 48.348032]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 EA:3 Krnl GPRS: 0008 0009 03d1002a9200 [ 48.348039] 006812d8 03ffdf339000 321a6f98 [ 48.348043]03fffce11000 0001 03d1002a9200 [ 48.348046]0001 00681b88 2e91bc18 2e91bbd0 [ 48.348057] Krnl Code: 0026551e: c0e5fffaa2a1brasl %r14,1b9a60 00265524: a7f4ff7dbrc 15,26541e #00265528: a7f40001brc 15,26552a 0026552c: e3c0b8200124stg %r12,6176(%r11) 00265532: a7f4ff57brc 15,2653e0 00265536: e310b8280104lg %r1,6184(%r11) 0026553c: a71b0001aghi %r1,1 00265540: e310b8280124stg %r1,6184(%r11) [ 48.348099] Call Trace: [ 48.348100] ([03d1002a91c0] 0x3d1002a91c0) [ 48.348102] [002404aa] page_remove_rmap+0xf2/0x16c [ 48.348106] [00232dc8] unmap_single_vma+0x494/0x7d8 [ 48.348107] [00233ac0] unmap_vmas+0x50/0x74 [ 48.348109] [002396ec] unmap_region+0x9c/0x110 [ 48.348110] [0023bd18] do_munmap+0x284/0x470 [ 48.348111] [0023bf56] vm_munmap+0x52/0x70 [ 48.348113] [0023cf32] SyS_munmap+0x3a/0x4c [ 48.348114] [00665e14] sysc_noemu+0x22/0x28 [ 48.348118] [03fffcf187b2] 0x3fffcf187b2 [ 48.348119] Last Breaking-Event-Address: [ 48.348120] [00265528] __mem_cgroup_uncharge_common+0x2c0/0x33c Looking at the code, the code flow is: page_remove_rmap() - mem_cgroup_uncharge_page() - __mem_cgroup_uncharge_common() Note that in mem_cgroup_uncharge_page() the page in question passed the check: [...] if (PageSwapCache(page)) return; [...] and just a couple of instructions later the VM_BUG_ON() within __mem_cgroup_uncharge_common() triggers: [...] if (mem_cgroup_disabled()) return NULL; VM_BUG_ON(PageSwapCache(page)); [...] Which means that another cpu changed the pageflags concurrently. In fact, looking at the dump a different cpu is indeed busy with running kswapd. Hmm, maybe I am missing something but it really looks like we can race here. Reclaim path takes the page lock while zap_pte takes page table lock so nothing prevents them from racing here: shrink_page_list zap_pte_range trylock_page pte_offset_map_lock add_to_swap page_remove_rmap /* Page can be still mapped */ add_to_swap_cache atomic_add_negative(_mapcount) __add_to_swap_cache mem_cgroup_uncharge_page (PageSwapCache(page)) return SetPageSwapCache __mem_cgroup_uncharge_common VM_BUG_ON(PageSwapCache(page)) Maybe not many people run with CONFIG_DEBUG_VM enabled these days so we do not this more often (even me testing configs are not consistent in that regards and only few have it on). The only thing that changed in this area recently is 0c59b89c which made the test VM_BUG_ON rather then simple return in 3.6 And maybe the BUG_ON is too harsh as CgroupUsed should guarantee that the uncharge will eventually go away. What do you think
Re: [v3.9-rc8]: kernel BUG at mm/memcontrol.c:3994! (was: Re: [BUG][s390x] mm: system crashed)
On Wed, 24 Apr 2013, Johannes Weiner wrote: > On Wed, Apr 24, 2013 at 03:18:51PM +0200, Michal Hocko wrote: > > On Wed 24-04-13 12:42:55, Heiko Carstens wrote: > > > On Thu, Apr 18, 2013 at 09:13:03AM +0200, Heiko Carstens wrote: > > > > Ok, thanks for verifying! I'll look into it; hopefully I can reproduce > > > > it > > > > here as well. > > > > > > That seems to be a common code bug. I can easily trigger the VM_BUG_ON() > > > below (when I force the system to swap): > > > > > > [ 48.347963] [ cut here ] > > > [ 48.347972] kernel BUG at mm/memcontrol.c:3994! > > > [ 48.348012] illegal operation: 0001 [#1] SMP > > > [ 48.348015] Modules linked in: > > > [ 48.348017] CPU: 1 Not tainted 3.9.0-rc8+ #38 > > > [ 48.348020] Process mmap2 (pid: 635, task: 29476100, ksp: > > > 2e91b938) > > > [ 48.348022] Krnl PSW : 0704f0018000 0026552c > > > (__mem_cgroup_uncharge_common+0x2c4/0x33c) > > > [ 48.348032]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:3 > > > PM:0 EA:3 > > >Krnl GPRS: 0008 0009 > > > 03d1002a9200 > > > [ 48.348039] 006812d8 > > > 03ffdf339000 321a6f98 > > > [ 48.348043]03fffce11000 > > > 0001 03d1002a9200 > > > [ 48.348046]0001 00681b88 > > > 2e91bc18 2e91bbd0 > > > [ 48.348057] Krnl Code: 0026551e: c0e5fffaa2a1brasl > > > %r14,1b9a60 > > > 00265524: a7f4ff7dbrc > > > 15,26541e > > > #00265528: a7f40001brc > > > 15,26552a > > > >0026552c: e3c0b8200124stg > > > %r12,6176(%r11) > > > 00265532: a7f4ff57brc > > > 15,2653e0 > > > 00265536: e310b8280104lg > > > %r1,6184(%r11) > > > 0026553c: a71b0001aghi > > > %r1,1 > > > 00265540: e310b8280124stg > > > %r1,6184(%r11) > > > [ 48.348099] Call Trace: > > > [ 48.348100] ([<03d1002a91c0>] 0x3d1002a91c0) > > > [ 48.348102] [<002404aa>] page_remove_rmap+0xf2/0x16c > > > [ 48.348106] [<00232dc8>] unmap_single_vma+0x494/0x7d8 > > > [ 48.348107] [<00233ac0>] unmap_vmas+0x50/0x74 > > > [ 48.348109] [<002396ec>] unmap_region+0x9c/0x110 > > > [ 48.348110] [<0023bd18>] do_munmap+0x284/0x470 > > > [ 48.348111] [<0023bf56>] vm_munmap+0x52/0x70 > > > [ 48.348113] [<0023cf32>] SyS_munmap+0x3a/0x4c > > > [ 48.348114] [<00665e14>] sysc_noemu+0x22/0x28 > > > [ 48.348118] [<03fffcf187b2>] 0x3fffcf187b2 > > > [ 48.348119] Last Breaking-Event-Address: > > > [ 48.348120] [<00265528>] > > > __mem_cgroup_uncharge_common+0x2c0/0x33c > > > > > > Looking at the code, the code flow is: > > > > > > page_remove_rmap() -> mem_cgroup_uncharge_page() -> > > > __mem_cgroup_uncharge_common() > > > > > > Note that in mem_cgroup_uncharge_page() the page in question passed the > > > check: > > > > > > [...] > > > if (PageSwapCache(page)) > > > return; > > > [...] > > > > > > and just a couple of instructions later the VM_BUG_ON() within > > > __mem_cgroup_uncharge_common() triggers: > > > > > > [...] > > > if (mem_cgroup_disabled()) > > > return NULL; > > > > > > VM_BUG_ON(PageSwapCache(page)); > > > [...] > > > > > > Which means that another cpu changed the pageflags concurrently. In fact, > > > looking at the dump a different cpu is indeed busy with running kswapd. > > > > Hmm, maybe I am missing something but it really looks like we can race > > here. Reclaim path takes the page lock while zap_pte takes page table > > lock so nothing prevents them from racing here: > > shrink_page_listzap_pte_range > > trylock_pagepte_offset_map_lock > > add_to_swap page_remove_rmap > > /* Page can be still mapped */ > > add_to_swap_cache atomic_add_negative(_mapcount) > > __add_to_swap_cache mem_cgroup_uncharge_page > > (PageSwapCache(page)) && > > return > > SetPageSwapCache > > __mem_cgroup_uncharge_common > > VM_BUG_ON(PageSwapCache(page)) > > > > Maybe not many people run with CONFIG_DEBUG_VM enabled these days so we > > do not this more often (even me testing configs are not consistent in > > that regards and only few have it on). The only thing that changed in > > this area recently is 0c59b89c which made the test VM_BUG_ON
Re: [v3.9-rc8]: kernel BUG at mm/memcontrol.c:3994! (was: Re: [BUG][s390x] mm: system crashed)
On Wed, Apr 24, 2013 at 03:18:51PM +0200, Michal Hocko wrote: > On Wed 24-04-13 12:42:55, Heiko Carstens wrote: > > On Thu, Apr 18, 2013 at 09:13:03AM +0200, Heiko Carstens wrote: > > > Ok, thanks for verifying! I'll look into it; hopefully I can reproduce it > > > here as well. > > > > That seems to be a common code bug. I can easily trigger the VM_BUG_ON() > > below (when I force the system to swap): > > > > [ 48.347963] [ cut here ] > > [ 48.347972] kernel BUG at mm/memcontrol.c:3994! > > [ 48.348012] illegal operation: 0001 [#1] SMP > > [ 48.348015] Modules linked in: > > [ 48.348017] CPU: 1 Not tainted 3.9.0-rc8+ #38 > > [ 48.348020] Process mmap2 (pid: 635, task: 29476100, ksp: > > 2e91b938) > > [ 48.348022] Krnl PSW : 0704f0018000 0026552c > > (__mem_cgroup_uncharge_common+0x2c4/0x33c) > > [ 48.348032]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:3 > > PM:0 EA:3 > >Krnl GPRS: 0008 0009 > > 03d1002a9200 > > [ 48.348039] 006812d8 > > 03ffdf339000 321a6f98 > > [ 48.348043]03fffce11000 > > 0001 03d1002a9200 > > [ 48.348046]0001 00681b88 > > 2e91bc18 2e91bbd0 > > [ 48.348057] Krnl Code: 0026551e: c0e5fffaa2a1brasl > > %r14,1b9a60 > > 00265524: a7f4ff7dbrc > > 15,26541e > > #00265528: a7f40001brc > > 15,26552a > > >0026552c: e3c0b8200124stg > > %r12,6176(%r11) > > 00265532: a7f4ff57brc > > 15,2653e0 > > 00265536: e310b8280104lg > > %r1,6184(%r11) > > 0026553c: a71b0001aghi > > %r1,1 > > 00265540: e310b8280124stg > > %r1,6184(%r11) > > [ 48.348099] Call Trace: > > [ 48.348100] ([<03d1002a91c0>] 0x3d1002a91c0) > > [ 48.348102] [<002404aa>] page_remove_rmap+0xf2/0x16c > > [ 48.348106] [<00232dc8>] unmap_single_vma+0x494/0x7d8 > > [ 48.348107] [<00233ac0>] unmap_vmas+0x50/0x74 > > [ 48.348109] [<002396ec>] unmap_region+0x9c/0x110 > > [ 48.348110] [<0023bd18>] do_munmap+0x284/0x470 > > [ 48.348111] [<0023bf56>] vm_munmap+0x52/0x70 > > [ 48.348113] [<0023cf32>] SyS_munmap+0x3a/0x4c > > [ 48.348114] [<00665e14>] sysc_noemu+0x22/0x28 > > [ 48.348118] [<03fffcf187b2>] 0x3fffcf187b2 > > [ 48.348119] Last Breaking-Event-Address: > > [ 48.348120] [<00265528>] > > __mem_cgroup_uncharge_common+0x2c0/0x33c > > > > Looking at the code, the code flow is: > > > > page_remove_rmap() -> mem_cgroup_uncharge_page() -> > > __mem_cgroup_uncharge_common() > > > > Note that in mem_cgroup_uncharge_page() the page in question passed the > > check: > > > > [...] > > if (PageSwapCache(page)) > > return; > > [...] > > > > and just a couple of instructions later the VM_BUG_ON() within > > __mem_cgroup_uncharge_common() triggers: > > > > [...] > > if (mem_cgroup_disabled()) > > return NULL; > > > > VM_BUG_ON(PageSwapCache(page)); > > [...] > > > > Which means that another cpu changed the pageflags concurrently. In fact, > > looking at the dump a different cpu is indeed busy with running kswapd. > > Hmm, maybe I am missing something but it really looks like we can race > here. Reclaim path takes the page lock while zap_pte takes page table > lock so nothing prevents them from racing here: > shrink_page_list zap_pte_range > trylock_page pte_offset_map_lock > add_to_swap page_remove_rmap > /* Page can be still mapped */ > add_to_swap_cache atomic_add_negative(_mapcount) > __add_to_swap_cache mem_cgroup_uncharge_page > (PageSwapCache(page)) && > return > SetPageSwapCache > __mem_cgroup_uncharge_common > VM_BUG_ON(PageSwapCache(page)) > > Maybe not many people run with CONFIG_DEBUG_VM enabled these days so we > do not this more often (even me testing configs are not consistent in > that regards and only few have it on). The only thing that changed in > this area recently is 0c59b89c which made the test VM_BUG_ON rather then > simple return in 3.6 > And maybe the BUG_ON is too harsh as CgroupUsed should guarantee that > the uncharge will eventually go away. What do you think Johannes? Interesting. We need to ensure there is ordering between setting PG_swapcache and
Re: [v3.9-rc8]: kernel BUG at mm/memcontrol.c:3994! (was: Re: [BUG][s390x] mm: system crashed)
On Wed 24-04-13 12:42:55, Heiko Carstens wrote: > On Thu, Apr 18, 2013 at 09:13:03AM +0200, Heiko Carstens wrote: > > Ok, thanks for verifying! I'll look into it; hopefully I can reproduce it > > here as well. > > That seems to be a common code bug. I can easily trigger the VM_BUG_ON() > below (when I force the system to swap): > > [ 48.347963] [ cut here ] > [ 48.347972] kernel BUG at mm/memcontrol.c:3994! > [ 48.348012] illegal operation: 0001 [#1] SMP > [ 48.348015] Modules linked in: > [ 48.348017] CPU: 1 Not tainted 3.9.0-rc8+ #38 > [ 48.348020] Process mmap2 (pid: 635, task: 29476100, ksp: > 2e91b938) > [ 48.348022] Krnl PSW : 0704f0018000 0026552c > (__mem_cgroup_uncharge_common+0x2c4/0x33c) > [ 48.348032]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 > EA:3 >Krnl GPRS: 0008 0009 03d1002a9200 > > [ 48.348039] 006812d8 03ffdf339000 > 321a6f98 > [ 48.348043]03fffce11000 0001 > 03d1002a9200 > [ 48.348046]0001 00681b88 2e91bc18 > 2e91bbd0 > [ 48.348057] Krnl Code: 0026551e: c0e5fffaa2a1brasl > %r14,1b9a60 > 00265524: a7f4ff7dbrc > 15,26541e > #00265528: a7f40001brc > 15,26552a > >0026552c: e3c0b8200124stg > %r12,6176(%r11) > 00265532: a7f4ff57brc > 15,2653e0 > 00265536: e310b8280104lg > %r1,6184(%r11) > 0026553c: a71b0001aghi%r1,1 > 00265540: e310b8280124stg > %r1,6184(%r11) > [ 48.348099] Call Trace: > [ 48.348100] ([<03d1002a91c0>] 0x3d1002a91c0) > [ 48.348102] [<002404aa>] page_remove_rmap+0xf2/0x16c > [ 48.348106] [<00232dc8>] unmap_single_vma+0x494/0x7d8 > [ 48.348107] [<00233ac0>] unmap_vmas+0x50/0x74 > [ 48.348109] [<002396ec>] unmap_region+0x9c/0x110 > [ 48.348110] [<0023bd18>] do_munmap+0x284/0x470 > [ 48.348111] [<0023bf56>] vm_munmap+0x52/0x70 > [ 48.348113] [<0023cf32>] SyS_munmap+0x3a/0x4c > [ 48.348114] [<00665e14>] sysc_noemu+0x22/0x28 > [ 48.348118] [<03fffcf187b2>] 0x3fffcf187b2 > [ 48.348119] Last Breaking-Event-Address: > [ 48.348120] [<00265528>] __mem_cgroup_uncharge_common+0x2c0/0x33c > > Looking at the code, the code flow is: > > page_remove_rmap() -> mem_cgroup_uncharge_page() -> > __mem_cgroup_uncharge_common() > > Note that in mem_cgroup_uncharge_page() the page in question passed the check: > > [...] > if (PageSwapCache(page)) > return; > [...] > > and just a couple of instructions later the VM_BUG_ON() within > __mem_cgroup_uncharge_common() triggers: > > [...] > if (mem_cgroup_disabled()) > return NULL; > > VM_BUG_ON(PageSwapCache(page)); > [...] > > Which means that another cpu changed the pageflags concurrently. In fact, > looking at the dump a different cpu is indeed busy with running kswapd. Hmm, maybe I am missing something but it really looks like we can race here. Reclaim path takes the page lock while zap_pte takes page table lock so nothing prevents them from racing here: shrink_page_listzap_pte_range trylock_pagepte_offset_map_lock add_to_swap page_remove_rmap /* Page can be still mapped */ add_to_swap_cache atomic_add_negative(_mapcount) __add_to_swap_cache mem_cgroup_uncharge_page (PageSwapCache(page)) && return SetPageSwapCache __mem_cgroup_uncharge_common VM_BUG_ON(PageSwapCache(page)) Maybe not many people run with CONFIG_DEBUG_VM enabled these days so we do not this more often (even me testing configs are not consistent in that regards and only few have it on). The only thing that changed in this area recently is 0c59b89c which made the test VM_BUG_ON rather then simple return in 3.6 And maybe the BUG_ON is too harsh as CgroupUsed should guarantee that the uncharge will eventually go away. What do you think Johannes? > So.. this seems to be somewhat broken. Anyone familiar with memcontrol? -- Michal Hocko SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[v3.9-rc8]: kernel BUG at mm/memcontrol.c:3994! (was: Re: [BUG][s390x] mm: system crashed)
On Thu, Apr 18, 2013 at 09:13:03AM +0200, Heiko Carstens wrote: > Ok, thanks for verifying! I'll look into it; hopefully I can reproduce it > here as well. That seems to be a common code bug. I can easily trigger the VM_BUG_ON() below (when I force the system to swap): [ 48.347963] [ cut here ] [ 48.347972] kernel BUG at mm/memcontrol.c:3994! [ 48.348012] illegal operation: 0001 [#1] SMP [ 48.348015] Modules linked in: [ 48.348017] CPU: 1 Not tainted 3.9.0-rc8+ #38 [ 48.348020] Process mmap2 (pid: 635, task: 29476100, ksp: 2e91b938) [ 48.348022] Krnl PSW : 0704f0018000 0026552c (__mem_cgroup_uncharge_common+0x2c4/0x33c) [ 48.348032]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 EA:3 Krnl GPRS: 0008 0009 03d1002a9200 [ 48.348039] 006812d8 03ffdf339000 321a6f98 [ 48.348043]03fffce11000 0001 03d1002a9200 [ 48.348046]0001 00681b88 2e91bc18 2e91bbd0 [ 48.348057] Krnl Code: 0026551e: c0e5fffaa2a1brasl %r14,1b9a60 00265524: a7f4ff7dbrc 15,26541e #00265528: a7f40001brc 15,26552a >0026552c: e3c0b8200124stg %r12,6176(%r11) 00265532: a7f4ff57brc 15,2653e0 00265536: e310b8280104lg %r1,6184(%r11) 0026553c: a71b0001aghi%r1,1 00265540: e310b8280124stg %r1,6184(%r11) [ 48.348099] Call Trace: [ 48.348100] ([<03d1002a91c0>] 0x3d1002a91c0) [ 48.348102] [<002404aa>] page_remove_rmap+0xf2/0x16c [ 48.348106] [<00232dc8>] unmap_single_vma+0x494/0x7d8 [ 48.348107] [<00233ac0>] unmap_vmas+0x50/0x74 [ 48.348109] [<002396ec>] unmap_region+0x9c/0x110 [ 48.348110] [<0023bd18>] do_munmap+0x284/0x470 [ 48.348111] [<0023bf56>] vm_munmap+0x52/0x70 [ 48.348113] [<0023cf32>] SyS_munmap+0x3a/0x4c [ 48.348114] [<00665e14>] sysc_noemu+0x22/0x28 [ 48.348118] [<03fffcf187b2>] 0x3fffcf187b2 [ 48.348119] Last Breaking-Event-Address: [ 48.348120] [<00265528>] __mem_cgroup_uncharge_common+0x2c0/0x33c Looking at the code, the code flow is: page_remove_rmap() -> mem_cgroup_uncharge_page() -> __mem_cgroup_uncharge_common() Note that in mem_cgroup_uncharge_page() the page in question passed the check: [...] if (PageSwapCache(page)) return; [...] and just a couple of instructions later the VM_BUG_ON() within __mem_cgroup_uncharge_common() triggers: [...] if (mem_cgroup_disabled()) return NULL; VM_BUG_ON(PageSwapCache(page)); [...] Which means that another cpu changed the pageflags concurrently. In fact, looking at the dump a different cpu is indeed busy with running kswapd. So.. this seems to be somewhat broken. Anyone familiar with memcontrol? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[v3.9-rc8]: kernel BUG at mm/memcontrol.c:3994! (was: Re: [BUG][s390x] mm: system crashed)
On Thu, Apr 18, 2013 at 09:13:03AM +0200, Heiko Carstens wrote: Ok, thanks for verifying! I'll look into it; hopefully I can reproduce it here as well. That seems to be a common code bug. I can easily trigger the VM_BUG_ON() below (when I force the system to swap): [ 48.347963] [ cut here ] [ 48.347972] kernel BUG at mm/memcontrol.c:3994! [ 48.348012] illegal operation: 0001 [#1] SMP [ 48.348015] Modules linked in: [ 48.348017] CPU: 1 Not tainted 3.9.0-rc8+ #38 [ 48.348020] Process mmap2 (pid: 635, task: 29476100, ksp: 2e91b938) [ 48.348022] Krnl PSW : 0704f0018000 0026552c (__mem_cgroup_uncharge_common+0x2c4/0x33c) [ 48.348032]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 EA:3 Krnl GPRS: 0008 0009 03d1002a9200 [ 48.348039] 006812d8 03ffdf339000 321a6f98 [ 48.348043]03fffce11000 0001 03d1002a9200 [ 48.348046]0001 00681b88 2e91bc18 2e91bbd0 [ 48.348057] Krnl Code: 0026551e: c0e5fffaa2a1brasl %r14,1b9a60 00265524: a7f4ff7dbrc 15,26541e #00265528: a7f40001brc 15,26552a 0026552c: e3c0b8200124stg %r12,6176(%r11) 00265532: a7f4ff57brc 15,2653e0 00265536: e310b8280104lg %r1,6184(%r11) 0026553c: a71b0001aghi%r1,1 00265540: e310b8280124stg %r1,6184(%r11) [ 48.348099] Call Trace: [ 48.348100] ([03d1002a91c0] 0x3d1002a91c0) [ 48.348102] [002404aa] page_remove_rmap+0xf2/0x16c [ 48.348106] [00232dc8] unmap_single_vma+0x494/0x7d8 [ 48.348107] [00233ac0] unmap_vmas+0x50/0x74 [ 48.348109] [002396ec] unmap_region+0x9c/0x110 [ 48.348110] [0023bd18] do_munmap+0x284/0x470 [ 48.348111] [0023bf56] vm_munmap+0x52/0x70 [ 48.348113] [0023cf32] SyS_munmap+0x3a/0x4c [ 48.348114] [00665e14] sysc_noemu+0x22/0x28 [ 48.348118] [03fffcf187b2] 0x3fffcf187b2 [ 48.348119] Last Breaking-Event-Address: [ 48.348120] [00265528] __mem_cgroup_uncharge_common+0x2c0/0x33c Looking at the code, the code flow is: page_remove_rmap() - mem_cgroup_uncharge_page() - __mem_cgroup_uncharge_common() Note that in mem_cgroup_uncharge_page() the page in question passed the check: [...] if (PageSwapCache(page)) return; [...] and just a couple of instructions later the VM_BUG_ON() within __mem_cgroup_uncharge_common() triggers: [...] if (mem_cgroup_disabled()) return NULL; VM_BUG_ON(PageSwapCache(page)); [...] Which means that another cpu changed the pageflags concurrently. In fact, looking at the dump a different cpu is indeed busy with running kswapd. So.. this seems to be somewhat broken. Anyone familiar with memcontrol? -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [v3.9-rc8]: kernel BUG at mm/memcontrol.c:3994! (was: Re: [BUG][s390x] mm: system crashed)
On Wed 24-04-13 12:42:55, Heiko Carstens wrote: On Thu, Apr 18, 2013 at 09:13:03AM +0200, Heiko Carstens wrote: Ok, thanks for verifying! I'll look into it; hopefully I can reproduce it here as well. That seems to be a common code bug. I can easily trigger the VM_BUG_ON() below (when I force the system to swap): [ 48.347963] [ cut here ] [ 48.347972] kernel BUG at mm/memcontrol.c:3994! [ 48.348012] illegal operation: 0001 [#1] SMP [ 48.348015] Modules linked in: [ 48.348017] CPU: 1 Not tainted 3.9.0-rc8+ #38 [ 48.348020] Process mmap2 (pid: 635, task: 29476100, ksp: 2e91b938) [ 48.348022] Krnl PSW : 0704f0018000 0026552c (__mem_cgroup_uncharge_common+0x2c4/0x33c) [ 48.348032]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 EA:3 Krnl GPRS: 0008 0009 03d1002a9200 [ 48.348039] 006812d8 03ffdf339000 321a6f98 [ 48.348043]03fffce11000 0001 03d1002a9200 [ 48.348046]0001 00681b88 2e91bc18 2e91bbd0 [ 48.348057] Krnl Code: 0026551e: c0e5fffaa2a1brasl %r14,1b9a60 00265524: a7f4ff7dbrc 15,26541e #00265528: a7f40001brc 15,26552a 0026552c: e3c0b8200124stg %r12,6176(%r11) 00265532: a7f4ff57brc 15,2653e0 00265536: e310b8280104lg %r1,6184(%r11) 0026553c: a71b0001aghi%r1,1 00265540: e310b8280124stg %r1,6184(%r11) [ 48.348099] Call Trace: [ 48.348100] ([03d1002a91c0] 0x3d1002a91c0) [ 48.348102] [002404aa] page_remove_rmap+0xf2/0x16c [ 48.348106] [00232dc8] unmap_single_vma+0x494/0x7d8 [ 48.348107] [00233ac0] unmap_vmas+0x50/0x74 [ 48.348109] [002396ec] unmap_region+0x9c/0x110 [ 48.348110] [0023bd18] do_munmap+0x284/0x470 [ 48.348111] [0023bf56] vm_munmap+0x52/0x70 [ 48.348113] [0023cf32] SyS_munmap+0x3a/0x4c [ 48.348114] [00665e14] sysc_noemu+0x22/0x28 [ 48.348118] [03fffcf187b2] 0x3fffcf187b2 [ 48.348119] Last Breaking-Event-Address: [ 48.348120] [00265528] __mem_cgroup_uncharge_common+0x2c0/0x33c Looking at the code, the code flow is: page_remove_rmap() - mem_cgroup_uncharge_page() - __mem_cgroup_uncharge_common() Note that in mem_cgroup_uncharge_page() the page in question passed the check: [...] if (PageSwapCache(page)) return; [...] and just a couple of instructions later the VM_BUG_ON() within __mem_cgroup_uncharge_common() triggers: [...] if (mem_cgroup_disabled()) return NULL; VM_BUG_ON(PageSwapCache(page)); [...] Which means that another cpu changed the pageflags concurrently. In fact, looking at the dump a different cpu is indeed busy with running kswapd. Hmm, maybe I am missing something but it really looks like we can race here. Reclaim path takes the page lock while zap_pte takes page table lock so nothing prevents them from racing here: shrink_page_listzap_pte_range trylock_pagepte_offset_map_lock add_to_swap page_remove_rmap /* Page can be still mapped */ add_to_swap_cache atomic_add_negative(_mapcount) __add_to_swap_cache mem_cgroup_uncharge_page (PageSwapCache(page)) return SetPageSwapCache __mem_cgroup_uncharge_common VM_BUG_ON(PageSwapCache(page)) Maybe not many people run with CONFIG_DEBUG_VM enabled these days so we do not this more often (even me testing configs are not consistent in that regards and only few have it on). The only thing that changed in this area recently is 0c59b89c which made the test VM_BUG_ON rather then simple return in 3.6 And maybe the BUG_ON is too harsh as CgroupUsed should guarantee that the uncharge will eventually go away. What do you think Johannes? So.. this seems to be somewhat broken. Anyone familiar with memcontrol? -- Michal Hocko SUSE Labs -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [v3.9-rc8]: kernel BUG at mm/memcontrol.c:3994! (was: Re: [BUG][s390x] mm: system crashed)
On Wed, Apr 24, 2013 at 03:18:51PM +0200, Michal Hocko wrote: On Wed 24-04-13 12:42:55, Heiko Carstens wrote: On Thu, Apr 18, 2013 at 09:13:03AM +0200, Heiko Carstens wrote: Ok, thanks for verifying! I'll look into it; hopefully I can reproduce it here as well. That seems to be a common code bug. I can easily trigger the VM_BUG_ON() below (when I force the system to swap): [ 48.347963] [ cut here ] [ 48.347972] kernel BUG at mm/memcontrol.c:3994! [ 48.348012] illegal operation: 0001 [#1] SMP [ 48.348015] Modules linked in: [ 48.348017] CPU: 1 Not tainted 3.9.0-rc8+ #38 [ 48.348020] Process mmap2 (pid: 635, task: 29476100, ksp: 2e91b938) [ 48.348022] Krnl PSW : 0704f0018000 0026552c (__mem_cgroup_uncharge_common+0x2c4/0x33c) [ 48.348032]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 EA:3 Krnl GPRS: 0008 0009 03d1002a9200 [ 48.348039] 006812d8 03ffdf339000 321a6f98 [ 48.348043]03fffce11000 0001 03d1002a9200 [ 48.348046]0001 00681b88 2e91bc18 2e91bbd0 [ 48.348057] Krnl Code: 0026551e: c0e5fffaa2a1brasl %r14,1b9a60 00265524: a7f4ff7dbrc 15,26541e #00265528: a7f40001brc 15,26552a 0026552c: e3c0b8200124stg %r12,6176(%r11) 00265532: a7f4ff57brc 15,2653e0 00265536: e310b8280104lg %r1,6184(%r11) 0026553c: a71b0001aghi %r1,1 00265540: e310b8280124stg %r1,6184(%r11) [ 48.348099] Call Trace: [ 48.348100] ([03d1002a91c0] 0x3d1002a91c0) [ 48.348102] [002404aa] page_remove_rmap+0xf2/0x16c [ 48.348106] [00232dc8] unmap_single_vma+0x494/0x7d8 [ 48.348107] [00233ac0] unmap_vmas+0x50/0x74 [ 48.348109] [002396ec] unmap_region+0x9c/0x110 [ 48.348110] [0023bd18] do_munmap+0x284/0x470 [ 48.348111] [0023bf56] vm_munmap+0x52/0x70 [ 48.348113] [0023cf32] SyS_munmap+0x3a/0x4c [ 48.348114] [00665e14] sysc_noemu+0x22/0x28 [ 48.348118] [03fffcf187b2] 0x3fffcf187b2 [ 48.348119] Last Breaking-Event-Address: [ 48.348120] [00265528] __mem_cgroup_uncharge_common+0x2c0/0x33c Looking at the code, the code flow is: page_remove_rmap() - mem_cgroup_uncharge_page() - __mem_cgroup_uncharge_common() Note that in mem_cgroup_uncharge_page() the page in question passed the check: [...] if (PageSwapCache(page)) return; [...] and just a couple of instructions later the VM_BUG_ON() within __mem_cgroup_uncharge_common() triggers: [...] if (mem_cgroup_disabled()) return NULL; VM_BUG_ON(PageSwapCache(page)); [...] Which means that another cpu changed the pageflags concurrently. In fact, looking at the dump a different cpu is indeed busy with running kswapd. Hmm, maybe I am missing something but it really looks like we can race here. Reclaim path takes the page lock while zap_pte takes page table lock so nothing prevents them from racing here: shrink_page_list zap_pte_range trylock_page pte_offset_map_lock add_to_swap page_remove_rmap /* Page can be still mapped */ add_to_swap_cache atomic_add_negative(_mapcount) __add_to_swap_cache mem_cgroup_uncharge_page (PageSwapCache(page)) return SetPageSwapCache __mem_cgroup_uncharge_common VM_BUG_ON(PageSwapCache(page)) Maybe not many people run with CONFIG_DEBUG_VM enabled these days so we do not this more often (even me testing configs are not consistent in that regards and only few have it on). The only thing that changed in this area recently is 0c59b89c which made the test VM_BUG_ON rather then simple return in 3.6 And maybe the BUG_ON is too harsh as CgroupUsed should guarantee that the uncharge will eventually go away. What do you think Johannes? Interesting. We need to ensure there is ordering between setting PG_swapcache and installing swap entries because I think we are the only ones looking at PG_swapcache without the page lock held. So we don't have a safe way to check for PG_swapcache but if we get it wrong, we may steal an uncharge that
Re: [v3.9-rc8]: kernel BUG at mm/memcontrol.c:3994! (was: Re: [BUG][s390x] mm: system crashed)
On Wed, 24 Apr 2013, Johannes Weiner wrote: On Wed, Apr 24, 2013 at 03:18:51PM +0200, Michal Hocko wrote: On Wed 24-04-13 12:42:55, Heiko Carstens wrote: On Thu, Apr 18, 2013 at 09:13:03AM +0200, Heiko Carstens wrote: Ok, thanks for verifying! I'll look into it; hopefully I can reproduce it here as well. That seems to be a common code bug. I can easily trigger the VM_BUG_ON() below (when I force the system to swap): [ 48.347963] [ cut here ] [ 48.347972] kernel BUG at mm/memcontrol.c:3994! [ 48.348012] illegal operation: 0001 [#1] SMP [ 48.348015] Modules linked in: [ 48.348017] CPU: 1 Not tainted 3.9.0-rc8+ #38 [ 48.348020] Process mmap2 (pid: 635, task: 29476100, ksp: 2e91b938) [ 48.348022] Krnl PSW : 0704f0018000 0026552c (__mem_cgroup_uncharge_common+0x2c4/0x33c) [ 48.348032]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 EA:3 Krnl GPRS: 0008 0009 03d1002a9200 [ 48.348039] 006812d8 03ffdf339000 321a6f98 [ 48.348043]03fffce11000 0001 03d1002a9200 [ 48.348046]0001 00681b88 2e91bc18 2e91bbd0 [ 48.348057] Krnl Code: 0026551e: c0e5fffaa2a1brasl %r14,1b9a60 00265524: a7f4ff7dbrc 15,26541e #00265528: a7f40001brc 15,26552a 0026552c: e3c0b8200124stg %r12,6176(%r11) 00265532: a7f4ff57brc 15,2653e0 00265536: e310b8280104lg %r1,6184(%r11) 0026553c: a71b0001aghi %r1,1 00265540: e310b8280124stg %r1,6184(%r11) [ 48.348099] Call Trace: [ 48.348100] ([03d1002a91c0] 0x3d1002a91c0) [ 48.348102] [002404aa] page_remove_rmap+0xf2/0x16c [ 48.348106] [00232dc8] unmap_single_vma+0x494/0x7d8 [ 48.348107] [00233ac0] unmap_vmas+0x50/0x74 [ 48.348109] [002396ec] unmap_region+0x9c/0x110 [ 48.348110] [0023bd18] do_munmap+0x284/0x470 [ 48.348111] [0023bf56] vm_munmap+0x52/0x70 [ 48.348113] [0023cf32] SyS_munmap+0x3a/0x4c [ 48.348114] [00665e14] sysc_noemu+0x22/0x28 [ 48.348118] [03fffcf187b2] 0x3fffcf187b2 [ 48.348119] Last Breaking-Event-Address: [ 48.348120] [00265528] __mem_cgroup_uncharge_common+0x2c0/0x33c Looking at the code, the code flow is: page_remove_rmap() - mem_cgroup_uncharge_page() - __mem_cgroup_uncharge_common() Note that in mem_cgroup_uncharge_page() the page in question passed the check: [...] if (PageSwapCache(page)) return; [...] and just a couple of instructions later the VM_BUG_ON() within __mem_cgroup_uncharge_common() triggers: [...] if (mem_cgroup_disabled()) return NULL; VM_BUG_ON(PageSwapCache(page)); [...] Which means that another cpu changed the pageflags concurrently. In fact, looking at the dump a different cpu is indeed busy with running kswapd. Hmm, maybe I am missing something but it really looks like we can race here. Reclaim path takes the page lock while zap_pte takes page table lock so nothing prevents them from racing here: shrink_page_listzap_pte_range trylock_pagepte_offset_map_lock add_to_swap page_remove_rmap /* Page can be still mapped */ add_to_swap_cache atomic_add_negative(_mapcount) __add_to_swap_cache mem_cgroup_uncharge_page (PageSwapCache(page)) return SetPageSwapCache __mem_cgroup_uncharge_common VM_BUG_ON(PageSwapCache(page)) Maybe not many people run with CONFIG_DEBUG_VM enabled these days so we do not this more often (even me testing configs are not consistent in that regards and only few have it on). The only thing that changed in this area recently is 0c59b89c which made the test VM_BUG_ON rather then simple return in 3.6 And maybe the BUG_ON is too harsh as CgroupUsed should guarantee that the uncharge will eventually go away. What do you think Johannes? Interesting. We need to ensure there is ordering between setting PG_swapcache and installing swap entries because I think we are the only ones
Re: [BUG][s390x] mm: system crashed
On Thu, Apr 18, 2013 at 02:27:45AM -0400, Zhouping Liu wrote: > Hello Heiko, > > If you have some time, could you please repeat your test with the kernel > > command line option " user_mode=home "? > > I tested the system with the kernel parameter, but the issue still appeared, > I just to say it takes longer time to reproduce the issue than the before. > > > > > As far as I can tell there was only one s390 patch merged that was > > mmap related: 486c0a0bc80d370471b21662bf03f04fbb37cdc6 "s390/mm: Fix crst > > upgrade of mmap with MAP_FIXED". > > also I tested the revert commit, unluckily, the same issue as the before. Ok, thanks for verifying! I'll look into it; hopefully I can reproduce it here as well. Thanks! -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG][s390x] mm: system crashed
Hello Heiko, - Original Message - > From: "Heiko Carstens" > To: "Zhouping Liu" > Cc: linux...@kvack.org, "LKML" , "caiqian" > , "Caspar Zhang" > , "Martin Schwidefsky" > Sent: Tuesday, April 16, 2013 3:50:47 PM > Subject: Re: [BUG][s390x] mm: system crashed > > On Mon, Apr 15, 2013 at 02:16:55PM +0800, Zhouping Liu wrote: > > On 04/15/2013 01:56 PM, Heiko Carstens wrote: > > >On Sun, Apr 14, 2013 at 11:28:40PM -0400, Zhouping Liu wrote: > > >>� 16109.346170¨ Call Trace: > > >>� 16109.346179¨ (� <00100920>¨ show_trace+0x128/0x12c) > > >>� 16109.346195¨ � <001cd320>¨ rcu_check_callbacks+0x458/0xccc > > >>� 16109.346209¨ � <00140f2e>¨ update_process_times+0x4a/0x74 > > >>� 16109.346222¨ � <00199452>¨ > > >>tick_sched_handle.isra.12+0x5e/0x70 > > >>� 16109.346235¨ � <001995aa>¨ tick_sched_timer+0x6a/0x98 > > >>� 16109.346247¨ � <0015c1ea>¨ __run_hrtimer+0x8e/0x200 > > >>� 16109.346381¨ � <0015d1b2>¨ hrtimer_interrupt+0x212/0x2b0 > > >>� 16109.346385¨ � <001040f6>¨ clock_comparator_work+0x4a/0x54 > > >>� 16109.346390¨ � <0010d658>¨ do_extint+0x158/0x15c > > >>� 16109.346396¨ � <0062aa24>¨ ext_skip+0x38/0x3c > > >>� 16109.346404¨ � <001153c8>¨ smp_yield_cpu+0x44/0x48 > > >>� 16109.346412¨ (� <03d10051aec0>¨ 0x3d10051aec0) > > >>� 16109.346457¨ � <0024206a>¨ __page_check_address+0x16a/0x170 > > >>� 16109.346466¨ � <002423a2>¨ page_referenced_one+0x3e/0xa0 > > >>� 16109.346501¨ � <0024427c>¨ page_referenced+0x32c/0x41c > > >>� 16109.346510¨ � <0021b1dc>¨ shrink_page_list+0x380/0xb9c > > >>� 16109.346521¨ � <0021c0a6>¨ shrink_inactive_list+0x1c6/0x56c > > >>� 16109.346532¨ � <0021c69e>¨ shrink_lruvec+0x252/0x56c > > >>� 16109.346542¨ � <0021ca44>¨ shrink_zone+0x8c/0x1bc > > >>� 16109.346553¨ � <0021d080>¨ balance_pgdat+0x50c/0x658 > > >>� 16109.346564¨ � <0021d318>¨ kswapd+0x14c/0x470 > > >>� 16109.346576¨ � <00158292>¨ kthread+0xda/0xe4 > > >>� 16109.346656¨ � <0062a5de>¨ kernel_thread_starter+0x6/0xc > > >>� 16109.346682¨ � <0062a5d8>¨ kernel_thread_starter+0x0/0xc > > >>[-- MARK -- Fri Apr 12 06:15:00 2013] > > >>� 16289.386061¨ INFO: rcu_sched self-detected stall on CPU { 0} (t=42010 > > >>jiffies > > >> g=89766 c=89765 q=10627) > > >Did the system really crash or did you just see the rcu related > > >warning(s)? > > > > I just check it again, actually at first the system didn't really > > crash, but the system is very slow in response. > > and the reproducer process can't be killed, after I did some common > > actions such as 'ls' 'vim' etc, the system > > seemed to be really crashed, no any response. > > > > also in the previous testing, I can remember that the system would > > be no any response for a long time, just only > > repeatedly print out the such above 'Call Trace' into console. > > Ok, thanks. > Just a couple of more questions: did you see this also on other archs, or > just > s390 (if you tried other platforms at all). > > If you have some time, could you please repeat your test with the kernel > command line option " user_mode=home "? I tested the system with the kernel parameter, but the issue still appeared, I just to say it takes longer time to reproduce the issue than the before. > > As far as I can tell there was only one s390 patch merged that was > mmap related: 486c0a0bc80d370471b21662bf03f04fbb37cdc6 "s390/mm: Fix crst > upgrade of mmap with MAP_FIXED". also I tested the revert commit, unluckily, the same issue as the before. -- Thanks, Zhouping -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG][s390x] mm: system crashed
Hello Heiko, - Original Message - From: Heiko Carstens heiko.carst...@de.ibm.com To: Zhouping Liu z...@redhat.com Cc: linux...@kvack.org, LKML linux-kernel@vger.kernel.org, caiqian caiq...@redhat.com, Caspar Zhang czh...@redhat.com, Martin Schwidefsky schwidef...@de.ibm.com Sent: Tuesday, April 16, 2013 3:50:47 PM Subject: Re: [BUG][s390x] mm: system crashed On Mon, Apr 15, 2013 at 02:16:55PM +0800, Zhouping Liu wrote: On 04/15/2013 01:56 PM, Heiko Carstens wrote: On Sun, Apr 14, 2013 at 11:28:40PM -0400, Zhouping Liu wrote: � 16109.346170¨ Call Trace: � 16109.346179¨ (� 00100920¨ show_trace+0x128/0x12c) � 16109.346195¨ � 001cd320¨ rcu_check_callbacks+0x458/0xccc � 16109.346209¨ � 00140f2e¨ update_process_times+0x4a/0x74 � 16109.346222¨ � 00199452¨ tick_sched_handle.isra.12+0x5e/0x70 � 16109.346235¨ � 001995aa¨ tick_sched_timer+0x6a/0x98 � 16109.346247¨ � 0015c1ea¨ __run_hrtimer+0x8e/0x200 � 16109.346381¨ � 0015d1b2¨ hrtimer_interrupt+0x212/0x2b0 � 16109.346385¨ � 001040f6¨ clock_comparator_work+0x4a/0x54 � 16109.346390¨ � 0010d658¨ do_extint+0x158/0x15c � 16109.346396¨ � 0062aa24¨ ext_skip+0x38/0x3c � 16109.346404¨ � 001153c8¨ smp_yield_cpu+0x44/0x48 � 16109.346412¨ (� 03d10051aec0¨ 0x3d10051aec0) � 16109.346457¨ � 0024206a¨ __page_check_address+0x16a/0x170 � 16109.346466¨ � 002423a2¨ page_referenced_one+0x3e/0xa0 � 16109.346501¨ � 0024427c¨ page_referenced+0x32c/0x41c � 16109.346510¨ � 0021b1dc¨ shrink_page_list+0x380/0xb9c � 16109.346521¨ � 0021c0a6¨ shrink_inactive_list+0x1c6/0x56c � 16109.346532¨ � 0021c69e¨ shrink_lruvec+0x252/0x56c � 16109.346542¨ � 0021ca44¨ shrink_zone+0x8c/0x1bc � 16109.346553¨ � 0021d080¨ balance_pgdat+0x50c/0x658 � 16109.346564¨ � 0021d318¨ kswapd+0x14c/0x470 � 16109.346576¨ � 00158292¨ kthread+0xda/0xe4 � 16109.346656¨ � 0062a5de¨ kernel_thread_starter+0x6/0xc � 16109.346682¨ � 0062a5d8¨ kernel_thread_starter+0x0/0xc [-- MARK -- Fri Apr 12 06:15:00 2013] � 16289.386061¨ INFO: rcu_sched self-detected stall on CPU { 0} (t=42010 jiffies g=89766 c=89765 q=10627) Did the system really crash or did you just see the rcu related warning(s)? I just check it again, actually at first the system didn't really crash, but the system is very slow in response. and the reproducer process can't be killed, after I did some common actions such as 'ls' 'vim' etc, the system seemed to be really crashed, no any response. also in the previous testing, I can remember that the system would be no any response for a long time, just only repeatedly print out the such above 'Call Trace' into console. Ok, thanks. Just a couple of more questions: did you see this also on other archs, or just s390 (if you tried other platforms at all). If you have some time, could you please repeat your test with the kernel command line option user_mode=home ? I tested the system with the kernel parameter, but the issue still appeared, I just to say it takes longer time to reproduce the issue than the before. As far as I can tell there was only one s390 patch merged that was mmap related: 486c0a0bc80d370471b21662bf03f04fbb37cdc6 s390/mm: Fix crst upgrade of mmap with MAP_FIXED. also I tested the revert commit, unluckily, the same issue as the before. -- Thanks, Zhouping -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG][s390x] mm: system crashed
On Thu, Apr 18, 2013 at 02:27:45AM -0400, Zhouping Liu wrote: Hello Heiko, If you have some time, could you please repeat your test with the kernel command line option user_mode=home ? I tested the system with the kernel parameter, but the issue still appeared, I just to say it takes longer time to reproduce the issue than the before. As far as I can tell there was only one s390 patch merged that was mmap related: 486c0a0bc80d370471b21662bf03f04fbb37cdc6 s390/mm: Fix crst upgrade of mmap with MAP_FIXED. also I tested the revert commit, unluckily, the same issue as the before. Ok, thanks for verifying! I'll look into it; hopefully I can reproduce it here as well. Thanks! -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG][s390x] mm: system crashed
On Tue, Apr 16, 2013 at 03:56:59PM +0800, Simon Jeons wrote: > Hi Heiko, > >If you have some time, could you please repeat your test with the kernel > >command line option " user_mode=home "? > > What's the meaning of this command line? I can't find it in > Documentation/kernel-parameters.txt/ It switches the architectural address space where kernel and user space reside in. We only recently switched the default address space for user space from home space to primary space, since that's needed for kvm. The user space runs in home space mode will be removed in the future; we keep it currently as fallback, just in case something breaks. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG][s390x] mm: system crashed
Hi Heiko, On 04/16/2013 03:50 PM, Heiko Carstens wrote: On Mon, Apr 15, 2013 at 02:16:55PM +0800, Zhouping Liu wrote: On 04/15/2013 01:56 PM, Heiko Carstens wrote: On Sun, Apr 14, 2013 at 11:28:40PM -0400, Zhouping Liu wrote: � 16109.346170¨ Call Trace: � 16109.346179¨ (� <00100920>¨ show_trace+0x128/0x12c) � 16109.346195¨ � <001cd320>¨ rcu_check_callbacks+0x458/0xccc � 16109.346209¨ � <00140f2e>¨ update_process_times+0x4a/0x74 � 16109.346222¨ � <00199452>¨ tick_sched_handle.isra.12+0x5e/0x70 � 16109.346235¨ � <001995aa>¨ tick_sched_timer+0x6a/0x98 � 16109.346247¨ � <0015c1ea>¨ __run_hrtimer+0x8e/0x200 � 16109.346381¨ � <0015d1b2>¨ hrtimer_interrupt+0x212/0x2b0 � 16109.346385¨ � <001040f6>¨ clock_comparator_work+0x4a/0x54 � 16109.346390¨ � <0010d658>¨ do_extint+0x158/0x15c � 16109.346396¨ � <0062aa24>¨ ext_skip+0x38/0x3c � 16109.346404¨ � <001153c8>¨ smp_yield_cpu+0x44/0x48 � 16109.346412¨ (� <03d10051aec0>¨ 0x3d10051aec0) � 16109.346457¨ � <0024206a>¨ __page_check_address+0x16a/0x170 � 16109.346466¨ � <002423a2>¨ page_referenced_one+0x3e/0xa0 � 16109.346501¨ � <0024427c>¨ page_referenced+0x32c/0x41c � 16109.346510¨ � <0021b1dc>¨ shrink_page_list+0x380/0xb9c � 16109.346521¨ � <0021c0a6>¨ shrink_inactive_list+0x1c6/0x56c � 16109.346532¨ � <0021c69e>¨ shrink_lruvec+0x252/0x56c � 16109.346542¨ � <0021ca44>¨ shrink_zone+0x8c/0x1bc � 16109.346553¨ � <0021d080>¨ balance_pgdat+0x50c/0x658 � 16109.346564¨ � <0021d318>¨ kswapd+0x14c/0x470 � 16109.346576¨ � <00158292>¨ kthread+0xda/0xe4 � 16109.346656¨ � <0062a5de>¨ kernel_thread_starter+0x6/0xc � 16109.346682¨ � <0062a5d8>¨ kernel_thread_starter+0x0/0xc [-- MARK -- Fri Apr 12 06:15:00 2013] � 16289.386061¨ INFO: rcu_sched self-detected stall on CPU { 0} (t=42010 jiffies g=89766 c=89765 q=10627) Did the system really crash or did you just see the rcu related warning(s)? I just check it again, actually at first the system didn't really crash, but the system is very slow in response. and the reproducer process can't be killed, after I did some common actions such as 'ls' 'vim' etc, the system seemed to be really crashed, no any response. also in the previous testing, I can remember that the system would be no any response for a long time, just only repeatedly print out the such above 'Call Trace' into console. Ok, thanks. Just a couple of more questions: did you see this also on other archs, or just s390 (if you tried other platforms at all). If you have some time, could you please repeat your test with the kernel command line option " user_mode=home "? What's the meaning of this command line? I can't find it in Documentation/kernel-parameters.txt/ As far as I can tell there was only one s390 patch merged that was mmap related: 486c0a0bc80d370471b21662bf03f04fbb37cdc6 "s390/mm: Fix crst upgrade of mmap with MAP_FIXED". Even though I don't think it explains the bug you've seen it might be worth to try to revert it. And at last, can you share your kernel config? Thanks, Heiko -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majord...@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: mailto:"d...@kvack.org;> em...@kvack.org -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG][s390x] mm: system crashed
On Mon, Apr 15, 2013 at 02:16:55PM +0800, Zhouping Liu wrote: > On 04/15/2013 01:56 PM, Heiko Carstens wrote: > >On Sun, Apr 14, 2013 at 11:28:40PM -0400, Zhouping Liu wrote: > >>� 16109.346170¨ Call Trace: > >>� 16109.346179¨ (� <00100920>¨ show_trace+0x128/0x12c) > >>� 16109.346195¨ � <001cd320>¨ rcu_check_callbacks+0x458/0xccc > >>� 16109.346209¨ � <00140f2e>¨ update_process_times+0x4a/0x74 > >>� 16109.346222¨ � <00199452>¨ tick_sched_handle.isra.12+0x5e/0x70 > >>� 16109.346235¨ � <001995aa>¨ tick_sched_timer+0x6a/0x98 > >>� 16109.346247¨ � <0015c1ea>¨ __run_hrtimer+0x8e/0x200 > >>� 16109.346381¨ � <0015d1b2>¨ hrtimer_interrupt+0x212/0x2b0 > >>� 16109.346385¨ � <001040f6>¨ clock_comparator_work+0x4a/0x54 > >>� 16109.346390¨ � <0010d658>¨ do_extint+0x158/0x15c > >>� 16109.346396¨ � <0062aa24>¨ ext_skip+0x38/0x3c > >>� 16109.346404¨ � <001153c8>¨ smp_yield_cpu+0x44/0x48 > >>� 16109.346412¨ (� <03d10051aec0>¨ 0x3d10051aec0) > >>� 16109.346457¨ � <0024206a>¨ __page_check_address+0x16a/0x170 > >>� 16109.346466¨ � <002423a2>¨ page_referenced_one+0x3e/0xa0 > >>� 16109.346501¨ � <0024427c>¨ page_referenced+0x32c/0x41c > >>� 16109.346510¨ � <0021b1dc>¨ shrink_page_list+0x380/0xb9c > >>� 16109.346521¨ � <0021c0a6>¨ shrink_inactive_list+0x1c6/0x56c > >>� 16109.346532¨ � <0021c69e>¨ shrink_lruvec+0x252/0x56c > >>� 16109.346542¨ � <0021ca44>¨ shrink_zone+0x8c/0x1bc > >>� 16109.346553¨ � <0021d080>¨ balance_pgdat+0x50c/0x658 > >>� 16109.346564¨ � <0021d318>¨ kswapd+0x14c/0x470 > >>� 16109.346576¨ � <00158292>¨ kthread+0xda/0xe4 > >>� 16109.346656¨ � <0062a5de>¨ kernel_thread_starter+0x6/0xc > >>� 16109.346682¨ � <0062a5d8>¨ kernel_thread_starter+0x0/0xc > >>[-- MARK -- Fri Apr 12 06:15:00 2013] > >>� 16289.386061¨ INFO: rcu_sched self-detected stall on CPU { 0} (t=42010 > >>jiffies > >> g=89766 c=89765 q=10627) > >Did the system really crash or did you just see the rcu related warning(s)? > > I just check it again, actually at first the system didn't really > crash, but the system is very slow in response. > and the reproducer process can't be killed, after I did some common > actions such as 'ls' 'vim' etc, the system > seemed to be really crashed, no any response. > > also in the previous testing, I can remember that the system would > be no any response for a long time, just only > repeatedly print out the such above 'Call Trace' into console. Ok, thanks. Just a couple of more questions: did you see this also on other archs, or just s390 (if you tried other platforms at all). If you have some time, could you please repeat your test with the kernel command line option " user_mode=home "? As far as I can tell there was only one s390 patch merged that was mmap related: 486c0a0bc80d370471b21662bf03f04fbb37cdc6 "s390/mm: Fix crst upgrade of mmap with MAP_FIXED". Even though I don't think it explains the bug you've seen it might be worth to try to revert it. And at last, can you share your kernel config? Thanks, Heiko -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG][s390x] mm: system crashed
On Mon, Apr 15, 2013 at 02:16:55PM +0800, Zhouping Liu wrote: On 04/15/2013 01:56 PM, Heiko Carstens wrote: On Sun, Apr 14, 2013 at 11:28:40PM -0400, Zhouping Liu wrote: � 16109.346170¨ Call Trace: � 16109.346179¨ (� 00100920¨ show_trace+0x128/0x12c) � 16109.346195¨ � 001cd320¨ rcu_check_callbacks+0x458/0xccc � 16109.346209¨ � 00140f2e¨ update_process_times+0x4a/0x74 � 16109.346222¨ � 00199452¨ tick_sched_handle.isra.12+0x5e/0x70 � 16109.346235¨ � 001995aa¨ tick_sched_timer+0x6a/0x98 � 16109.346247¨ � 0015c1ea¨ __run_hrtimer+0x8e/0x200 � 16109.346381¨ � 0015d1b2¨ hrtimer_interrupt+0x212/0x2b0 � 16109.346385¨ � 001040f6¨ clock_comparator_work+0x4a/0x54 � 16109.346390¨ � 0010d658¨ do_extint+0x158/0x15c � 16109.346396¨ � 0062aa24¨ ext_skip+0x38/0x3c � 16109.346404¨ � 001153c8¨ smp_yield_cpu+0x44/0x48 � 16109.346412¨ (� 03d10051aec0¨ 0x3d10051aec0) � 16109.346457¨ � 0024206a¨ __page_check_address+0x16a/0x170 � 16109.346466¨ � 002423a2¨ page_referenced_one+0x3e/0xa0 � 16109.346501¨ � 0024427c¨ page_referenced+0x32c/0x41c � 16109.346510¨ � 0021b1dc¨ shrink_page_list+0x380/0xb9c � 16109.346521¨ � 0021c0a6¨ shrink_inactive_list+0x1c6/0x56c � 16109.346532¨ � 0021c69e¨ shrink_lruvec+0x252/0x56c � 16109.346542¨ � 0021ca44¨ shrink_zone+0x8c/0x1bc � 16109.346553¨ � 0021d080¨ balance_pgdat+0x50c/0x658 � 16109.346564¨ � 0021d318¨ kswapd+0x14c/0x470 � 16109.346576¨ � 00158292¨ kthread+0xda/0xe4 � 16109.346656¨ � 0062a5de¨ kernel_thread_starter+0x6/0xc � 16109.346682¨ � 0062a5d8¨ kernel_thread_starter+0x0/0xc [-- MARK -- Fri Apr 12 06:15:00 2013] � 16289.386061¨ INFO: rcu_sched self-detected stall on CPU { 0} (t=42010 jiffies g=89766 c=89765 q=10627) Did the system really crash or did you just see the rcu related warning(s)? I just check it again, actually at first the system didn't really crash, but the system is very slow in response. and the reproducer process can't be killed, after I did some common actions such as 'ls' 'vim' etc, the system seemed to be really crashed, no any response. also in the previous testing, I can remember that the system would be no any response for a long time, just only repeatedly print out the such above 'Call Trace' into console. Ok, thanks. Just a couple of more questions: did you see this also on other archs, or just s390 (if you tried other platforms at all). If you have some time, could you please repeat your test with the kernel command line option user_mode=home ? As far as I can tell there was only one s390 patch merged that was mmap related: 486c0a0bc80d370471b21662bf03f04fbb37cdc6 s390/mm: Fix crst upgrade of mmap with MAP_FIXED. Even though I don't think it explains the bug you've seen it might be worth to try to revert it. And at last, can you share your kernel config? Thanks, Heiko -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG][s390x] mm: system crashed
Hi Heiko, On 04/16/2013 03:50 PM, Heiko Carstens wrote: On Mon, Apr 15, 2013 at 02:16:55PM +0800, Zhouping Liu wrote: On 04/15/2013 01:56 PM, Heiko Carstens wrote: On Sun, Apr 14, 2013 at 11:28:40PM -0400, Zhouping Liu wrote: � 16109.346170¨ Call Trace: � 16109.346179¨ (� 00100920¨ show_trace+0x128/0x12c) � 16109.346195¨ � 001cd320¨ rcu_check_callbacks+0x458/0xccc � 16109.346209¨ � 00140f2e¨ update_process_times+0x4a/0x74 � 16109.346222¨ � 00199452¨ tick_sched_handle.isra.12+0x5e/0x70 � 16109.346235¨ � 001995aa¨ tick_sched_timer+0x6a/0x98 � 16109.346247¨ � 0015c1ea¨ __run_hrtimer+0x8e/0x200 � 16109.346381¨ � 0015d1b2¨ hrtimer_interrupt+0x212/0x2b0 � 16109.346385¨ � 001040f6¨ clock_comparator_work+0x4a/0x54 � 16109.346390¨ � 0010d658¨ do_extint+0x158/0x15c � 16109.346396¨ � 0062aa24¨ ext_skip+0x38/0x3c � 16109.346404¨ � 001153c8¨ smp_yield_cpu+0x44/0x48 � 16109.346412¨ (� 03d10051aec0¨ 0x3d10051aec0) � 16109.346457¨ � 0024206a¨ __page_check_address+0x16a/0x170 � 16109.346466¨ � 002423a2¨ page_referenced_one+0x3e/0xa0 � 16109.346501¨ � 0024427c¨ page_referenced+0x32c/0x41c � 16109.346510¨ � 0021b1dc¨ shrink_page_list+0x380/0xb9c � 16109.346521¨ � 0021c0a6¨ shrink_inactive_list+0x1c6/0x56c � 16109.346532¨ � 0021c69e¨ shrink_lruvec+0x252/0x56c � 16109.346542¨ � 0021ca44¨ shrink_zone+0x8c/0x1bc � 16109.346553¨ � 0021d080¨ balance_pgdat+0x50c/0x658 � 16109.346564¨ � 0021d318¨ kswapd+0x14c/0x470 � 16109.346576¨ � 00158292¨ kthread+0xda/0xe4 � 16109.346656¨ � 0062a5de¨ kernel_thread_starter+0x6/0xc � 16109.346682¨ � 0062a5d8¨ kernel_thread_starter+0x0/0xc [-- MARK -- Fri Apr 12 06:15:00 2013] � 16289.386061¨ INFO: rcu_sched self-detected stall on CPU { 0} (t=42010 jiffies g=89766 c=89765 q=10627) Did the system really crash or did you just see the rcu related warning(s)? I just check it again, actually at first the system didn't really crash, but the system is very slow in response. and the reproducer process can't be killed, after I did some common actions such as 'ls' 'vim' etc, the system seemed to be really crashed, no any response. also in the previous testing, I can remember that the system would be no any response for a long time, just only repeatedly print out the such above 'Call Trace' into console. Ok, thanks. Just a couple of more questions: did you see this also on other archs, or just s390 (if you tried other platforms at all). If you have some time, could you please repeat your test with the kernel command line option user_mode=home ? What's the meaning of this command line? I can't find it in Documentation/kernel-parameters.txt/ As far as I can tell there was only one s390 patch merged that was mmap related: 486c0a0bc80d370471b21662bf03f04fbb37cdc6 s390/mm: Fix crst upgrade of mmap with MAP_FIXED. Even though I don't think it explains the bug you've seen it might be worth to try to revert it. And at last, can you share your kernel config? Thanks, Heiko -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majord...@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: a href=mailto:d...@kvack.org; em...@kvack.org /a -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG][s390x] mm: system crashed
On Tue, Apr 16, 2013 at 03:56:59PM +0800, Simon Jeons wrote: Hi Heiko, If you have some time, could you please repeat your test with the kernel command line option user_mode=home ? What's the meaning of this command line? I can't find it in Documentation/kernel-parameters.txt/ It switches the architectural address space where kernel and user space reside in. We only recently switched the default address space for user space from home space to primary space, since that's needed for kvm. The user space runs in home space mode will be removed in the future; we keep it currently as fallback, just in case something breaks. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG][s390x] mm: system crashed
On 04/15/2013 01:56 PM, Heiko Carstens wrote: On Sun, Apr 14, 2013 at 11:28:40PM -0400, Zhouping Liu wrote: Hi All, I hit the below crashed when doing memory related tests[1] on s390x: --- snip - � 15929.351639¨ � <0021c0a6>¨ shrink_inactive_list+0x1c6/0x56c � 15929.351647¨ � <0021c69e>¨ shrink_lruvec+0x252/0x56c � 15929.351654¨ � <0021ca44>¨ shrink_zone+0x8c/0x1bc � 15929.351662¨ � <0021d080>¨ balance_pgdat+0x50c/0x658 � 15929.351671¨ � <0021d318>¨ kswapd+0x14c/0x470 � 15929.351680¨ � <00158292>¨ kthread+0xda/0xe4 � 15929.351690¨ � <0062a5de>¨ kernel_thread_starter+0x6/0xc � 15929.351700¨ � <0062a5d8>¨ kernel_thread_starter+0x0/0xc � 16109.346061¨ INFO: rcu_sched self-detected stall on CPU { 0} (t=24006 jiffies g=89766 c=89765 q=10544) � 16109.346101¨ CPU: 0 Tainted: G D 3.9.0-rc6+ #1 � 16109.346106¨ Process kswapd0 (pid: 28, task: 3b2a, ksp: 3b 2ab8c0) � 16109.346110¨0001bb60 0001bb70 0002 000 0 0001bc00 0001bb78 0001bb78 001009ca 2930 000a 000a 0001bbc0 0001bb60 0063bb18 001009ca 0001bb60 0001bbb0 � 16109.346170¨ Call Trace: � 16109.346179¨ (� <00100920>¨ show_trace+0x128/0x12c) � 16109.346195¨ � <001cd320>¨ rcu_check_callbacks+0x458/0xccc � 16109.346209¨ � <00140f2e>¨ update_process_times+0x4a/0x74 � 16109.346222¨ � <00199452>¨ tick_sched_handle.isra.12+0x5e/0x70 � 16109.346235¨ � <001995aa>¨ tick_sched_timer+0x6a/0x98 � 16109.346247¨ � <0015c1ea>¨ __run_hrtimer+0x8e/0x200 � 16109.346381¨ � <0015d1b2>¨ hrtimer_interrupt+0x212/0x2b0 � 16109.346385¨ � <001040f6>¨ clock_comparator_work+0x4a/0x54 � 16109.346390¨ � <0010d658>¨ do_extint+0x158/0x15c � 16109.346396¨ � <0062aa24>¨ ext_skip+0x38/0x3c � 16109.346404¨ � <001153c8>¨ smp_yield_cpu+0x44/0x48 � 16109.346412¨ (� <03d10051aec0>¨ 0x3d10051aec0) � 16109.346457¨ � <0024206a>¨ __page_check_address+0x16a/0x170 � 16109.346466¨ � <002423a2>¨ page_referenced_one+0x3e/0xa0 � 16109.346501¨ � <0024427c>¨ page_referenced+0x32c/0x41c � 16109.346510¨ � <0021b1dc>¨ shrink_page_list+0x380/0xb9c � 16109.346521¨ � <0021c0a6>¨ shrink_inactive_list+0x1c6/0x56c � 16109.346532¨ � <0021c69e>¨ shrink_lruvec+0x252/0x56c � 16109.346542¨ � <0021ca44>¨ shrink_zone+0x8c/0x1bc � 16109.346553¨ � <0021d080>¨ balance_pgdat+0x50c/0x658 � 16109.346564¨ � <0021d318>¨ kswapd+0x14c/0x470 � 16109.346576¨ � <00158292>¨ kthread+0xda/0xe4 � 16109.346656¨ � <0062a5de>¨ kernel_thread_starter+0x6/0xc � 16109.346682¨ � <0062a5d8>¨ kernel_thread_starter+0x0/0xc [-- MARK -- Fri Apr 12 06:15:00 2013] � 16289.386061¨ INFO: rcu_sched self-detected stall on CPU { 0} (t=42010 jiffies g=89766 c=89765 q=10627) Did the system really crash or did you just see the rcu related warning(s)? I just check it again, actually at first the system didn't really crash, but the system is very slow in response. and the reproducer process can't be killed, after I did some common actions such as 'ls' 'vim' etc, the system seemed to be really crashed, no any response. also in the previous testing, I can remember that the system would be no any response for a long time, just only repeatedly print out the such above 'Call Trace' into console. Thanks, Zhouping -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG][s390x] mm: system crashed
On 04/15/2013 01:56 PM, Heiko Carstens wrote: On Sun, Apr 14, 2013 at 11:28:40PM -0400, Zhouping Liu wrote: Hi All, I hit the below crashed when doing memory related tests[1] on s390x: --- snip - � 15929.351639¨ � 0021c0a6¨ shrink_inactive_list+0x1c6/0x56c � 15929.351647¨ � 0021c69e¨ shrink_lruvec+0x252/0x56c � 15929.351654¨ � 0021ca44¨ shrink_zone+0x8c/0x1bc � 15929.351662¨ � 0021d080¨ balance_pgdat+0x50c/0x658 � 15929.351671¨ � 0021d318¨ kswapd+0x14c/0x470 � 15929.351680¨ � 00158292¨ kthread+0xda/0xe4 � 15929.351690¨ � 0062a5de¨ kernel_thread_starter+0x6/0xc � 15929.351700¨ � 0062a5d8¨ kernel_thread_starter+0x0/0xc � 16109.346061¨ INFO: rcu_sched self-detected stall on CPU { 0} (t=24006 jiffies g=89766 c=89765 q=10544) � 16109.346101¨ CPU: 0 Tainted: G D 3.9.0-rc6+ #1 � 16109.346106¨ Process kswapd0 (pid: 28, task: 3b2a, ksp: 3b 2ab8c0) � 16109.346110¨0001bb60 0001bb70 0002 000 0 0001bc00 0001bb78 0001bb78 001009ca 2930 000a 000a 0001bbc0 0001bb60 0063bb18 001009ca 0001bb60 0001bbb0 � 16109.346170¨ Call Trace: � 16109.346179¨ (� 00100920¨ show_trace+0x128/0x12c) � 16109.346195¨ � 001cd320¨ rcu_check_callbacks+0x458/0xccc � 16109.346209¨ � 00140f2e¨ update_process_times+0x4a/0x74 � 16109.346222¨ � 00199452¨ tick_sched_handle.isra.12+0x5e/0x70 � 16109.346235¨ � 001995aa¨ tick_sched_timer+0x6a/0x98 � 16109.346247¨ � 0015c1ea¨ __run_hrtimer+0x8e/0x200 � 16109.346381¨ � 0015d1b2¨ hrtimer_interrupt+0x212/0x2b0 � 16109.346385¨ � 001040f6¨ clock_comparator_work+0x4a/0x54 � 16109.346390¨ � 0010d658¨ do_extint+0x158/0x15c � 16109.346396¨ � 0062aa24¨ ext_skip+0x38/0x3c � 16109.346404¨ � 001153c8¨ smp_yield_cpu+0x44/0x48 � 16109.346412¨ (� 03d10051aec0¨ 0x3d10051aec0) � 16109.346457¨ � 0024206a¨ __page_check_address+0x16a/0x170 � 16109.346466¨ � 002423a2¨ page_referenced_one+0x3e/0xa0 � 16109.346501¨ � 0024427c¨ page_referenced+0x32c/0x41c � 16109.346510¨ � 0021b1dc¨ shrink_page_list+0x380/0xb9c � 16109.346521¨ � 0021c0a6¨ shrink_inactive_list+0x1c6/0x56c � 16109.346532¨ � 0021c69e¨ shrink_lruvec+0x252/0x56c � 16109.346542¨ � 0021ca44¨ shrink_zone+0x8c/0x1bc � 16109.346553¨ � 0021d080¨ balance_pgdat+0x50c/0x658 � 16109.346564¨ � 0021d318¨ kswapd+0x14c/0x470 � 16109.346576¨ � 00158292¨ kthread+0xda/0xe4 � 16109.346656¨ � 0062a5de¨ kernel_thread_starter+0x6/0xc � 16109.346682¨ � 0062a5d8¨ kernel_thread_starter+0x0/0xc [-- MARK -- Fri Apr 12 06:15:00 2013] � 16289.386061¨ INFO: rcu_sched self-detected stall on CPU { 0} (t=42010 jiffies g=89766 c=89765 q=10627) Did the system really crash or did you just see the rcu related warning(s)? I just check it again, actually at first the system didn't really crash, but the system is very slow in response. and the reproducer process can't be killed, after I did some common actions such as 'ls' 'vim' etc, the system seemed to be really crashed, no any response. also in the previous testing, I can remember that the system would be no any response for a long time, just only repeatedly print out the such above 'Call Trace' into console. Thanks, Zhouping -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG][s390x] mm: system crashed
On Sun, Apr 14, 2013 at 11:28:40PM -0400, Zhouping Liu wrote: > Hi All, > > I hit the below crashed when doing memory related tests[1] on s390x: > > --- snip - > � 15929.351639¨ � <0021c0a6>¨ shrink_inactive_list+0x1c6/0x56c > � 15929.351647¨ � <0021c69e>¨ shrink_lruvec+0x252/0x56c > � 15929.351654¨ � <0021ca44>¨ shrink_zone+0x8c/0x1bc > � 15929.351662¨ � <0021d080>¨ balance_pgdat+0x50c/0x658 > � 15929.351671¨ � <0021d318>¨ kswapd+0x14c/0x470 > � 15929.351680¨ � <00158292>¨ kthread+0xda/0xe4 > � 15929.351690¨ � <0062a5de>¨ kernel_thread_starter+0x6/0xc > � 15929.351700¨ � <0062a5d8>¨ kernel_thread_starter+0x0/0xc > � 16109.346061¨ INFO: rcu_sched self-detected stall on CPU { 0} (t=24006 > jiffies > g=89766 c=89765 q=10544) > � 16109.346101¨ CPU: 0 Tainted: G D 3.9.0-rc6+ #1 > � 16109.346106¨ Process kswapd0 (pid: 28, task: 3b2a, ksp: > 3b > 2ab8c0) > � 16109.346110¨0001bb60 0001bb70 0002 > 000 > 0 >0001bc00 0001bb78 0001bb78 001009ca > 2930 000a 000a >0001bbc0 0001bb60 >0063bb18 001009ca 0001bb60 0001bbb0 > � 16109.346170¨ Call Trace: > � 16109.346179¨ (� <00100920>¨ show_trace+0x128/0x12c) > � 16109.346195¨ � <001cd320>¨ rcu_check_callbacks+0x458/0xccc > � 16109.346209¨ � <00140f2e>¨ update_process_times+0x4a/0x74 > � 16109.346222¨ � <00199452>¨ tick_sched_handle.isra.12+0x5e/0x70 > � 16109.346235¨ � <001995aa>¨ tick_sched_timer+0x6a/0x98 > � 16109.346247¨ � <0015c1ea>¨ __run_hrtimer+0x8e/0x200 > � 16109.346381¨ � <0015d1b2>¨ hrtimer_interrupt+0x212/0x2b0 > � 16109.346385¨ � <001040f6>¨ clock_comparator_work+0x4a/0x54 > � 16109.346390¨ � <0010d658>¨ do_extint+0x158/0x15c > � 16109.346396¨ � <0062aa24>¨ ext_skip+0x38/0x3c > � 16109.346404¨ � <001153c8>¨ smp_yield_cpu+0x44/0x48 > � 16109.346412¨ (� <03d10051aec0>¨ 0x3d10051aec0) > � 16109.346457¨ � <0024206a>¨ __page_check_address+0x16a/0x170 > � 16109.346466¨ � <002423a2>¨ page_referenced_one+0x3e/0xa0 > � 16109.346501¨ � <0024427c>¨ page_referenced+0x32c/0x41c > � 16109.346510¨ � <0021b1dc>¨ shrink_page_list+0x380/0xb9c > � 16109.346521¨ � <0021c0a6>¨ shrink_inactive_list+0x1c6/0x56c > � 16109.346532¨ � <0021c69e>¨ shrink_lruvec+0x252/0x56c > � 16109.346542¨ � <0021ca44>¨ shrink_zone+0x8c/0x1bc > � 16109.346553¨ � <0021d080>¨ balance_pgdat+0x50c/0x658 > � 16109.346564¨ � <0021d318>¨ kswapd+0x14c/0x470 > � 16109.346576¨ � <00158292>¨ kthread+0xda/0xe4 > � 16109.346656¨ � <0062a5de>¨ kernel_thread_starter+0x6/0xc > � 16109.346682¨ � <0062a5d8>¨ kernel_thread_starter+0x0/0xc > [-- MARK -- Fri Apr 12 06:15:00 2013] > � 16289.386061¨ INFO: rcu_sched self-detected stall on CPU { 0} (t=42010 > jiffies > g=89766 c=89765 q=10627) Did the system really crash or did you just see the rcu related warning(s)? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[BUG][s390x] mm: system crashed
Hi All, I hit the below crashed when doing memory related tests[1] on s390x: --- snip - � 15929.351639¨ � <0021c0a6>¨ shrink_inactive_list+0x1c6/0x56c � 15929.351647¨ � <0021c69e>¨ shrink_lruvec+0x252/0x56c � 15929.351654¨ � <0021ca44>¨ shrink_zone+0x8c/0x1bc � 15929.351662¨ � <0021d080>¨ balance_pgdat+0x50c/0x658 � 15929.351671¨ � <0021d318>¨ kswapd+0x14c/0x470 � 15929.351680¨ � <00158292>¨ kthread+0xda/0xe4 � 15929.351690¨ � <0062a5de>¨ kernel_thread_starter+0x6/0xc � 15929.351700¨ � <0062a5d8>¨ kernel_thread_starter+0x0/0xc � 16109.346061¨ INFO: rcu_sched self-detected stall on CPU { 0} (t=24006 jiffies g=89766 c=89765 q=10544) � 16109.346101¨ CPU: 0 Tainted: G D 3.9.0-rc6+ #1 � 16109.346106¨ Process kswapd0 (pid: 28, task: 3b2a, ksp: 3b 2ab8c0) � 16109.346110¨0001bb60 0001bb70 0002 000 0 0001bc00 0001bb78 0001bb78 001009ca 2930 000a 000a 0001bbc0 0001bb60 0063bb18 001009ca 0001bb60 0001bbb0 � 16109.346170¨ Call Trace: � 16109.346179¨ (� <00100920>¨ show_trace+0x128/0x12c) � 16109.346195¨ � <001cd320>¨ rcu_check_callbacks+0x458/0xccc � 16109.346209¨ � <00140f2e>¨ update_process_times+0x4a/0x74 � 16109.346222¨ � <00199452>¨ tick_sched_handle.isra.12+0x5e/0x70 � 16109.346235¨ � <001995aa>¨ tick_sched_timer+0x6a/0x98 � 16109.346247¨ � <0015c1ea>¨ __run_hrtimer+0x8e/0x200 � 16109.346381¨ � <0015d1b2>¨ hrtimer_interrupt+0x212/0x2b0 � 16109.346385¨ � <001040f6>¨ clock_comparator_work+0x4a/0x54 � 16109.346390¨ � <0010d658>¨ do_extint+0x158/0x15c � 16109.346396¨ � <0062aa24>¨ ext_skip+0x38/0x3c � 16109.346404¨ � <001153c8>¨ smp_yield_cpu+0x44/0x48 � 16109.346412¨ (� <03d10051aec0>¨ 0x3d10051aec0) � 16109.346457¨ � <0024206a>¨ __page_check_address+0x16a/0x170 � 16109.346466¨ � <002423a2>¨ page_referenced_one+0x3e/0xa0 � 16109.346501¨ � <0024427c>¨ page_referenced+0x32c/0x41c � 16109.346510¨ � <0021b1dc>¨ shrink_page_list+0x380/0xb9c � 16109.346521¨ � <0021c0a6>¨ shrink_inactive_list+0x1c6/0x56c � 16109.346532¨ � <0021c69e>¨ shrink_lruvec+0x252/0x56c � 16109.346542¨ � <0021ca44>¨ shrink_zone+0x8c/0x1bc � 16109.346553¨ � <0021d080>¨ balance_pgdat+0x50c/0x658 � 16109.346564¨ � <0021d318>¨ kswapd+0x14c/0x470 � 16109.346576¨ � <00158292>¨ kthread+0xda/0xe4 � 16109.346656¨ � <0062a5de>¨ kernel_thread_starter+0x6/0xc � 16109.346682¨ � <0062a5d8>¨ kernel_thread_starter+0x0/0xc [-- MARK -- Fri Apr 12 06:15:00 2013] � 16289.386061¨ INFO: rcu_sched self-detected stall on CPU { 0} (t=42010 jiffies g=89766 c=89765 q=10627) -- snip -- The testing system has 1Gb RAM, kernel is new latest mainline. please let me know if you need any more info. [1] reproducer is come from LTP: https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/mtest06/mmap2.c and execute it using this command: `./mmap2 -x 0.002 -a -p` -- Thanks, Zhouping -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[BUG][s390x] mm: system crashed
Hi All, I hit the below crashed when doing memory related tests[1] on s390x: --- snip - � 15929.351639¨ � 0021c0a6¨ shrink_inactive_list+0x1c6/0x56c � 15929.351647¨ � 0021c69e¨ shrink_lruvec+0x252/0x56c � 15929.351654¨ � 0021ca44¨ shrink_zone+0x8c/0x1bc � 15929.351662¨ � 0021d080¨ balance_pgdat+0x50c/0x658 � 15929.351671¨ � 0021d318¨ kswapd+0x14c/0x470 � 15929.351680¨ � 00158292¨ kthread+0xda/0xe4 � 15929.351690¨ � 0062a5de¨ kernel_thread_starter+0x6/0xc � 15929.351700¨ � 0062a5d8¨ kernel_thread_starter+0x0/0xc � 16109.346061¨ INFO: rcu_sched self-detected stall on CPU { 0} (t=24006 jiffies g=89766 c=89765 q=10544) � 16109.346101¨ CPU: 0 Tainted: G D 3.9.0-rc6+ #1 � 16109.346106¨ Process kswapd0 (pid: 28, task: 3b2a, ksp: 3b 2ab8c0) � 16109.346110¨0001bb60 0001bb70 0002 000 0 0001bc00 0001bb78 0001bb78 001009ca 2930 000a 000a 0001bbc0 0001bb60 0063bb18 001009ca 0001bb60 0001bbb0 � 16109.346170¨ Call Trace: � 16109.346179¨ (� 00100920¨ show_trace+0x128/0x12c) � 16109.346195¨ � 001cd320¨ rcu_check_callbacks+0x458/0xccc � 16109.346209¨ � 00140f2e¨ update_process_times+0x4a/0x74 � 16109.346222¨ � 00199452¨ tick_sched_handle.isra.12+0x5e/0x70 � 16109.346235¨ � 001995aa¨ tick_sched_timer+0x6a/0x98 � 16109.346247¨ � 0015c1ea¨ __run_hrtimer+0x8e/0x200 � 16109.346381¨ � 0015d1b2¨ hrtimer_interrupt+0x212/0x2b0 � 16109.346385¨ � 001040f6¨ clock_comparator_work+0x4a/0x54 � 16109.346390¨ � 0010d658¨ do_extint+0x158/0x15c � 16109.346396¨ � 0062aa24¨ ext_skip+0x38/0x3c � 16109.346404¨ � 001153c8¨ smp_yield_cpu+0x44/0x48 � 16109.346412¨ (� 03d10051aec0¨ 0x3d10051aec0) � 16109.346457¨ � 0024206a¨ __page_check_address+0x16a/0x170 � 16109.346466¨ � 002423a2¨ page_referenced_one+0x3e/0xa0 � 16109.346501¨ � 0024427c¨ page_referenced+0x32c/0x41c � 16109.346510¨ � 0021b1dc¨ shrink_page_list+0x380/0xb9c � 16109.346521¨ � 0021c0a6¨ shrink_inactive_list+0x1c6/0x56c � 16109.346532¨ � 0021c69e¨ shrink_lruvec+0x252/0x56c � 16109.346542¨ � 0021ca44¨ shrink_zone+0x8c/0x1bc � 16109.346553¨ � 0021d080¨ balance_pgdat+0x50c/0x658 � 16109.346564¨ � 0021d318¨ kswapd+0x14c/0x470 � 16109.346576¨ � 00158292¨ kthread+0xda/0xe4 � 16109.346656¨ � 0062a5de¨ kernel_thread_starter+0x6/0xc � 16109.346682¨ � 0062a5d8¨ kernel_thread_starter+0x0/0xc [-- MARK -- Fri Apr 12 06:15:00 2013] � 16289.386061¨ INFO: rcu_sched self-detected stall on CPU { 0} (t=42010 jiffies g=89766 c=89765 q=10627) -- snip -- The testing system has 1Gb RAM, kernel is new latest mainline. please let me know if you need any more info. [1] reproducer is come from LTP: https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/mtest06/mmap2.c and execute it using this command: `./mmap2 -x 0.002 -a -p` -- Thanks, Zhouping -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG][s390x] mm: system crashed
On Sun, Apr 14, 2013 at 11:28:40PM -0400, Zhouping Liu wrote: Hi All, I hit the below crashed when doing memory related tests[1] on s390x: --- snip - � 15929.351639¨ � 0021c0a6¨ shrink_inactive_list+0x1c6/0x56c � 15929.351647¨ � 0021c69e¨ shrink_lruvec+0x252/0x56c � 15929.351654¨ � 0021ca44¨ shrink_zone+0x8c/0x1bc � 15929.351662¨ � 0021d080¨ balance_pgdat+0x50c/0x658 � 15929.351671¨ � 0021d318¨ kswapd+0x14c/0x470 � 15929.351680¨ � 00158292¨ kthread+0xda/0xe4 � 15929.351690¨ � 0062a5de¨ kernel_thread_starter+0x6/0xc � 15929.351700¨ � 0062a5d8¨ kernel_thread_starter+0x0/0xc � 16109.346061¨ INFO: rcu_sched self-detected stall on CPU { 0} (t=24006 jiffies g=89766 c=89765 q=10544) � 16109.346101¨ CPU: 0 Tainted: G D 3.9.0-rc6+ #1 � 16109.346106¨ Process kswapd0 (pid: 28, task: 3b2a, ksp: 3b 2ab8c0) � 16109.346110¨0001bb60 0001bb70 0002 000 0 0001bc00 0001bb78 0001bb78 001009ca 2930 000a 000a 0001bbc0 0001bb60 0063bb18 001009ca 0001bb60 0001bbb0 � 16109.346170¨ Call Trace: � 16109.346179¨ (� 00100920¨ show_trace+0x128/0x12c) � 16109.346195¨ � 001cd320¨ rcu_check_callbacks+0x458/0xccc � 16109.346209¨ � 00140f2e¨ update_process_times+0x4a/0x74 � 16109.346222¨ � 00199452¨ tick_sched_handle.isra.12+0x5e/0x70 � 16109.346235¨ � 001995aa¨ tick_sched_timer+0x6a/0x98 � 16109.346247¨ � 0015c1ea¨ __run_hrtimer+0x8e/0x200 � 16109.346381¨ � 0015d1b2¨ hrtimer_interrupt+0x212/0x2b0 � 16109.346385¨ � 001040f6¨ clock_comparator_work+0x4a/0x54 � 16109.346390¨ � 0010d658¨ do_extint+0x158/0x15c � 16109.346396¨ � 0062aa24¨ ext_skip+0x38/0x3c � 16109.346404¨ � 001153c8¨ smp_yield_cpu+0x44/0x48 � 16109.346412¨ (� 03d10051aec0¨ 0x3d10051aec0) � 16109.346457¨ � 0024206a¨ __page_check_address+0x16a/0x170 � 16109.346466¨ � 002423a2¨ page_referenced_one+0x3e/0xa0 � 16109.346501¨ � 0024427c¨ page_referenced+0x32c/0x41c � 16109.346510¨ � 0021b1dc¨ shrink_page_list+0x380/0xb9c � 16109.346521¨ � 0021c0a6¨ shrink_inactive_list+0x1c6/0x56c � 16109.346532¨ � 0021c69e¨ shrink_lruvec+0x252/0x56c � 16109.346542¨ � 0021ca44¨ shrink_zone+0x8c/0x1bc � 16109.346553¨ � 0021d080¨ balance_pgdat+0x50c/0x658 � 16109.346564¨ � 0021d318¨ kswapd+0x14c/0x470 � 16109.346576¨ � 00158292¨ kthread+0xda/0xe4 � 16109.346656¨ � 0062a5de¨ kernel_thread_starter+0x6/0xc � 16109.346682¨ � 0062a5d8¨ kernel_thread_starter+0x0/0xc [-- MARK -- Fri Apr 12 06:15:00 2013] � 16289.386061¨ INFO: rcu_sched self-detected stall on CPU { 0} (t=42010 jiffies g=89766 c=89765 q=10627) Did the system really crash or did you just see the rcu related warning(s)? -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/