Re: [PATCH 6/7] mm: add CONFIG_DEBUG_VM_RB build option
On 09/16/2012 09:07 PM, Hugh Dickins wrote:
>> What was the way that
>> Hugh used to reproduce the other issue?
>
> I've lost track of which issue is "other".

The other was meant to be the BUG I hit.

> To reproduce Sasha's interval_tree.c warnings, all I had to do was switch
> on CONFIG_DEBUG_VM_RB (I regret not having done so before) and boot up.
>
> I didn't look to see what was doing the mremap which caused the warning
> until now: surprisingly, it's microcode_ctl. I've not made much effort
> to get the right set of sources and work out why that would be using
> mremap (a realloc inside a library?).
>
> I failed to reproduce your BUG in huge_memory.c, but what I was trying
> was SuSE update via yast2, on several machines; but perhaps because
> they were all fairly close to up-to-date, I didn't hit a problem.
> (That was before I turned on DEBUG_VM_RB for Sasha's.)

The good news are that I cannot reproduce either with the patch applied.

thanks,
--
js
suse labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 6/7] mm: add CONFIG_DEBUG_VM_RB build option
On Thu, Sep 20, 2012 at 03:27:11PM -0700, Hugh Dickins wrote:
> On Fri, 21 Sep 2012, Fengguang Wu wrote:
> > On Sat, Sep 15, 2012 at 11:26:23AM +0200, Sasha Levin wrote:
> > > On 09/15/2012 02:00 AM, Michel Lespinasse wrote:
> > > > All right. Hugh managed to reproduce the issue on his suse laptop, and
> > > > I came up with a fix.
> > > >
> > > > The problem was that in mremap, the new vma's vm_{start,end,pgoff}
> > > > fields need to be updated before calling anon_vma_clone() so that the
> > > > new vma will be properly indexed.
> > > >
> > > > Patch attached. I expect this should also explain Jiri's reported
> > > > failure involving splitting THP pages during mremap(), even though we
> > > > did not manage to reproduce that one.
> > >
> > > Initially I've stumbled on it by running trinity inside a KVM tools
> > > guest. fwiw,
> > > the guest is pretty custom and isn't based on suse.
> > >
> > > I re-ran tests with patch applied and looks like it fixed the issue, I
> > > haven't
> > > seen the warnings even though it runs for quite a while now.
> >
> > Not sure if it's the same problem you are talking about, but I got the
> > below warning and it's still happening in linux-next 20120920:
>
> It is (almost certainly) the same problem, for which Michel provided
> the fix earlier in this thread (some of us find we have to delete a
> " {" from the context at the end to get it to apply).
>
> That fix has gone into akpm's tree, but linux-next is still using an
> older rollup of akpm's tree.

Got it, thank you for the quick information!

Thanks,
Fengguang

> > [ 38.482925] scsi_nl_rcv_msg: discarding partial skb
> > [ 62.679879] ------------[ cut here ]------------
> > [ 62.680380] WARNING: at /c/kernel-tests/src/linux/mm/interval_tree.c:109
> > anon_vma_interval_tree_verify+0x33/0x80()
> > [ 62.681356] Pid: 195, comm: trinity-child0 Not tainted
> > 3.6.0-rc6-next-20120918-08732-g3de9d1a #1
> > [ 62.682130] Call Trace:
> > [ 62.682356] [810c249f] ?
> > anon_vma_interval_tree_verify+0x33/0x80
> > [ 62.682968] [81044356] warn_slowpath_common+0x5d/0x74
> > [ 62.683577] [81044424] warn_slowpath_null+0x15/0x19
> > [ 62.684098] [810c249f] anon_vma_interval_tree_verify+0x33/0x80
> > [ 62.684714] [810ca57c] validate_mm+0x32/0x15b
> > [ 62.685202] [810ca767] vma_link+0x95/0xa4
> > [ 62.685637] [810cbc31] copy_vma+0x1c7/0x1fe
> > [ 62.686168] [810cdd50] move_vma+0x90/0x1ef
> > [ 62.686614] [810ce250] sys_mremap+0x3a1/0x429
> > [ 62.687094] [813caafe] ? trace_hardirqs_on_thunk+0x3a/0x3f
> > [ 62.687670] [81b505b9] system_call_fastpath+0x16/0x1b
> >
> > Bisected down to
> >
> > commit cb58d445d2ec3a06f313e29d6f6af5bef6c9e43c
> > Author: Michel Lespinasse <wal...@google.com>
> > Date: Thu Sep 13 10:58:56 2012 +1000
> >
> >     mm: add CONFIG_DEBUG_VM_RB build option
> >
> > Thanks,
> > Fengguang
Re: [PATCH 6/7] mm: add CONFIG_DEBUG_VM_RB build option
On Fri, 21 Sep 2012, Fengguang Wu wrote:
> On Sat, Sep 15, 2012 at 11:26:23AM +0200, Sasha Levin wrote:
> > On 09/15/2012 02:00 AM, Michel Lespinasse wrote:
> > > All right. Hugh managed to reproduce the issue on his suse laptop, and
> > > I came up with a fix.
> > >
> > > The problem was that in mremap, the new vma's vm_{start,end,pgoff}
> > > fields need to be updated before calling anon_vma_clone() so that the
> > > new vma will be properly indexed.
> > >
> > > Patch attached. I expect this should also explain Jiri's reported
> > > failure involving splitting THP pages during mremap(), even though we
> > > did not manage to reproduce that one.
> >
> > Initially I've stumbled on it by running trinity inside a KVM tools guest.
> > fwiw,
> > the guest is pretty custom and isn't based on suse.
> >
> > I re-ran tests with patch applied and looks like it fixed the issue, I
> > haven't
> > seen the warnings even though it runs for quite a while now.
>
> Not sure if it's the same problem you are talking about, but I got the
> below warning and it's still happening in linux-next 20120920:

It is (almost certainly) the same problem, for which Michel provided
the fix earlier in this thread (some of us find we have to delete a
" {" from the context at the end to get it to apply).

That fix has gone into akpm's tree, but linux-next is still using an
older rollup of akpm's tree.

Thanks,
Hugh

>
> [ 38.482925] scsi_nl_rcv_msg: discarding partial skb
> [ 62.679879] ------------[ cut here ]------------
> [ 62.680380] WARNING: at /c/kernel-tests/src/linux/mm/interval_tree.c:109
> anon_vma_interval_tree_verify+0x33/0x80()
> [ 62.681356] Pid: 195, comm: trinity-child0 Not tainted
> 3.6.0-rc6-next-20120918-08732-g3de9d1a #1
> [ 62.682130] Call Trace:
> [ 62.682356] [810c249f] ?
> anon_vma_interval_tree_verify+0x33/0x80
> [ 62.682968] [81044356] warn_slowpath_common+0x5d/0x74
> [ 62.683577] [81044424] warn_slowpath_null+0x15/0x19
> [ 62.684098] [810c249f] anon_vma_interval_tree_verify+0x33/0x80
> [ 62.684714] [810ca57c] validate_mm+0x32/0x15b
> [ 62.685202] [810ca767] vma_link+0x95/0xa4
> [ 62.685637] [810cbc31] copy_vma+0x1c7/0x1fe
> [ 62.686168] [810cdd50] move_vma+0x90/0x1ef
> [ 62.686614] [810ce250] sys_mremap+0x3a1/0x429
> [ 62.687094] [813caafe] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [ 62.687670] [81b505b9] system_call_fastpath+0x16/0x1b
>
> Bisected down to
>
> commit cb58d445d2ec3a06f313e29d6f6af5bef6c9e43c
> Author: Michel Lespinasse <wal...@google.com>
> Date: Thu Sep 13 10:58:56 2012 +1000
>
>     mm: add CONFIG_DEBUG_VM_RB build option
>
> Thanks,
> Fengguang
Re: [PATCH 6/7] mm: add CONFIG_DEBUG_VM_RB build option
On Sat, 15 Sep 2012, Jiri Slaby wrote:
> On 09/15/2012 02:00 AM, Michel Lespinasse wrote:
> > All right. Hugh managed to reproduce the issue on his suse laptop, and
> > I came up with a fix.
> >
> > The problem was that in mremap, the new vma's vm_{start,end,pgoff}
> > fields need to be updated before calling anon_vma_clone() so that the
> > new vma will be properly indexed.
> >
> > Patch attached. I expect this should also explain Jiri's reported
> > failure involving splitting THP pages during mremap(), even though we
> > did not manage to reproduce that one.
>
> Oh, great. This is BTW also machine with suse.

We guessed that for you it might be :) I've not yet moved up from 11.4
by the way, if that makes a difference.

In fact, even before these reports, when Michel was wondering about
the uses of mremap, I did mention an mremap/THP bug from a year ago,
which the SuSE update had been good for reproducing.

> What was the way that
> Hugh used to reproduce the other issue?

I've lost track of which issue is "other".

To reproduce Sasha's interval_tree.c warnings, all I had to do was switch
on CONFIG_DEBUG_VM_RB (I regret not having done so before) and boot up.

I didn't look to see what was doing the mremap which caused the warning
until now: surprisingly, it's microcode_ctl. I've not made much effort
to get the right set of sources and work out why that would be using
mremap (a realloc inside a library?).

I failed to reproduce your BUG in huge_memory.c, but what I was trying
was SuSE update via yast2, on several machines; but perhaps because
they were all fairly close to up-to-date, I didn't hit a problem.
(That was before I turned on DEBUG_VM_RB for Sasha's.)

Hugh

> For me it happened twice in a
> row when using zypper to upgrade packages. But it did not happen any
> more after that.
>
> thanks,
> --
> js
> suse labs
Re: [PATCH 6/7] mm: add CONFIG_DEBUG_VM_RB build option
On 09/15/2012 02:00 AM, Michel Lespinasse wrote:
> All right. Hugh managed to reproduce the issue on his suse laptop, and
> I came up with a fix.
>
> The problem was that in mremap, the new vma's vm_{start,end,pgoff}
> fields need to be updated before calling anon_vma_clone() so that the
> new vma will be properly indexed.
>
> Patch attached. I expect this should also explain Jiri's reported
> failure involving splitting THP pages during mremap(), even though we
> did not manage to reproduce that one.

Initially I've stumbled on it by running trinity inside a KVM tools guest. fwiw,
the guest is pretty custom and isn't based on suse.

I re-ran tests with patch applied and looks like it fixed the issue, I haven't
seen the warnings even though it runs for quite a while now.

Thanks,
Sasha
Re: [PATCH 6/7] mm: add CONFIG_DEBUG_VM_RB build option
On 09/15/2012 02:00 AM, Michel Lespinasse wrote:
> All right. Hugh managed to reproduce the issue on his suse laptop, and
> I came up with a fix.
>
> The problem was that in mremap, the new vma's vm_{start,end,pgoff}
> fields need to be updated before calling anon_vma_clone() so that the
> new vma will be properly indexed.
>
> Patch attached. I expect this should also explain Jiri's reported
> failure involving splitting THP pages during mremap(), even though we
> did not manage to reproduce that one.

Oh, great. This is BTW also machine with suse. What was the way that
Hugh used to reproduce the other issue? For me it happened twice in a
row when using zypper to upgrade packages. But it did not happen any
more after that.

thanks,
--
js
suse labs
Re: [PATCH 6/7] mm: add CONFIG_DEBUG_VM_RB build option
On Fri, Sep 14, 2012 at 3:46 PM, Michel Lespinasse wrote:
> On Fri, Sep 14, 2012 at 3:14 PM, Sasha Levin wrote:
>> On 09/04/2012 11:20 AM, Michel Lespinasse wrote:
>>> Add a CONFIG_DEBUG_VM_RB build option for the previously existing
>>> DEBUG_MM_RB code. Now that Andi Kleen modified it to avoid using
>>> recursive algorithms, we can expose it a bit more.
>>>
>>> Also extend this code to validate_mm() after stack expansion, and to
>>> check that the vma's start and last pgoffs have not changed since the
>>> nodes were inserted on the anon vma interval tree (as it is important
>>> that the nodes be reindexed after each such update).
>>
>> This patch exposes the following warning:
>>
>> [ 24.977502] ------------[ cut here ]------------
>> [ 24.979089] WARNING: at mm/interval_tree.c:110
>> anon_vma_interval_tree_verify+0x81/0xa0()
>> [ 24.981765] Pid: 5928, comm: trinity-child37 Tainted: GW
>> 3.6.0-rc5-next-20120914-sasha-3-g7deb7fa-dirty #333
>> [ 24.985501] Call Trace:
>> [ 24.986345] [81224c91] ?
>> anon_vma_interval_tree_verify+0x81/0xa0
>> [ 24.988535] [81106766] warn_slowpath_common+0x86/0xb0
>> [ 24.990636] [81106855] warn_slowpath_null+0x15/0x20
>> [ 24.992658] [81224c91] anon_vma_interval_tree_verify+0x81/0xa0
>> [ 24.994980] [8122e6e8] validate_mm+0x58/0x1e0
>> [ 24.996772] [8122e934] vma_link+0x94/0xe0
>> [ 24.997719] [812315e9] copy_vma+0x279/0x2e0
>> [ 24.998522] [8117a7fd] ? trace_hardirqs_off+0xd/0x10
>> [ 25.000772] [81232e89] move_vma+0xa9/0x260
>> [ 25.002499] [812334b5] sys_mremap+0x475/0x540
>> [ 25.004364] [8374b6e8] tracesys+0xe1/0xe6
>> [ 25.006108] ---[ end trace 7c901670963aa6e2 ]---
>>
>> The code line is
>>
>> WARN_ON_ONCE(node->cached_vma_last != avc_last_pgoff(node));
>
> That's very interesting (and potentially relevant to another bug
> that's been reported too).
>
> I'd like to know, what workload did you use that triggered this ?
> (I find it hard to test mremap as I don't know of enough users of it)

All right. Hugh managed to reproduce the issue on his suse laptop, and
I came up with a fix.

The problem was that in mremap, the new vma's vm_{start,end,pgoff}
fields need to be updated before calling anon_vma_clone() so that the
new vma will be properly indexed.

Patch attached. I expect this should also explain Jiri's reported
failure involving splitting THP pages during mremap(), even though we
did not manage to reproduce that one.

-8<---
From: Michel Lespinasse
Date: Fri, 14 Sep 2012 16:43:49 -0700
Subject: [PATCH] mm anon rmap: in mremap, set the new vma's position before anon_vma_clone()

anon_vma_clone() expects new_vma->vm_{start,end,pgoff} to be correctly set
so that the new vma can be indexed on the anon interval tree.
copy_vma() was failing to do that, which broke mremap().

Signed-off-by: Michel Lespinasse
---
 mm/mmap.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index cc8c64077a42..7e672800b5d4 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2446,16 +2446,16 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 		new_vma = kmem_cache_alloc(vm_area_cachep, GFP_KERNEL);
 		if (new_vma) {
 			*new_vma = *vma;
+			new_vma->vm_start = addr;
+			new_vma->vm_end = addr + len;
+			new_vma->vm_pgoff = pgoff;
 			pol = mpol_dup(vma_policy(vma));
 			if (IS_ERR(pol))
 				goto out_free_vma;
+			vma_set_policy(new_vma, pol);
 			INIT_LIST_HEAD(&new_vma->anon_vma_chain);
 			if (anon_vma_clone(new_vma, vma))
 				goto out_free_mempol;
-			vma_set_policy(new_vma, pol);
-			new_vma->vm_start = addr;
-			new_vma->vm_end = addr + len;
-			new_vma->vm_pgoff = pgoff;
 			if (new_vma->vm_file) {
 				get_file(new_vma->vm_file);

--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
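
The ordering problem that the patch addresses can be illustrated outside the kernel. What follows is a minimal user-space sketch, not kernel code: struct toy_vma, struct toy_node and every toy_* function are hypothetical stand-ins invented for illustration, a pair of cached keys replaces the real anon vma interval tree, and 4 KiB pages are assumed. It only shows that indexing a copied vma before its position fields are set leaves the recorded keys stale, while setting the position first keeps them consistent.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the vm_area_struct fields that matter here. */
struct toy_vma {
	unsigned long vm_start;	/* first byte of the mapping */
	unsigned long vm_end;	/* one past the last byte */
	unsigned long vm_pgoff;	/* offset of vm_start, in pages */
};

/* Hypothetical stand-in for a node on the interval tree. */
struct toy_node {
	struct toy_vma *vma;
	unsigned long cached_start;	/* keys recorded at insertion time */
	unsigned long cached_last;
};

static unsigned long toy_start_pgoff(const struct toy_vma *vma)
{
	return vma->vm_pgoff;
}

static unsigned long toy_last_pgoff(const struct toy_vma *vma)
{
	return vma->vm_pgoff +
	       ((vma->vm_end - vma->vm_start) >> TOY_PAGE_SHIFT) - 1;
}

/* "Index" the vma: record whatever keys it has right now. */
static void toy_index(struct toy_node *node, struct toy_vma *vma)
{
	node->vma = vma;
	node->cached_start = toy_start_pgoff(vma);
	node->cached_last = toy_last_pgoff(vma);
}

/* True when the recorded keys still match the vma. */
static bool toy_verify(const struct toy_node *node)
{
	return node->cached_start == toy_start_pgoff(node->vma) &&
	       node->cached_last == toy_last_pgoff(node->vma);
}

/*
 * Copy old into new_vma and index it. With position_first set, the new
 * position is written before indexing; otherwise it is written after.
 * Returns the result of the consistency check.
 */
bool toy_copy_vma(const struct toy_vma *old, struct toy_vma *new_vma,
		  struct toy_node *node, unsigned long addr,
		  unsigned long len, unsigned long pgoff, bool position_first)
{
	*new_vma = *old;
	if (!position_first)
		toy_index(node, new_vma);	/* keys taken from the old position */
	new_vma->vm_start = addr;
	new_vma->vm_end = addr + len;
	new_vma->vm_pgoff = pgoff;
	if (position_first)
		toy_index(node, new_vma);	/* keys taken from the new position */
	return toy_verify(node);
}

/* Runs both orderings on a 2-page vma moved and grown to 4 pages. */
int toy_demo(void)
{
	struct toy_vma old = { 0x10000, 0x12000, 0 };
	struct toy_vma moved;
	struct toy_node node;

	assert(!toy_copy_vma(&old, &moved, &node, 0x40000, 0x4000, 0, false));
	assert(toy_copy_vma(&old, &moved, &node, 0x40000, 0x4000, 0, true));
	return 0;
}
```

Calling toy_demo() from a main() exercises both orderings; the first assertion corresponds to the situation the warning reports, the second to the order the patch establishes.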
Re: [PATCH 6/7] mm: add CONFIG_DEBUG_VM_RB build option
On Fri, Sep 14, 2012 at 3:14 PM, Sasha Levin wrote:
> On 09/04/2012 11:20 AM, Michel Lespinasse wrote:
>> Add a CONFIG_DEBUG_VM_RB build option for the previously existing
>> DEBUG_MM_RB code. Now that Andi Kleen modified it to avoid using
>> recursive algorithms, we can expose it a bit more.
>>
>> Also extend this code to validate_mm() after stack expansion, and to
>> check that the vma's start and last pgoffs have not changed since the
>> nodes were inserted on the anon vma interval tree (as it is important
>> that the nodes be reindexed after each such update).
>
> This patch exposes the following warning:
>
> [ 24.977502] ------------[ cut here ]------------
> [ 24.979089] WARNING: at mm/interval_tree.c:110
> anon_vma_interval_tree_verify+0x81/0xa0()
> [ 24.981765] Pid: 5928, comm: trinity-child37 Tainted: GW
> 3.6.0-rc5-next-20120914-sasha-3-g7deb7fa-dirty #333
> [ 24.985501] Call Trace:
> [ 24.986345] [81224c91] ? anon_vma_interval_tree_verify+0x81/0xa0
> [ 24.988535] [81106766] warn_slowpath_common+0x86/0xb0
> [ 24.990636] [81106855] warn_slowpath_null+0x15/0x20
> [ 24.992658] [81224c91] anon_vma_interval_tree_verify+0x81/0xa0
> [ 24.994980] [8122e6e8] validate_mm+0x58/0x1e0
> [ 24.996772] [8122e934] vma_link+0x94/0xe0
> [ 24.997719] [812315e9] copy_vma+0x279/0x2e0
> [ 24.998522] [8117a7fd] ? trace_hardirqs_off+0xd/0x10
> [ 25.000772] [81232e89] move_vma+0xa9/0x260
> [ 25.002499] [812334b5] sys_mremap+0x475/0x540
> [ 25.004364] [8374b6e8] tracesys+0xe1/0xe6
> [ 25.006108] ---[ end trace 7c901670963aa6e2 ]---
>
> The code line is
>
> WARN_ON_ONCE(node->cached_vma_last != avc_last_pgoff(node));

That's very interesting (and potentially relevant to another bug
that's been reported too).

I'd like to know, what workload did you use that triggered this ?
(I find it hard to test mremap as I don't know of enough users of it)

Thanks,

--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
Re: [PATCH 6/7] mm: add CONFIG_DEBUG_VM_RB build option
On 09/15/2012 12:14 AM, Sasha Levin wrote:
> On 09/04/2012 11:20 AM, Michel Lespinasse wrote:
>> Add a CONFIG_DEBUG_VM_RB build option for the previously existing
>> DEBUG_MM_RB code. Now that Andi Kleen modified it to avoid using
>> recursive algorithms, we can expose it a bit more.
>>
>> Also extend this code to validate_mm() after stack expansion, and to
>> check that the vma's start and last pgoffs have not changed since the
>> nodes were inserted on the anon vma interval tree (as it is important
>> that the nodes be reindexed after each such update).
>
> This patch exposes the following warning:
>
> [ 24.977502] ------------[ cut here ]------------
> [ 24.979089] WARNING: at mm/interval_tree.c:110
> anon_vma_interval_tree_verify+0x81/0xa0()
> [ 24.981765] Pid: 5928, comm: trinity-child37 Tainted: GW
> 3.6.0-rc5-next-20120914-sasha-3-g7deb7fa-dirty #333
> [ 24.985501] Call Trace:
> [ 24.986345] [81224c91] ? anon_vma_interval_tree_verify+0x81/0xa0
> [ 24.988535] [81106766] warn_slowpath_common+0x86/0xb0
> [ 24.990636] [81106855] warn_slowpath_null+0x15/0x20
> [ 24.992658] [81224c91] anon_vma_interval_tree_verify+0x81/0xa0
> [ 24.994980] [8122e6e8] validate_mm+0x58/0x1e0
> [ 24.996772] [8122e934] vma_link+0x94/0xe0
> [ 24.997719] [812315e9] copy_vma+0x279/0x2e0
> [ 24.998522] [8117a7fd] ? trace_hardirqs_off+0xd/0x10
> [ 25.000772] [81232e89] move_vma+0xa9/0x260
> [ 25.002499] [812334b5] sys_mremap+0x475/0x540
> [ 25.004364] [8374b6e8] tracesys+0xe1/0xe6
> [ 25.006108] ---[ end trace 7c901670963aa6e2 ]---
>
> The code line is
>
> WARN_ON_ONCE(node->cached_vma_last != avc_last_pgoff(node));
>

The second WARN in the function also triggers once in a while:

[ 18.360283] ------------[ cut here ]------------
[ 18.360289] WARNING: at mm/interval_tree.c:109 anon_vma_interval_tree_verify+0x36/0xa0()
[ 18.360292] Pid: 5694, comm: trinity-child15 Tainted: GW 3.6.0-rc5-next-20120914-sasha-3-g7deb7fa-dirty #335
[ 18.360293] Call Trace:
[ 18.360297] [81224c26] ? anon_vma_interval_tree_verify+0x36/0xa0
[ 18.360300] [] warn_slowpath_common+0x86/0xb0
[ 18.360303] [] warn_slowpath_null+0x15/0x20
[ 18.360305] [81224c26] anon_vma_interval_tree_verify+0x36/0xa0
[ 18.360309] [] validate_mm+0x58/0x1e0
[ 18.360312] [] vma_link+0x94/0xe0
[ 18.360315] [] copy_vma+0x279/0x2e0
[ 18.360319] [] ? trace_hardirqs_off+0xd/0x10
[ 18.360322] [] move_vma+0xa9/0x260
[ 18.360326] [] sys_mremap+0x475/0x540
[ 18.360330] [] tracesys+0xe1/0xe6
[ 18.360332] ---[ end trace de862a218d00cefd ]---
Re: [PATCH 6/7] mm: add CONFIG_DEBUG_VM_RB build option
On 09/04/2012 11:20 AM, Michel Lespinasse wrote:
> Add a CONFIG_DEBUG_VM_RB build option for the previously existing
> DEBUG_MM_RB code. Now that Andi Kleen modified it to avoid using
> recursive algorithms, we can expose it a bit more.
>
> Also extend this code to validate_mm() after stack expansion, and to
> check that the vma's start and last pgoffs have not changed since the
> nodes were inserted on the anon vma interval tree (as it is important
> that the nodes be reindexed after each such update).

This patch exposes the following warning:

[ 24.977502] ------------[ cut here ]------------
[ 24.979089] WARNING: at mm/interval_tree.c:110 anon_vma_interval_tree_verify+0x81/0xa0()
[ 24.981765] Pid: 5928, comm: trinity-child37 Tainted: GW 3.6.0-rc5-next-20120914-sasha-3-g7deb7fa-dirty #333
[ 24.985501] Call Trace:
[ 24.986345] [81224c91] ? anon_vma_interval_tree_verify+0x81/0xa0
[ 24.988535] [81106766] warn_slowpath_common+0x86/0xb0
[ 24.990636] [81106855] warn_slowpath_null+0x15/0x20
[ 24.992658] [81224c91] anon_vma_interval_tree_verify+0x81/0xa0
[ 24.994980] [8122e6e8] validate_mm+0x58/0x1e0
[ 24.996772] [8122e934] vma_link+0x94/0xe0
[ 24.997719] [812315e9] copy_vma+0x279/0x2e0
[ 24.998522] [8117a7fd] ? trace_hardirqs_off+0xd/0x10
[ 25.000772] [81232e89] move_vma+0xa9/0x260
[ 25.002499] [812334b5] sys_mremap+0x475/0x540
[ 25.004364] [8374b6e8] tracesys+0xe1/0xe6
[ 25.006108] ---[ end trace 7c901670963aa6e2 ]---

The code line is

	WARN_ON_ONCE(node->cached_vma_last != avc_last_pgoff(node));
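
The comparison in that WARN_ON_ONCE() line can be worked through with plain numbers. The sketch below is a user-space illustration only: span_last_pgoff(), span_key_is_stale() and span_demo() are hypothetical names, 4 KiB pages are assumed, and the formula assumes that the last page offset of a vma is vm_pgoff plus its length in pages minus one. It shows a key cached for a 2-page range going stale once the range is moved and grown to 4 pages.

```c
#include <assert.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Page offset of the last page covered by the range [vm_start, vm_end). */
unsigned long span_last_pgoff(unsigned long vm_start, unsigned long vm_end,
			      unsigned long vm_pgoff)
{
	return vm_pgoff + ((vm_end - vm_start) >> TOY_PAGE_SHIFT) - 1;
}

/* Nonzero when a key cached at insertion no longer matches the range. */
int span_key_is_stale(unsigned long cached_last, unsigned long vm_start,
		      unsigned long vm_end, unsigned long vm_pgoff)
{
	return cached_last != span_last_pgoff(vm_start, vm_end, vm_pgoff);
}

/* Worked example: cache the key for 2 pages, then compare against 4 pages. */
int span_demo(void)
{
	unsigned long cached = span_last_pgoff(0x10000, 0x12000, 0x10);

	assert(cached == 0x11);		/* 0x10 + 2 pages - 1 */
	/* The same pgoff with a 4-page range gives 0x13, so 0x11 is stale. */
	assert(span_key_is_stale(cached, 0x40000, 0x44000, 0x10));
	return 0;
}
```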
Re: [PATCH 6/7] mm: add CONFIG_DEBUG_VM_RB build option
On 09/15/2012 12:14 AM, Sasha Levin wrote:
> On 09/04/2012 11:20 AM, Michel Lespinasse wrote:
>> Add a CONFIG_DEBUG_VM_RB build option for the previously existing
>> DEBUG_MM_RB code. Now that Andi Kleen modified it to avoid using
>> recursive algorithms, we can expose it a bit more.
>>
>> Also extend this code to validate_mm() after stack expansion, and to
>> check that the vma's start and last pgoffs have not changed since the
>> nodes were inserted on the anon vma interval tree (as it is important
>> that the nodes be reindexed after each such update).
>
> This patch exposes the following warning:
>
> [ 24.977502] ------------[ cut here ]------------
> [ 24.979089] WARNING: at mm/interval_tree.c:110 anon_vma_interval_tree_verify+0x81/0xa0()
> [ 24.981765] Pid: 5928, comm: trinity-child37 Tainted: GW 3.6.0-rc5-next-20120914-sasha-3-g7deb7fa-dirty #333
> [ 24.985501] Call Trace:
> [ 24.986345] [81224c91] ? anon_vma_interval_tree_verify+0x81/0xa0
> [ 24.988535] [81106766] warn_slowpath_common+0x86/0xb0
> [ 24.990636] [81106855] warn_slowpath_null+0x15/0x20
> [ 24.992658] [81224c91] anon_vma_interval_tree_verify+0x81/0xa0
> [ 24.994980] [8122e6e8] validate_mm+0x58/0x1e0
> [ 24.996772] [8122e934] vma_link+0x94/0xe0
> [ 24.997719] [812315e9] copy_vma+0x279/0x2e0
> [ 24.998522] [8117a7fd] ? trace_hardirqs_off+0xd/0x10
> [ 25.000772] [81232e89] move_vma+0xa9/0x260
> [ 25.002499] [812334b5] sys_mremap+0x475/0x540
> [ 25.004364] [8374b6e8] tracesys+0xe1/0xe6
> [ 25.006108] ---[ end trace 7c901670963aa6e2 ]---
>
> The code line is
>
> 	WARN_ON_ONCE(node->cached_vma_last != avc_last_pgoff(node));

The second WARN in the function also triggers once in a while:

[ 18.360283] ------------[ cut here ]------------
[ 18.360289] WARNING: at mm/interval_tree.c:109 anon_vma_interval_tree_verify+0x36/0xa0()
[ 18.360292] Pid: 5694, comm: trinity-child15 Tainted: GW 3.6.0-rc5-next-20120914-sasha-3-g7deb7fa-dirty #335
[ 18.360293] Call Trace:
[ 18.360297] [81224c26] ? anon_vma_interval_tree_verify+0x36/0xa0
[ 18.360300] [81106746] warn_slowpath_common+0x86/0xb0
[ 18.360303] [81106835] warn_slowpath_null+0x15/0x20
[ 18.360305] [81224c26] anon_vma_interval_tree_verify+0x36/0xa0
[ 18.360309] [8122e6c8] validate_mm+0x58/0x1e0
[ 18.360312] [8122e914] vma_link+0x94/0xe0
[ 18.360315] [812315c9] copy_vma+0x279/0x2e0
[ 18.360319] [8117a7dd] ? trace_hardirqs_off+0xd/0x10
[ 18.360322] [81232e69] move_vma+0xa9/0x260
[ 18.360326] [81233495] sys_mremap+0x475/0x540
[ 18.360330] [8374b6e8] tracesys+0xe1/0xe6
[ 18.360332] ---[ end trace de862a218d00cefd ]---
Re: [PATCH 6/7] mm: add CONFIG_DEBUG_VM_RB build option
On Fri, Sep 14, 2012 at 3:14 PM, Sasha Levin <levinsasha...@gmail.com> wrote:
> On 09/04/2012 11:20 AM, Michel Lespinasse wrote:
>> Add a CONFIG_DEBUG_VM_RB build option for the previously existing
>> DEBUG_MM_RB code. Now that Andi Kleen modified it to avoid using
>> recursive algorithms, we can expose it a bit more.
>>
>> Also extend this code to validate_mm() after stack expansion, and to
>> check that the vma's start and last pgoffs have not changed since the
>> nodes were inserted on the anon vma interval tree (as it is important
>> that the nodes be reindexed after each such update).
>
> This patch exposes the following warning:
>
> [ 24.977502] ------------[ cut here ]------------
> [ 24.979089] WARNING: at mm/interval_tree.c:110 anon_vma_interval_tree_verify+0x81/0xa0()
> [ 24.981765] Pid: 5928, comm: trinity-child37 Tainted: GW 3.6.0-rc5-next-20120914-sasha-3-g7deb7fa-dirty #333
> [ 24.985501] Call Trace:
> [ 24.986345] [81224c91] ? anon_vma_interval_tree_verify+0x81/0xa0
> [ 24.988535] [81106766] warn_slowpath_common+0x86/0xb0
> [ 24.990636] [81106855] warn_slowpath_null+0x15/0x20
> [ 24.992658] [81224c91] anon_vma_interval_tree_verify+0x81/0xa0
> [ 24.994980] [8122e6e8] validate_mm+0x58/0x1e0
> [ 24.996772] [8122e934] vma_link+0x94/0xe0
> [ 24.997719] [812315e9] copy_vma+0x279/0x2e0
> [ 24.998522] [8117a7fd] ? trace_hardirqs_off+0xd/0x10
> [ 25.000772] [81232e89] move_vma+0xa9/0x260
> [ 25.002499] [812334b5] sys_mremap+0x475/0x540
> [ 25.004364] [8374b6e8] tracesys+0xe1/0xe6
> [ 25.006108] ---[ end trace 7c901670963aa6e2 ]---
>
> The code line is
>
> 	WARN_ON_ONCE(node->cached_vma_last != avc_last_pgoff(node));

That's very interesting (and potentially relevant to another bug that's
been reported too).

I'd like to know, what workload did you use that triggered this ?
(I find it hard to test mremap as I don't know of enough users of it)

Thanks,

--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
Re: [PATCH 6/7] mm: add CONFIG_DEBUG_VM_RB build option
On Fri, Sep 14, 2012 at 3:46 PM, Michel Lespinasse <wal...@google.com> wrote:
> On Fri, Sep 14, 2012 at 3:14 PM, Sasha Levin <levinsasha...@gmail.com> wrote:
>> On 09/04/2012 11:20 AM, Michel Lespinasse wrote:
>>> Add a CONFIG_DEBUG_VM_RB build option for the previously existing
>>> DEBUG_MM_RB code. Now that Andi Kleen modified it to avoid using
>>> recursive algorithms, we can expose it a bit more.
>>>
>>> Also extend this code to validate_mm() after stack expansion, and to
>>> check that the vma's start and last pgoffs have not changed since the
>>> nodes were inserted on the anon vma interval tree (as it is important
>>> that the nodes be reindexed after each such update).
>>
>> This patch exposes the following warning:
>>
>> [ 24.977502] ------------[ cut here ]------------
>> [ 24.979089] WARNING: at mm/interval_tree.c:110 anon_vma_interval_tree_verify+0x81/0xa0()
>> [ 24.981765] Pid: 5928, comm: trinity-child37 Tainted: GW 3.6.0-rc5-next-20120914-sasha-3-g7deb7fa-dirty #333
>> [ 24.985501] Call Trace:
>> [ 24.986345] [81224c91] ? anon_vma_interval_tree_verify+0x81/0xa0
>> [ 24.988535] [81106766] warn_slowpath_common+0x86/0xb0
>> [ 24.990636] [81106855] warn_slowpath_null+0x15/0x20
>> [ 24.992658] [81224c91] anon_vma_interval_tree_verify+0x81/0xa0
>> [ 24.994980] [8122e6e8] validate_mm+0x58/0x1e0
>> [ 24.996772] [8122e934] vma_link+0x94/0xe0
>> [ 24.997719] [812315e9] copy_vma+0x279/0x2e0
>> [ 24.998522] [8117a7fd] ? trace_hardirqs_off+0xd/0x10
>> [ 25.000772] [81232e89] move_vma+0xa9/0x260
>> [ 25.002499] [812334b5] sys_mremap+0x475/0x540
>> [ 25.004364] [8374b6e8] tracesys+0xe1/0xe6
>> [ 25.006108] ---[ end trace 7c901670963aa6e2 ]---
>>
>> The code line is
>>
>> 	WARN_ON_ONCE(node->cached_vma_last != avc_last_pgoff(node));
>
> That's very interesting (and potentially relevant to another bug that's
> been reported too).
>
> I'd like to know, what workload did you use that triggered this ?
> (I find it hard to test mremap as I don't know of enough users of it)

All right. Hugh managed to reproduce the issue on his suse laptop, and
I came up with a fix.

The problem was that in mremap, the new vma's vm_{start,end,pgoff}
fields need to be updated before calling anon_vma_clone() so that the
new vma will be properly indexed.

Patch attached. I expect this should also explain Jiri's reported
failure involving splitting THP pages during mremap(), even though we
did not manage to reproduce that one.

--------8<--------

From: Michel Lespinasse <wal...@google.com>
Date: Fri, 14 Sep 2012 16:43:49 -0700
Subject: [PATCH] mm anon rmap: in mremap, set the new vma's position before anon_vma_clone()

anon_vma_clone() expects new_vma->vm_{start,end,pgoff} to be correctly set
so that the new vma can be indexed on the anon interval tree.
copy_vma() was failing to do that, which broke mremap().

Signed-off-by: Michel Lespinasse <wal...@google.com>
---
 mm/mmap.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index cc8c64077a42..7e672800b5d4 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2446,16 +2446,16 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 		new_vma = kmem_cache_alloc(vm_area_cachep, GFP_KERNEL);
 		if (new_vma) {
 			*new_vma = *vma;
+			new_vma->vm_start = addr;
+			new_vma->vm_end = addr + len;
+			new_vma->vm_pgoff = pgoff;
 			pol = mpol_dup(vma_policy(vma));
 			if (IS_ERR(pol))
 				goto out_free_vma;
+			vma_set_policy(new_vma, pol);
 			INIT_LIST_HEAD(&new_vma->anon_vma_chain);
 			if (anon_vma_clone(new_vma, vma))
 				goto out_free_mempol;
-			vma_set_policy(new_vma, pol);
-			new_vma->vm_start = addr;
-			new_vma->vm_end = addr + len;
-			new_vma->vm_pgoff = pgoff;
 			if (new_vma->vm_file) {
 				get_file(new_vma->vm_file);

--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
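The ordering problem described in this message can be illustrated with a small userspace C sketch. It is a simplified model under stated assumptions, not the kernel's implementation: `toy_vma`, `toy_entry`, `toy_index()` and the two `toy_copy_*` functions are hypothetical names, and "indexing" is reduced to remembering the key a vma had when it was indexed, standing in for the position a node gets in the anon interval tree during anon_vma_clone().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for struct vm_area_struct. */
struct toy_vma {
	unsigned long vm_start, vm_end, vm_pgoff;
};

/*
 * Hypothetical index entry: records the key the vma had when it was
 * indexed, standing in for a node's position in the interval tree.
 */
struct toy_entry {
	const struct toy_vma *vma;
	unsigned long indexed_pgoff;
};

static void toy_index(struct toy_entry *e, const struct toy_vma *vma)
{
	e->vma = vma;
	e->indexed_pgoff = vma->vm_pgoff;
}

static bool toy_entry_is_current(const struct toy_entry *e)
{
	return e->indexed_pgoff == e->vma->vm_pgoff;
}

static void toy_set_position(struct toy_vma *vma, unsigned long addr,
			     unsigned long len, unsigned long pgoff)
{
	vma->vm_start = addr;
	vma->vm_end = addr + len;
	vma->vm_pgoff = pgoff;
}

/* Old ordering: index the copy first, then move it to its new position. */
bool toy_copy_index_then_move(const struct toy_vma *old, unsigned long addr,
			      unsigned long len, unsigned long pgoff)
{
	struct toy_vma new_vma = *old;
	struct toy_entry e;

	toy_index(&e, &new_vma);	/* indexed under the OLD keys */
	toy_set_position(&new_vma, addr, len, pgoff);
	return toy_entry_is_current(&e);
}

/* Fixed ordering: set the new position first, then index. */
bool toy_copy_move_then_index(const struct toy_vma *old, unsigned long addr,
			      unsigned long len, unsigned long pgoff)
{
	struct toy_vma new_vma = *old;
	struct toy_entry e;

	toy_set_position(&new_vma, addr, len, pgoff);
	toy_index(&e, &new_vma);	/* indexed under the NEW keys */
	return toy_entry_is_current(&e);
}
```

In this model the old ordering only looks correct when the new pgoff happens to equal the old one, which may be part of why the problem went unnoticed until the debug check was enabled.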
[PATCH 6/7] mm: add CONFIG_DEBUG_VM_RB build option
Add a CONFIG_DEBUG_VM_RB build option for the previously existing
DEBUG_MM_RB code. Now that Andi Kleen modified it to avoid using
recursive algorithms, we can expose it a bit more.

Also extend this code to validate_mm() after stack expansion, and to
check that the vma's start and last pgoffs have not changed since the
nodes were inserted on the anon vma interval tree (as it is important
that the nodes be reindexed after each such update).

Signed-off-by: Michel Lespinasse <wal...@google.com>
---
 include/linux/mm.h   |    3 +++
 include/linux/rmap.h |    3 +++
 lib/Kconfig.debug    |    9 +++++++++
 mm/interval_tree.c   |   41 ++++++++++++++++++++++++++++++++++++++++-
 mm/mmap.c            |   19 +++++++++----------
 5 files changed, 64 insertions(+), 11 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 19d63ec2cbbb..1a2b1a44bd4e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1367,6 +1367,9 @@ struct anon_vma_chain *anon_vma_interval_tree_iter_first(
 	struct rb_root *root, unsigned long start, unsigned long last);
 struct anon_vma_chain *anon_vma_interval_tree_iter_next(
 	struct anon_vma_chain *node, unsigned long start, unsigned long last);
+#ifdef CONFIG_DEBUG_VM_RB
+void anon_vma_interval_tree_verify(struct anon_vma_chain *node);
+#endif

 #define anon_vma_interval_tree_foreach(avc, root, start, last) \
 	for (avc = anon_vma_interval_tree_iter_first(root, start, last); \
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index dce44f7d3ed8..b2cce644ffc7 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -66,6 +66,9 @@ struct anon_vma_chain {
 	struct list_head same_vma; /* locked by mmap_sem & page_table_lock */
 	struct rb_node rb; /* locked by anon_vma->mutex */
 	unsigned long rb_subtree_last;
+#ifdef CONFIG_DEBUG_VM_RB
+	unsigned long cached_vma_start, cached_vma_last;
+#endif
 };

 #ifdef CONFIG_MMU
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index eba4b0961187..d261b4555dc5 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -781,6 +781,15 @@ config DEBUG_VM

 	  If unsure, say N.

+config DEBUG_VM_RB
+	bool "Debug VM red-black trees"
+	depends on DEBUG_VM
+	help
+	  Enable this to turn on more extended checks in the virtual-memory
+	  system that may impact performance.
+
+	  If unsure, say N.
+
 config DEBUG_VIRTUAL
 	bool "Debug VM translations"
 	depends on DEBUG_KERNEL && X86
diff --git a/mm/interval_tree.c b/mm/interval_tree.c
index f7c72cd35e1d..4a5822a586e6 100644
--- a/mm/interval_tree.c
+++ b/mm/interval_tree.c
@@ -70,4 +70,43 @@ static inline unsigned long avc_last_pgoff(struct anon_vma_chain *avc)
 }

 INTERVAL_TREE_DEFINE(struct anon_vma_chain, rb, unsigned long, rb_subtree_last,
-		     avc_start_pgoff, avc_last_pgoff,, anon_vma_interval_tree)
+		     avc_start_pgoff, avc_last_pgoff,
+		     static inline, __anon_vma_interval_tree)
+
+void anon_vma_interval_tree_insert(struct anon_vma_chain *node,
+				   struct rb_root *root)
+{
+#ifdef CONFIG_DEBUG_VM_RB
+	node->cached_vma_start = avc_start_pgoff(node);
+	node->cached_vma_last = avc_last_pgoff(node);
+#endif
+	__anon_vma_interval_tree_insert(node, root);
+}
+
+void anon_vma_interval_tree_remove(struct anon_vma_chain *node,
+				   struct rb_root *root)
+{
+	__anon_vma_interval_tree_remove(node, root);
+}
+
+struct anon_vma_chain *
+anon_vma_interval_tree_iter_first(struct rb_root *root,
+				  unsigned long first, unsigned long last)
+{
+	return __anon_vma_interval_tree_iter_first(root, first, last);
+}
+
+struct anon_vma_chain *
+anon_vma_interval_tree_iter_next(struct anon_vma_chain *node,
+				 unsigned long first, unsigned long last)
+{
+	return __anon_vma_interval_tree_iter_next(node, first, last);
+}
+
+#ifdef CONFIG_DEBUG_VM_RB
+void anon_vma_interval_tree_verify(struct anon_vma_chain *node)
+{
+	WARN_ON_ONCE(node->cached_vma_start != avc_start_pgoff(node));
+	WARN_ON_ONCE(node->cached_vma_last != avc_last_pgoff(node));
+}
+#endif
diff --git a/mm/mmap.c b/mm/mmap.c
index 1a6afdb5194a..884bda4cd3ea 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -51,12 +51,6 @@ static void unmap_region(struct mm_struct *mm,
 		struct vm_area_struct *vma, struct vm_area_struct *prev,
 		unsigned long start, unsigned long end);

-/*
- * WARNING: the debugging will use recursive algorithms so never enable this
- * unless you know what you are doing.
- */
-#undef DEBUG_MM_RB
-
 /* description of effects of mapping type and prot in current implementation.
  * this is due to the limited x86 page protection hardware. The expected
  * behavior is in parens:
@@ -306,7 +300,7 @@ out:
 	return retval;
 }

-#ifdef DEBUG_MM_RB
+#ifdef