Re: [lkp-robot] [mm] 1be7107fbe: kernel_BUG_at_mm/mmap.c
On 06/22, Hugh Dickins wrote: > > On Thu, 22 Jun 2017, Oleg Nesterov wrote: > > > > Something like the patch below? Yes, I thought about this too. > > Yes, that patch (times 11 for all the architectures) ^^ Yes, yes, this is clear to me. But only after you have already explained this in your previous email ;) > But my own preference this morning is to do nothing, until we hear > more complaints and can classify them as genuine userspace breakage, > as opposed to testcases surprised by a new kernel implementation. OK. Agreed. Lets wait for the "real" bug report. FYI. I am still investigating that redhat internal bug report. And yes, it was the real application. But. I still think that it fails by another reason, just the test-case they provided doesn't match the reality and it hits another (this) problem by accident. Oleg.
Re: [lkp-robot] [mm] 1be7107fbe: kernel_BUG_at_mm/mmap.c
On Thu, 22 Jun 2017, Oleg Nesterov wrote: > On 06/21, Hugh Dickins wrote: > > > > On Wed, 21 Jun 2017, Linus Torvalds wrote: > > > On Wed, Jun 21, 2017 at 1:56 PM, Oleg Nesterov wrote: > > > > > > > > I understand. My point is that this check was invalidated by > > > > stack-guard-page > > > > a long ago, and this means that we add the user-visible change now. > > > > > > Yeah. I guess we could consider it an *old* regression that got fixed, > > > but if people started relying on the regression... > > > > > > >> Do you have a pointer to the report for this regression? I must have > > > >> missed it. > > > > > > > > See http://marc.info/?t=14979452301&r=1&w=2 > > > > > > Ok. > > > > > > And thinking about it, while that is a silly test-case, the notion of > > > "create top-down segment, then start populating it _before_ moving the > > > stack pointer into it" is actually perfectly valid. > > > > > > So I guess checking against the stack pointer is wrong in that case - > > > at least if the stack pointer isn't inside that vma to begin with. > > > > > > So yes, removing that check looks like the right thing to do for now. > > > > > > Do you want to send me the patch if you already have a commit message etc? > > > > I have a bit of a bad feeling about this. > > > > Perhaps it's just sentimental attachment to all those weird > > and ancient stack pointer checks in arch//fault.c. > > > > We have been inconsistent: cris frv m32r m68k microblaze mn10300 > > openrisc powerpc tile um x86 have such checks, the others don't. > > So that's a good reason to delete them. > > OK, I didn't bother to check other acrhitectures, thanks... > > > But at least at the moment those checks impose some sanity: > > just a page less than we had imagined for several years. > > Once we remove them, they cannot go back. Should we now > > complicate them with an extra page of slop? > > Something like the patch below? Yes, I thought about this too. Yes, that patch (times 11 for all the architectures) would be a good conservative choice: imposing the traditional sanity check, but weakened by one page to match what we've inadvertently been doing for the last four years. Would deserve a comment (since it makes no sense in any tree by itself), but unfair to ask you to write that: I must get this mail off before a meeting, can't think what to say now. But my own preference this morning is to do nothing, until we hear more complaints and can classify them as genuine userspace breakage, as opposed to testcases surprised by a new kernel implementation. Hugh > > I simply do not know. Honestly, I do not even know why MAP_GROWSDOWN > exists. I mean, I do not understand how user-space can actually use it > to get auto-growing, the usage of MAP_GROWSDOWN in (say) criu is clear. > The main thread's stack can grow, but this is only because it is placed > at the right place, above mm->mmap_base in case of top-down layout. > > > I'm not entirely persuaded by your pre-population argument: > > it's perfectly possible to prepare a MAP_GROWSDOWN area with > > an initial size, that's populated in a normal way, before handing > > off for stack expansion - isn't it? > > Exactly. > > > I'd be interested to hear more about that (redhat internal) bug > > report that Oleg mentions: whether it gives stronger grounds for > > making this sudden change than the CRIU testcase. > > Probably not. Well, the customer reported multiple problems, but most > of them were caused by rhel-specific bugs. As for "MAP_GROWSDOWN does > not grow", most probably this was another test-case, not the real > application. I will ask and report back if this is not true. > > In short, I agree with any decision. Even with "we do not care if we > break some artificial test-cases". > > Oleg. > --- > > --- a/arch/x86/mm/fault.c > +++ b/arch/x86/mm/fault.c > @@ -1409,7 +1409,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long > error_code, > bad_area(regs, error_code, address); > return; > } > - if (error_code & PF_USER) { > + if ((error_code & PF_USER) && (address + PAGE_SIZE < vma->vm_start)) { > /* >* Accessing the stack below %sp is always a bug. >* The large cushion allows instructions like enter
Re: [lkp-robot] [mm] 1be7107fbe: kernel_BUG_at_mm/mmap.c
On 06/21, Hugh Dickins wrote: > > On Wed, 21 Jun 2017, Linus Torvalds wrote: > > On Wed, Jun 21, 2017 at 1:56 PM, Oleg Nesterov wrote: > > > > > > I understand. My point is that this check was invalidated by > > > stack-guard-page > > > a long ago, and this means that we add the user-visible change now. > > > > Yeah. I guess we could consider it an *old* regression that got fixed, > > but if people started relying on the regression... > > > > >> Do you have a pointer to the report for this regression? I must have > > >> missed it. > > > > > > See http://marc.info/?t=14979452301&r=1&w=2 > > > > Ok. > > > > And thinking about it, while that is a silly test-case, the notion of > > "create top-down segment, then start populating it _before_ moving the > > stack pointer into it" is actually perfectly valid. > > > > So I guess checking against the stack pointer is wrong in that case - > > at least if the stack pointer isn't inside that vma to begin with. > > > > So yes, removing that check looks like the right thing to do for now. > > > > Do you want to send me the patch if you already have a commit message etc? > > I have a bit of a bad feeling about this. > > Perhaps it's just sentimental attachment to all those weird > and ancient stack pointer checks in arch//fault.c. > > We have been inconsistent: cris frv m32r m68k microblaze mn10300 > openrisc powerpc tile um x86 have such checks, the others don't. > So that's a good reason to delete them. OK, I didn't bother to check other acrhitectures, thanks... > But at least at the moment those checks impose some sanity: > just a page less than we had imagined for several years. > Once we remove them, they cannot go back. Should we now > complicate them with an extra page of slop? Something like the patch below? Yes, I thought about this too. I simply do not know. Honestly, I do not even know why MAP_GROWSDOWN exists. I mean, I do not understand how user-space can actually use it to get auto-growing, the usage of MAP_GROWSDOWN in (say) criu is clear. The main thread's stack can grow, but this is only because it is placed at the right place, above mm->mmap_base in case of top-down layout. > I'm not entirely persuaded by your pre-population argument: > it's perfectly possible to prepare a MAP_GROWSDOWN area with > an initial size, that's populated in a normal way, before handing > off for stack expansion - isn't it? Exactly. > I'd be interested to hear more about that (redhat internal) bug > report that Oleg mentions: whether it gives stronger grounds for > making this sudden change than the CRIU testcase. Probably not. Well, the customer reported multiple problems, but most of them were caused by rhel-specific bugs. As for "MAP_GROWSDOWN does not grow", most probably this was another test-case, not the real application. I will ask and report back if this is not true. In short, I agree with any decision. Even with "we do not care if we break some artificial test-cases". Oleg. --- --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -1409,7 +1409,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code, bad_area(regs, error_code, address); return; } - if (error_code & PF_USER) { + if ((error_code & PF_USER) && (address + PAGE_SIZE < vma->vm_start)) { /* * Accessing the stack below %sp is always a bug. * The large cushion allows instructions like enter
Re: [lkp-robot] [mm] 1be7107fbe: kernel_BUG_at_mm/mmap.c
On 06/22/2017 04:07 AM, Hugh Dickins wrote: On Wed, 21 Jun 2017, Linus Torvalds wrote: On Wed, Jun 21, 2017 at 1:56 PM, Oleg Nesterov wrote: I understand. My point is that this check was invalidated by stack-guard-page a long ago, and this means that we add the user-visible change now. Yeah. I guess we could consider it an *old* regression that got fixed, but if people started relying on the regression... Do you have a pointer to the report for this regression? I must have missed it. See http://marc.info/?t=14979452301&r=1&w=2 Ok. And thinking about it, while that is a silly test-case, the notion of "create top-down segment, then start populating it _before_ moving the stack pointer into it" is actually perfectly valid. So I guess checking against the stack pointer is wrong in that case - at least if the stack pointer isn't inside that vma to begin with. So yes, removing that check looks like the right thing to do for now. Do you want to send me the patch if you already have a commit message etc? I have a bit of a bad feeling about this. Perhaps it's just sentimental attachment to all those weird and ancient stack pointer checks in arch//fault.c. We have been inconsistent: cris frv m32r m68k microblaze mn10300 openrisc powerpc tile um x86 have such checks, the others don't. So that's a good reason to delete them. But at least at the moment those checks impose some sanity: just a page less than we had imagined for several years. Once we remove them, they cannot go back. Should we now complicate them with an extra page of slop? I'm not entirely persuaded by your pre-population argument: it's perfectly possible to prepare a MAP_GROWSDOWN area with an initial size, that's populated in a normal way, before handing off for stack expansion - isn't it? I'd be interested to hear more about that (redhat internal) bug report that Oleg mentions: whether it gives stronger grounds for making this sudden change than the CRIU testcase. Well, if all the deal is in CRIU testcase - it can be easily reworked. The question - will it break anything else? Maybe it's better to disable this check on the release and enable it back for v4.13 kernel, so if it'll break some user-space, it'll be caught on linux-next. I can go ahead and create a patch if Oleg is not there at the moment - but I might prefer his or your name on it - particularly if we're rushing it in before consulting the arch maintainers whose work we would be deleting. Queasily, Hugh -- Dmitry
Re: [lkp-robot] [mm] 1be7107fbe: kernel_BUG_at_mm/mmap.c
On Wed, 21 Jun 2017, Oleg Nesterov wrote: > On 06/21, Linus Torvalds wrote: > > > > Hugh, Michal - I also merged Helge's drop-up cleanup, is there > > anything I've missed? I think Oleg had something, but I can't recall > > right now, and I might just have missed it. > > Well, I meant, perhaps we need a bit more changes to ensure that a new > GROWSDOWN vma can't come without a gap below. But this is really minor, > we can do this later even if I am right. I'm mortified. At last I understand you, and see that you spelt it out perfectly clearly in your "unmapped_gap" mail earlier in another thread, when I was in a rush to prioritize what bugs needed looking at first, and brusquely persuaded you to say that this is only a minor defect. Well, yes, it's not a new vulnerability and it's not a new regression, and the users of MAP_GROWSDOWN are few and far between, and often the assignment of holes will work out just fine; but it's an embarrassing oversight on my part, that everything was geared to inflating the size of the existing VM_GROWS vmas, completely forgetting to inflate the size of the area to be added. Without reading you thoroughly (and all the fault mine not yours), I had thought you were referring to the way that a MAP_GROWSDOWN area may be assigned a place in which the stack_guard_gap would immediately prevent it from growing down afterwards (and I'm not sure what to do about that). But your point is that it may be assigned a place in which there is not even a stack_guard_gap below it. (And so the bug that trinity found would not even depend upon MAP_FIXED.) I'll go back to read you again and think on the best way to correct that, I hope it won't need lots of plumbing through different levels. Hugh
Re: [lkp-robot] [mm] 1be7107fbe: kernel_BUG_at_mm/mmap.c
On Wed, 21 Jun 2017, Linus Torvalds wrote: > On Wed, Jun 21, 2017 at 1:56 PM, Oleg Nesterov wrote: > > > > I understand. My point is that this check was invalidated by > > stack-guard-page > > a long ago, and this means that we add the user-visible change now. > > Yeah. I guess we could consider it an *old* regression that got fixed, > but if people started relying on the regression... > > >> Do you have a pointer to the report for this regression? I must have > >> missed it. > > > > See http://marc.info/?t=14979452301&r=1&w=2 > > Ok. > > And thinking about it, while that is a silly test-case, the notion of > "create top-down segment, then start populating it _before_ moving the > stack pointer into it" is actually perfectly valid. > > So I guess checking against the stack pointer is wrong in that case - > at least if the stack pointer isn't inside that vma to begin with. > > So yes, removing that check looks like the right thing to do for now. > > Do you want to send me the patch if you already have a commit message etc? I have a bit of a bad feeling about this. Perhaps it's just sentimental attachment to all those weird and ancient stack pointer checks in arch//fault.c. We have been inconsistent: cris frv m32r m68k microblaze mn10300 openrisc powerpc tile um x86 have such checks, the others don't. So that's a good reason to delete them. But at least at the moment those checks impose some sanity: just a page less than we had imagined for several years. Once we remove them, they cannot go back. Should we now complicate them with an extra page of slop? I'm not entirely persuaded by your pre-population argument: it's perfectly possible to prepare a MAP_GROWSDOWN area with an initial size, that's populated in a normal way, before handing off for stack expansion - isn't it? I'd be interested to hear more about that (redhat internal) bug report that Oleg mentions: whether it gives stronger grounds for making this sudden change than the CRIU testcase. I can go ahead and create a patch if Oleg is not there at the moment - but I might prefer his or your name on it - particularly if we're rushing it in before consulting the arch maintainers whose work we would be deleting. Queasily, Hugh
Re: [lkp-robot] [mm] 1be7107fbe: kernel_BUG_at_mm/mmap.c
On Wed, Jun 21, 2017 at 1:56 PM, Oleg Nesterov wrote: > > I understand. My point is that this check was invalidated by stack-guard-page > a long ago, and this means that we add the user-visible change now. Yeah. I guess we could consider it an *old* regression that got fixed, but if people started relying on the regression... >> Do you have a pointer to the report for this regression? I must have missed >> it. > > See http://marc.info/?t=14979452301&r=1&w=2 Ok. And thinking about it, while that is a silly test-case, the notion of "create top-down segment, then start populating it _before_ moving the stack pointer into it" is actually perfectly valid. So I guess checking against the stack pointer is wrong in that case - at least if the stack pointer isn't inside that vma to begin with. So yes, removing that check looks like the right thing to do for now. Do you want to send me the patch if you already have a commit message etc? Linus
Re: [lkp-robot] [mm] 1be7107fbe: kernel_BUG_at_mm/mmap.c
On 06/21, Linus Torvalds wrote: > > On Wed, Jun 21, 2017 at 1:27 PM, Oleg Nesterov wrote: > > > > Now __do_page_fault() tries to expand the stack itself, and this check > > fails. > > But we want that check to trigger and cause the access to fail. > Accessing the stack below the stack pointer is wrong. I understand. My point is that this check was invalidated by stack-guard-page a long ago, and this means that we add the user-visible change now. > Do you have a pointer to the report for this regression? I must have missed > it. See http://marc.info/?t=14979452301&r=1&w=2 this thread is a bit confusing, the most relevant emails are http://marc.info/?l=linux-kernel&m=149806256621440&w=2 (with test case) http://marc.info/?l=linux-kernel&m=149806452322233&w=2 Actually, I got another (redhat internal) bug report after this discussion, and at first glance this is the same thing. Just in case, I agree that mmap(MAP_GROWSDOWN) is mostly useless, perhaps we do not even care. But still this is regression. Oleg.
Re: [lkp-robot] [mm] 1be7107fbe: kernel_BUG_at_mm/mmap.c
On Wed, Jun 21, 2017 at 1:27 PM, Oleg Nesterov wrote: > > Now __do_page_fault() tries to expand the stack itself, and this check > fails. But we want that check to trigger and cause the access to fail. Accessing the stack below the stack pointer is wrong. Do you have a pointer to the report for this regression? I must have missed it. Linus
Re: [lkp-robot] [mm] 1be7107fbe: kernel_BUG_at_mm/mmap.c
On 06/21, Linus Torvalds wrote: > > On Wed, Jun 21, 2017 at 12:33 PM, Oleg Nesterov wrote: > > - if (unlikely(address + 65536 + 32 * sizeof(unsigned long) < > > regs->sp)) { > > +if (0) if (unlikely(address + 65536 + 32 * sizeof(unsigned long) < > > regs->sp)) { > > This smells bad. Yes. > That test is not about grow-down or even the guard page. That test is > that it's always wrong to grow down the stack below %esp. Sure. but let me repeat that this test was essentially dismissed when the stack guard page was introduced. Simply because do_page_pault() never hits (before the recent patch) this need-to-grow-VM_GROWSDOWN-vma path if the stack grows by less than PAGE_SIZE. IOP. Suppose that an application does char * p = mmap(MAP_GROWSDOWN); for (;;) *p-- = 'x'; before the "larger stack guard gap, between vmas" change the stack was enlarged by do_anonymous_page(), __do_page_fault() didn't hit this path. Now __do_page_fault() tries to expand the stack itself, and this check fails. Oleg.
Re: [lkp-robot] [mm] 1be7107fbe: kernel_BUG_at_mm/mmap.c
On Wed, 21 Jun 2017, Linus Torvalds wrote: > On Tue, Jun 20, 2017 at 7:41 PM, Hugh Dickins wrote: > > On Wed, 21 Jun 2017, kernel test robot wrote: > > > > Thanks for the report: yes, this is the same one as Dave Jones > > found yesterday, which is fixed by this patch posted last night: > > .. and that patch is in current -git now, so hopefully we're all good. > > Hugh, Michal - I also merged Helge's drop-up cleanup, is there > anything I've missed? I think Oleg had something, but I can't recall > right now, and I might just have missed it. No, the tree looks good now, thanks. You're right that Oleg raised a concern about how mmap(MAP_GROWSDOWN) behaves, and is wondering about how to make it more sensible: but that's neither a vulnerability nor a regression - he agreed minor, to think about another time. He's also been fielding a concern from the CRIU people, we're hoping that's just something they need to fix at their end - ah, no, I see Oleg has just replied ahead of me, not digested yet, I'll look into that later... Hugh
Re: [lkp-robot] [mm] 1be7107fbe: kernel_BUG_at_mm/mmap.c
On Wed, Jun 21, 2017 at 12:33 PM, Oleg Nesterov wrote: > - if (unlikely(address + 65536 + 32 * sizeof(unsigned long) < > regs->sp)) { > +if (0) if (unlikely(address + 65536 + 32 * sizeof(unsigned long) < > regs->sp)) { This smells bad. That test is not about grow-down or even the guard page. That test is that it's always wrong to grow down the stack below %esp. Except we allow some slop, because certain instructions take the page fault before actually updating %rsp. So that patch is not correct. We want a page fault (and *not* expand the stack) if somebody accesses below the stack pointer. If we had a regression, it's due to something else. Linus
Re: [lkp-robot] [mm] 1be7107fbe: kernel_BUG_at_mm/mmap.c
On 06/21, Linus Torvalds wrote: > > Hugh, Michal - I also merged Helge's drop-up cleanup, is there > anything I've missed? I think Oleg had something, but I can't recall > right now, and I might just have missed it. Well, I meant, perhaps we need a bit more changes to ensure that a new GROWSDOWN vma can't come without a gap below. But this is really minor, we can do this later even if I am right. However, there is another regression reported by Cyrill. Fixed by the patch below. And yes, I think this check should either go away, or we need to make it more clever. In short, the vma created by mmap(MAP_GROWSDOWN) does not grow down automatically, because of this check. This worked before, because with the stack guard page at ->vm_start __do_page_fault() hits this expand-stack path only if the stack grows by more than PAGE_SIZE, now it is called every time. I'll send the patch tomorrow if nobody else does this before. Oleg. --- diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 8ad91a0..edc5d68 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -1416,7 +1416,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code, * and pusha to work. ("enter $65535, $31" pushes * 32 pointers and then decrements %sp by 65535.) */ - if (unlikely(address + 65536 + 32 * sizeof(unsigned long) < regs->sp)) { +if (0) if (unlikely(address + 65536 + 32 * sizeof(unsigned long) < regs->sp)) { bad_area(regs, error_code, address); return; }
Re: [lkp-robot] [mm] 1be7107fbe: kernel_BUG_at_mm/mmap.c
On Tue, Jun 20, 2017 at 7:41 PM, Hugh Dickins wrote: > On Wed, 21 Jun 2017, kernel test robot wrote: > > Thanks for the report: yes, this is the same one as Dave Jones > found yesterday, which is fixed by this patch posted last night: .. and that patch is in current -git now, so hopefully we're all good. Hugh, Michal - I also merged Helge's drop-up cleanup, is there anything I've missed? I think Oleg had something, but I can't recall right now, and I might just have missed it. Linus
Re: [lkp-robot] [mm] 1be7107fbe: kernel_BUG_at_mm/mmap.c
On Wed, 21 Jun 2017, kernel test robot wrote: > > FYI, we noticed the following commit: > > commit: 1be7107fbe18eed3e319a6c3e83c78254b693acb ("mm: larger stack guard > gap, between vmas") > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master > > in testcase: trinity > with following parameters: > > runtime: 300s > > test-description: Trinity is a linux system call fuzz tester. > test-url: http://codemonkey.org.uk/projects/trinity/ > > > on test machine: qemu-system-i386 -enable-kvm -smp 2 -m 320M > > caused below changes (please refer to attached dmesg/kmsg for entire > log/backtrace): > > > +--+++ > | | 1132d5e7b6 | 1be7107fbe | > +--+++ > | boot_successes | 5 | 4 | > | boot_failures| 0 | 4 | > | kernel_BUG_at_mm/mmap.c | 0 | 4 | > | invalid_opcode:#[##] | 0 | 4 | > | EIP:unmapped_area_topdown| 0 | 4 | > | Kernel_panic-not_syncing:Fatal_exception | 0 | 4 | > +--+++ > > > > [ 87.792040] kernel BUG at mm/mmap.c:1963! > [ 87.793442] invalid opcode: [#1] DEBUG_PAGEALLOC > [ 87.794812] Modules linked in: > [ 87.795849] CPU: 0 PID: 424 Comm: trinity-c2 Not tainted > 4.12.0-rc5-00285-g1be7107f #1 > [ 87.798138] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > 1.9.3-20161025_171302-gandalf 04/01/2014 > [ 87.800657] task: ce6177c0 task.stack: cd0fc000 > [ 87.801877] EIP: unmapped_area_topdown+0x14b/0x15c > [ 87.803063] EFLAGS: 00010206 CPU: 0 > [ 87.804075] EAX: EBX: b520 ECX: EDX: b4feb000 > [ 87.805469] ESI: 00201000 EDI: b4feb000 EBP: cd0fde84 ESP: cd0fde60 > [ 87.806872] DS: 007b ES: 007b FS: GS: 0033 SS: 0068 > [ 87.808182] CR0: 80050033 CR2: 0004 CR3: 0d098c60 CR4: 06b0 > [ 87.809558] DR0: DR1: DR2: DR3: > [ 87.810919] DR6: fffe0ff0 DR7: 0400 > [ 87.812002] Call Trace: > [ 87.812857] arch_get_unmapped_area_topdown+0x74/0x11f > [ 87.814011] ? arch_get_unmapped_area+0xb4/0xb4 > [ 87.815095] get_unmapped_area+0x5b/0xae > [ 87.816103] do_mmap+0xc7/0x2ac > [ 87.817061] vm_mmap_pgoff+0x6b/0x94 > [ 87.818080] SYSC_mmap_pgoff+0x13f/0x162 > [ 87.819004] SyS_mmap_pgoff+0x1a/0x1c > [ 87.819873] do_int80_syscall_32+0x65/0x79 > [ 87.820791] entry_INT80_32+0x2a/0x2a > [ 87.821710] EIP: 0x8090aa2 > [ 87.822490] EFLAGS: 0246 CPU: 0 > [ 87.823345] EAX: ffda EBX: ECX: 00201000 EDX: 0003 > [ 87.824489] ESI: 0022 EDI: EBP: ESP: bff1c8c8 > [ 87.825650] DS: 007b ES: 007b FS: GS: 0033 SS: 007b > [ 87.826735] Code: 31 c9 e8 20 15 fb ff 39 7d ec 5a 76 02 0f 0b 31 d2 6a 00 > 39 fb 0f 97 c2 b8 c0 db b1 c1 31 c9 e8 03 15 fb ff 39 fb 89 fa 58 76 07 <0f> > 0b ba f4 ff ff ff 8d 65 f4 89 d0 5b 5e 5f 5d c3 55 89 e5 56 > [ 87.830175] EIP: unmapped_area_topdown+0x14b/0x15c SS:ESP: 0068:cd0fde60 > [ 87.831396] ---[ end trace 67da11e70888e7ec ]--- > > > To reproduce: > > git clone https://github.com/01org/lkp-tests.git > cd lkp-tests > bin/lkp qemu -k job-script # job-script is attached in > this email Thanks for the report: yes, this is the same one as Dave Jones found yesterday, which is fixed by this patch posted last night: [PATCH] mm: fix new crash in unmapped_area_topdown() Trinity gets kernel BUG at mm/mmap.c:1963! in about 3 minutes of mmap testing. That's the VM_BUG_ON(gap_end < gap_start) at the end of unmapped_area_topdown(). Linus points out how MAP_FIXED (which does not have to respect our stack guard gap intentions) could result in gap_end below gap_start there. Fix that, and the similar case in its alternative, unmapped_area(). Fixes: 1be7107fbe18 ("mm: larger stack guard gap, between vmas") Reported-by: Dave Jones Debugged-by: Linus Torvalds Signed-off-by: Hugh Dickins --- mm/mmap.c |6 -- 1 file changed, 4 insertions(+), 2 deletions(-) --- 4.12-rc6/mm/mmap.c 2017-06-19 09:06:10.035407505 -0700 +++ linux/mm/mmap.c 2017-06-19 21:09:28.616707311 -0700 @@ -1817,7 +1817,8 @@ unsigned long unmapped_area(struct vm_un /* Check if current node has a suitable gap */ if (gap_start > high_limit) return -ENOMEM; - if (gap_end >= low_limit && gap_end - gap_start >= length) + if (gap_end >= low_limit && + gap_end > gap_start && gap_end - gap_start >= length) goto found; /* Visit right subtree if it looks promising */ @@ -1920,7 +1921,8 @@ unsigned lon
[lkp-robot] [mm] 1be7107fbe: kernel_BUG_at_mm/mmap.c
FYI, we noticed the following commit: commit: 1be7107fbe18eed3e319a6c3e83c78254b693acb ("mm: larger stack guard gap, between vmas") https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master in testcase: trinity with following parameters: runtime: 300s test-description: Trinity is a linux system call fuzz tester. test-url: http://codemonkey.org.uk/projects/trinity/ on test machine: qemu-system-i386 -enable-kvm -smp 2 -m 320M caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace): +--+++ | | 1132d5e7b6 | 1be7107fbe | +--+++ | boot_successes | 5 | 4 | | boot_failures| 0 | 4 | | kernel_BUG_at_mm/mmap.c | 0 | 4 | | invalid_opcode:#[##] | 0 | 4 | | EIP:unmapped_area_topdown| 0 | 4 | | Kernel_panic-not_syncing:Fatal_exception | 0 | 4 | +--+++ [ 87.792040] kernel BUG at mm/mmap.c:1963! [ 87.793442] invalid opcode: [#1] DEBUG_PAGEALLOC [ 87.794812] Modules linked in: [ 87.795849] CPU: 0 PID: 424 Comm: trinity-c2 Not tainted 4.12.0-rc5-00285-g1be7107f #1 [ 87.798138] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014 [ 87.800657] task: ce6177c0 task.stack: cd0fc000 [ 87.801877] EIP: unmapped_area_topdown+0x14b/0x15c [ 87.803063] EFLAGS: 00010206 CPU: 0 [ 87.804075] EAX: EBX: b520 ECX: EDX: b4feb000 [ 87.805469] ESI: 00201000 EDI: b4feb000 EBP: cd0fde84 ESP: cd0fde60 [ 87.806872] DS: 007b ES: 007b FS: GS: 0033 SS: 0068 [ 87.808182] CR0: 80050033 CR2: 0004 CR3: 0d098c60 CR4: 06b0 [ 87.809558] DR0: DR1: DR2: DR3: [ 87.810919] DR6: fffe0ff0 DR7: 0400 [ 87.812002] Call Trace: [ 87.812857] arch_get_unmapped_area_topdown+0x74/0x11f [ 87.814011] ? arch_get_unmapped_area+0xb4/0xb4 [ 87.815095] get_unmapped_area+0x5b/0xae [ 87.816103] do_mmap+0xc7/0x2ac [ 87.817061] vm_mmap_pgoff+0x6b/0x94 [ 87.818080] SYSC_mmap_pgoff+0x13f/0x162 [ 87.819004] SyS_mmap_pgoff+0x1a/0x1c [ 87.819873] do_int80_syscall_32+0x65/0x79 [ 87.820791] entry_INT80_32+0x2a/0x2a [ 87.821710] EIP: 0x8090aa2 [ 87.822490] EFLAGS: 0246 CPU: 0 [ 87.823345] EAX: ffda EBX: ECX: 00201000 EDX: 0003 [ 87.824489] ESI: 0022 EDI: EBP: ESP: bff1c8c8 [ 87.825650] DS: 007b ES: 007b FS: GS: 0033 SS: 007b [ 87.826735] Code: 31 c9 e8 20 15 fb ff 39 7d ec 5a 76 02 0f 0b 31 d2 6a 00 39 fb 0f 97 c2 b8 c0 db b1 c1 31 c9 e8 03 15 fb ff 39 fb 89 fa 58 76 07 <0f> 0b ba f4 ff ff ff 8d 65 f4 89 d0 5b 5e 5f 5d c3 55 89 e5 56 [ 87.830175] EIP: unmapped_area_topdown+0x14b/0x15c SS:ESP: 0068:cd0fde60 [ 87.831396] ---[ end trace 67da11e70888e7ec ]--- To reproduce: git clone https://github.com/01org/lkp-tests.git cd lkp-tests bin/lkp qemu -k job-script # job-script is attached in this email Thanks, Xiaolong # # Automatically generated file; DO NOT EDIT. # Linux/i386 4.12.0-rc5 Kernel Configuration # # CONFIG_64BIT is not set CONFIG_X86_32=y CONFIG_X86=y CONFIG_INSTRUCTION_DECODER=y CONFIG_OUTPUT_FORMAT="elf32-i386" CONFIG_ARCH_DEFCONFIG="arch/x86/configs/i386_defconfig" CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_MMU=y CONFIG_ARCH_MMAP_RND_BITS_MIN=8 CONFIG_ARCH_MMAP_RND_BITS_MAX=16 CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8 CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16 CONFIG_NEED_DMA_MAP_STATE=y CONFIG_NEED_SG_DMA_LENGTH=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_HWEIGHT=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_ARCH_HAS_CPU_RELAX=y CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y CONFIG_HAVE_SETUP_PER_CPU_AREA=y CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y CONFIG_ARCH_HIBERNATION_POSSIBLE=y CONFIG_ARCH_SUSPEND_POSSIBLE=y CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y CONFIG_ARCH_WANT_GENERAL_HUGETLB=y CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y CONFIG_X86_32_LAZY_GS=y CONFIG_ARCH_SUPPORTS_UPROBES=y CONFIG_FIX_EARLYCON_MEM=y CONFIG_PGTABLE_LEVELS=3 CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" CONFIG_IRQ_WORK=y CONFIG_BUILDTIME_EXTABLE_SORT=y CONFIG_THREAD_INFO_IN_TASK=y # # General setup # CONFIG_BROKEN_ON_SMP=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_CROSS_COMPILE="" # CONFIG_COMPILE_TEST is not set CONFIG_LOCALVERSION="" CONFIG_LOCALVERSION_AUTO=y CONFIG_HAVE_KERNEL_GZIP=y CONFIG_HAVE_KERNEL_BZIP2=y CONFIG_HAVE_KERNEL_LZMA=y CONFIG_HAVE_KERNEL_XZ=y CONFIG_HAV