On Tue, 2013-03-26 at 12:43 -0700, [email protected] wrote:
> The patch titled
> Subject: revert "ipc: don't allocate a copy larger than max"
> has been added to the -mm tree. Its filename is
> revert-ipc-dont-allocate-a-copy-larger-than-max.patch
>
> Before you just go and hit "reply", please:
> a) Consider who else should be cc'ed
> b) Prefer to cc a suitable mailing list as well
> c) Ideally: find the original patch on the mailing list and do a
> reply-to-all to that, adding suitable additional cc's
>
> *** Remember to use Documentation/SubmitChecklist when testing your code ***
>
> The -mm tree is included into linux-next and is updated
> there every 3-4 working days
>
> ------------------------------------------------------
> From: Andrew Morton <[email protected]>
> Subject: revert "ipc: don't allocate a copy larger than max"
>
> Revert 88b9e456b164. Dave has confirmed that this was causing oopses
> during trinity testing.
No, he didn't.

Here's a copy of Dave Jones's original report [1] on this very same bug
in linux-next on Feb 19, __6 days before__ I even submitted the series
that fixes this bug.

Note that the faulting instruction is __identical__ to the one in Dave's
most recent report on 3.9-rc4:

On Mon, 2013-03-25 at 12:37 -0400, Dave Jones wrote:
> Call Trace:
> [<ffffffff812c1b40>] ? msg_security+0x10/0x10
> [<ffffffff810b6bc5>] ? trace_hardirqs_on_caller+0x115/0x1a0
> [<ffffffff8134aa6e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [<ffffffff812c32b5>] sys_msgrcv+0x15/0x20
> [<ffffffff816cda02>] system_call_fastpath+0x16/0x1b
> Code: cc 83 fb 04 0f 84 f3 00 00 00 8b 74 24 4c 85 f6 0f 84 18 02 00
> 00 48 8b 44 24 38 48 39 44 24 50 0f 84 12 02 00 00 4c 89 7c 24 60 <4d> 8b 3f 48
> ff 44 24 50 4d 39 ef 75 9d 0f 1f 44 00 00 48 81 7c
>
>
> 2b:* 4d 8b 3f mov (%r15),%r15 <-- trapping instruction
> 2e: 48 ff 44 24 50 incq 0x50(%rsp)
> 33: 4d 39 ef cmp %r13,%r15
> 36: 75 9d jne 0xffffffffffffffd5
> 38: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> 3d: 48 rex.W
> 3e: 81 .byte 0x81
> 3f: 7c .byte 0x7c
>
> objdump -S output shows that this is here in do_msgrcv()
>
> 875 } else
> 876 break;
> 877 msg_counter++;
> 878 }
> 879 tmp = tmp->next;
> 880 }
> 881 if (!IS_ERR(msg)) {
>
> the tmp->next deref goes chasing a freed pointer.
My recommendation is to either:

1) apply my entire 'ipc MSG_COPY fixes' series
--or--
2) revert the entire ipc MSG_COPY implementation that introduced this
bug to begin with.

Regards,
Peter Hurley

[1]
On Tue, 2013-02-19 at 13:04 -0500, Dave Jones wrote:
> general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> Modules linked in: can af_rxrpc binfmt_misc scsi_transport_iscsi ax25
> ipt_ULOG decnet nfc appletalk x25 rds ipx p8023 psnap p8022 llc irda
> crc_ccitt atm lockd sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
> xt_conntrack nf_conntrack ip6table_filter ip6_tables btusb bluetooth
> snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm edac_core
> snd_page_alloc snd_timer microcode rfkill usb_debug serio_raw pcspkr snd
> soundcore vhost_net r8169 mii tun macvtap macvlan kvm_amd kvm
> CPU 2
> Pid: 887, comm: trinity-child2 Not tainted 3.8.0+ #57 Gigabyte Technology
> Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H
> RIP: 0010:[<ffffffff812aebba>] [<ffffffff812aebba>] do_msgrcv+0x22a/0x670
> RSP: 0018:ffff88011892be88 EFLAGS: 00010297
> RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000004000
> RDX: 000000007adea6f6 RSI: 6b6b6b6b6b6b6b6b RDI: ffff8801189ffb60
> RBP: ffff88011892bf68 R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: ffff8801189ffc10 R14: ffff8801189ffb60 R15: 6b6b6b6b6b6b6b6b
> FS: 00007f681e955740(0000) GS:ffff88012f200000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f681e846064 CR3: 000000012553d000 CR4: 00000000000007e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process trinity-child2 (pid: 887, threadinfo ffff88011892a000, task
> ffff88010bc82490)
> Stack:
> ffff88011892beb8 ffff88010bc82490 ffff88010bc82490 ffff88010bc82490
> ffff8801186d8000 ffffffff812ad5f0 0000000001aba000 ffffffff81c688c0
> 000000007adea6f6 00000000001fffff 0000400046a9467e 6b6b6b6b6b6b6b6b
> Call Trace:
> [<ffffffff812ad5f0>] ? load_msg+0x180/0x180
> [<ffffffff810b8395>] ? trace_hardirqs_on_caller+0x115/0x1a0
> [<ffffffff813347be>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [<ffffffff812af015>] sys_msgrcv+0x15/0x20
> [<ffffffff816a8482>] system_call_fastpath+0x16/0x1b
> Code: 84 14 01 00 00 8b 8d 74 ff ff ff 85 c9 0f 84 52 02 00 00 48 8b 95 60 ff
> ff ff 48 39 55 80 0f 84 4d 02 00 00 4c 89 bd 78 ff ff ff <4d> 8b 3f 48 ff 45
> 80 4d 39 ef 75 9a 66 90 48 81 bd 78 ff ff ff
> RIP [<ffffffff812aebba>] do_msgrcv+0x22a/0x670
> RSP <ffff88011892be88>
> ---[ end trace d3cc044a84b1d828 ]---
>
> oopsing instruction is..
>
> 0: 4d 8b 3f mov (%r15),%r15
>
> Looks like a use-after-free.
>
> Disassembly of ipc/msg.o shows this happens here..
>
> msg = ERR_PTR(-EAGAIN);
> tmp = msq->q_messages.next;
> 1537: 4d 8b be b0 00 00 00 mov 0xb0(%r14),%r15
> while (tmp != &msq->q_messages) {
> 153e: 4d 8d ae b0 00 00 00 lea 0xb0(%r14),%r13
> 1545: 4d 39 ef cmp %r13,%r15
> 1548: 0f 84 5f 03 00 00 je 18ad <do_msgrcv+0x50d>
> 154e: 48 c7 45 80 00 00 00 movq $0x0,-0x80(%rbp)
> 1555: 00
> 1556: 48 c7 85 78 ff ff ff movq $0xfffffffffffffff5,-0x88(%rbp)
> 155d: f5 ff ff ff
> 1561: eb 0d jmp 1570 <do_msgrcv+0x1d0>
> 1563: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> }
> } else
> break;
> msg_counter++;
> }
> tmp = tmp->next;
> 1568: 4d 8b 3f mov (%r15),%r15
> if (ipcperms(ns, &msq->q_perm, S_IRUGO))
> goto out_unlock;
>
> msg = ERR_PTR(-EAGAIN);
> tmp = msq->q_messages.next;
> while (tmp != &msq->q_messages) {
>
> Looks like Stanislav recently changed this code, so problem was likely
> introduced in those changes.
>
> Dave
>
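
A note for readers decoding the oops above: R15 and RSI both hold
6b6b6b6b6b6b6b6b. That is the byte pattern (POISON_FREE, 0x6b) the slab
allocator writes over freed objects when poisoning is enabled, and it is a
non-canonical address on x86-64, which is consistent with the general
protection fault on `mov (%r15),%r15`. The userspace sketch below is an
illustration only, not the kernel code: `poison_node()`, `next_is_poisoned()`
and `walk()` are hypothetical names, and "freeing" is simulated by overwriting
a node in place. It mirrors the shape of the `tmp = tmp->next` loop and marks
the point where a stale node would be dereferenced.

```c
#include <stddef.h>
#include <string.h>

/* Same layout as the kernel's circular doubly-linked list node. */
struct list_head {
    struct list_head *next, *prev;
};

/* Byte value the kernel uses to poison freed slab objects. */
#define POISON_FREE 0x6b

/* Hypothetical helper: simulate a free by poisoning the node in place,
 * without unlinking it, so the list still points at it. */
static void poison_node(struct list_head *n)
{
    memset(n, POISON_FREE, sizeof(*n));
}

/* Hypothetical helper: true if every byte of n->next is POISON_FREE. */
static int next_is_poisoned(const struct list_head *n)
{
    unsigned char pattern[sizeof(n->next)];

    memset(pattern, POISON_FREE, sizeof(pattern));
    return memcmp(&n->next, pattern, sizeof(pattern)) == 0;
}

/* Walk the list the way the quoted loop does. Returns the number of
 * nodes visited, or -1 if a poisoned (stale) node is reached. */
static int walk(struct list_head *head)
{
    struct list_head *tmp = head->next;
    int count = 0;

    while (tmp != head) {
        if (next_is_poisoned(tmp))
            return -1; /* the kernel would fault on the next load */
        count++;
        tmp = tmp->next; /* corresponds to mov (%r15),%r15 */
    }
    return count;
}
```

With a list of two live nodes, `walk()` returns 2. After `poison_node()` is
called on the second node while it is still linked, `walk()` returns -1 at
the step where the kernel loop would load `tmp->next` from freed memory.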