On Sat, May 03, 2014 at 09:16:00AM -0400, Sasha Levin wrote:
> Hi all,
>
> While fuzzing with trinity inside a KVM tools guest running latest -next
> kernel I've stumbled on the following:
>
Cute.. not making sense.. :-)

> [ 1796.591361] BUG: unable to handle kernel paging request at fffffffedf97f040
> [ 1796.592665] IP: __cpu_to_node (arch/x86/mm/numa.c:777)

I suppose you've scripted this addr2line -ie vmlinux for all addresses
in this splat?

> [ 1796.593710] PGD 21e30067 PUD 0
> [ 1796.594174] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> [ 1796.594937] Dumping ftrace buffer:
> [ 1796.595678] (ftrace buffer empty)
> [ 1796.596329] Modules linked in:
> [ 1796.596733] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G W
> 3.15.0-rc3-next-20140502-sasha-00019-g5cb1c98 #431
> [ 1796.598143] task: ffff8803345b8000 ti: ffff880035fc0000 task.ti:
> ffff880035fc0000
> [ 1796.598975] RIP: __cpu_to_node (arch/x86/mm/numa.c:777)
> [ 1796.600093] RSP: 0018:ffff8800a6c03b88 EFLAGS: 00010046
> [ 1796.600197] RAX: ffff8806e791a000 RBX: ffffffffe791a028 RCX:
> 0000000000000000
> [ 1796.600197] RDX: 0000000000000001 RSI: ffff8806cdc68068 RDI:
> 00000000e791a028
> [ 1796.600197] RBP: ffff8800a6c03b98 R08: ffff880496183078 R09:
> 00000000000151c6
> [ 1796.600197] R10: 000000000000b731 R11: 0000000000000001 R12:
> ffff8801b4dd7840
> [ 1796.600197] R13: 0000000000000000 R14: 000000000000001e R15:
> ffff8801b34ac1a0
> [ 1796.600197] FS: 0000000000000000(0000) GS:ffff8800a6c00000(0000)
> knlGS:0000000000000000
> [ 1796.600197] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 1796.600197] CR2: fffffffedf97f040 CR3: 0000000021e2d000 CR4:
> 00000000000006a0
> [ 1796.610323] Stack:
> [ 1796.610323] 0000000000000000 ffff8801b34ac1a0 ffff8800a6c03bd8
> ffffffff9d1a9646
> [ 1796.610323] ffff8800a6c03bd8 ffff8806cdc68068 ffff8806cdc68068
> ffff8801b34ac1a0
> [ 1796.610323] 0000000000000000 000000000000b7db ffff8800a6c03c38
> ffffffff9d1ae987
> [ 1796.610323] Call Trace:
> [ 1796.610323] <IRQ>
> [ 1796.610323] account_entity_dequeue (kernel/sched/fair.c:859
> kernel/sched/fair.c:2009)
> [ 1796.610323] dequeue_entity (kernel/sched/fair.c:2827)
> [ 1796.610323] dequeue_task_fair (kernel/sched/fair.c:3907
> include/linux/jump_label.h:105 kernel/sched/fair.c:3041
> kernel/sched/fair.c:3217 kernel/sched/fair.c:3915)
> [ 1796.610323] dequeue_task (kernel/sched/core.c:793)
> [ 1796.610323] deactivate_task (kernel/sched/core.c:809)
> [ 1796.610323] move_task (kernel/sched/fair.c:5032)
> [ 1796.610323] load_balance (kernel/sched/fair.c:5305
> kernel/sched/fair.c:6485)
> [ 1796.610323] ? debug_smp_processor_id (lib/smp_processor_id.c:57)
> [ 1796.610323] rebalance_domains (kernel/sched/fair.c:7032)
> [ 1796.610323] ? rebalance_domains (kernel/sched/fair.c:6975)
> [ 1796.610323] run_rebalance_domains (kernel/sched/fair.c:7105
> kernel/sched/fair.c:7198)
> [ 1796.610323] __do_softirq (kernel/softirq.c:269
> include/linux/jump_label.h:105 include/trace/events/irq.h:126
> kernel/softirq.c:270)
> [ 1796.610323] ? irq_exit (include/linux/vtime.h:82 include/linux/vtime.h:121
> kernel/softirq.c:384)
> [ 1796.610323] irq_exit (kernel/softirq.c:346 kernel/softirq.c:387)
> [ 1796.610323] scheduler_ipi (kernel/sched/core.c:1545)
> [ 1796.610323] smp_reschedule_interrupt (arch/x86/kernel/smp.c:266)
> [ 1796.610323] reschedule_interrupt (arch/x86/kernel/entry_64.S:1178)
> [ 1796.610323] <EOI>
> [ 1796.610323] ? native_safe_halt (arch/x86/include/asm/irqflags.h:50)
> [ 1796.610323] ? trace_hardirqs_on (kernel/locking/lockdep.c:2607)
> [ 1796.637135] default_idle (arch/x86/include/asm/paravirt.h:111
> arch/x86/kernel/process.c:310)
> [ 1796.637135] arch_cpu_idle (arch/x86/kernel/process.c:302)
> [ 1796.637135] cpu_idle_loop (kernel/sched/idle.c:179 kernel/sched/idle.c:226)
> [ 1796.637135] cpu_startup_entry (??:?)
> [ 1796.637135] start_secondary (arch/x86/kernel/smpboot.c:267)
> [ 1796.637135] Code: 3a ea 05 00 74 25 89 de 48 c7 c7 08 b4 6c a1 31 c0 e8 99
> 6c 45 03 e8 7c 39 46 03 48 8b 05 71 3a ea 05 8b 04 98 eb 16 0f 1f 40 00 <48>
> 8b 14 dd 00 ef 0a a3 48 c7 c0 d8 f4 00 00 8b 04 10 48 83 c4

Could you maybe also do the same with the Code?
That is, script an auto-decode for it?

Obviously scripts/decodecode doesn't actually work right anymore:

# echo [ 1796.637135] Code: 3a ea 05 00 74 25 89 de 48 c7 c7 08 b4 6c a1 31 c0 e8 99 6c 45 03 e8 7c 39 46 03 48 8b 05 71 3a ea 05 8b 04 98 eb 16 0f 1f 40 00 <48> 8b 14 dd 00 ef 0a a3 48 c7 c0 d8 00 00 8b 04 10 48 83 c4 | ./scripts/decodecode
-bash: syntax error near unexpected token `48'

But if I remove the <> by hand I get:

# echo [ 1796.637135] Code: 3a ea 05 00 74 25 89 de 48 c7 c7 08 b4 6c a1 31 c0 e8 99 6c 45 03 e8 7c 39 46 03 48 8b 05 71 3a ea 05 8b 04 98 eb 16 0f 1f 40 00 48 8b 14 dd 00 ef 0a a3 48 c7 c0 d8 00 00 8b 04 10 48 83 c4 | ./scripts/decodecode
[ 1796.637135] Code: 3a ea 05 00 74 25 89 de 48 c7 c7 08 b4 6c a1 31 c0 e8 99 6c 45 03 e8 7c 39 46 03 48 8b 05 71 3a ea 05 8b 04 98 eb 16 0f 1f 40 00 48 8b 14 dd 00 ef 0a a3 48 c7 c0 d8 00 00 8b 04 10 48 83 c4
sed: -e expression #1, char 1: unknown command: `-'
Code starting with the faulting instruction
===========================================
   0:   3a ea                   cmp    %dl,%ch
   2:   05 00 74 25 89          add    $0x89257400,%eax
   7:   de 48 c7                fimul  -0x39(%rax)
   a:   c7                      (bad)
   b:   08 b4 6c a1 31 c0 e8    or     %dh,-0x173fce5f(%rsp,%rbp,2)
  12:   99                      cltd
  13:   6c                      insb   (%dx),%es:(%rdi)
  14:   45 03 e8                add    %r8d,%r13d
  17:   7c 39                   jl     0x52
  19:   46 03 48 8b             rex.RX add -0x75(%rax),%r9d
  1d:   05 71 3a ea 05          add    $0x5ea3a71,%eax
  22:   8b 04 98                mov    (%rax,%rbx,4),%eax
  25:   eb 16                   jmp    0x3d
  27:   0f 1f 40 00             nopl   0x0(%rax)
  2b:   48 8b 14 dd 00 ef 0a    mov    -0x5cf51100(,%rbx,8),%rdx
  32:   a3
  33:   48 c7 c0 d8 00 00 8b    mov    $0xffffffff8b0000d8,%rax
  3a:   04 10                   add    $0x10,%al
  3c:   48                      rex.W
  3d:   83                      .byte 0x83
  3e:   c4                      .byte 0xc4

And 2b is the offset where the <> was.

Anyway, the reason I did this was that I was hoping to find the cpu
argument in one of the registers, but looking at your RBX value doesn't
really help.
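The hand-editing above (dropping the timestamp and the <> marker) could be done mechanically before the bytes go to a disassembler. A minimal sketch, not part of scripts/decodecode: `parse_code_line` is a hypothetical helper name, and the input is the Code: line as it appears in the quoted oops.

```python
import re

def parse_code_line(line):
    """Return (code_bytes, fault_offset) for an oops 'Code:' line.

    fault_offset is the index of the byte that was wrapped in <>,
    or None if no byte was marked.
    """
    # Drop everything up to and including 'Code:' (timestamp, quoting).
    _, marker, tail = line.partition("Code:")
    if not marker:
        raise ValueError("no 'Code:' marker in line")

    code = bytearray()
    fault_offset = None
    for token in tail.split():
        marked = re.fullmatch(r"<([0-9a-fA-F]{2})>", token)
        if marked:
            fault_offset = len(code)
            code.append(int(marked.group(1), 16))
        elif re.fullmatch(r"[0-9a-fA-F]{2}", token):
            code.append(int(token, 16))
        # Anything else (e.g. a stray '>' from mail quoting) is ignored.
    return bytes(code), fault_offset

# The Code: line from the quoted oops, rejoined into one string.
OOPS_CODE = (
    "[ 1796.637135] Code: 3a ea 05 00 74 25 89 de 48 c7 c7 08 b4 6c a1 31 "
    "c0 e8 99 6c 45 03 e8 7c 39 46 03 48 8b 05 71 3a ea 05 8b 04 98 eb 16 "
    "0f 1f 40 00 <48> 8b 14 dd 00 ef 0a a3 48 c7 c0 d8 f4 00 00 8b 04 10 "
    "48 83 c4"
)

code, fault_offset = parse_code_line(OOPS_CODE)
faulting = " ".join(f"{b:02x}" for b in code[fault_offset:fault_offset + 8])
print(hex(fault_offset), faulting)  # 0x2b 48 8b 14 dd 00 ef 0a a3
```

The reported offset agrees with the 2b found by hand, and the eight bytes starting there are the ones decoded as the mov at 2b.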

If I compile this function with a defconfig based .config, I get
something like:

00000000000000a0 <__cpu_to_node>:
  a0:   48 83 3d 00 00 00 00    cmpq   $0x0,0x0(%rip)        # a8 <__cpu_to_node+0x8>
  a7:   00
  a8:   55                      push   %rbp
  a9:   48 89 e5                mov    %rsp,%rbp
  ac:   53                      push   %rbx
  ad:   48 63 df                movslq %edi,%rbx
  b0:   75 15                   jne    c7 <__cpu_to_node+0x27>
  b2:   48 8b 14 dd 00 00 00    mov    0x0(,%rbx,8),%rdx
  b9:   00
  ba:   48 c7 c0 00 00 00 00    mov    $0x0,%rax
  c1:   8b 04 10                mov    (%rax,%rdx,1),%eax
  c4:   5b                      pop    %rbx
  c5:   5d                      pop    %rbp
  c6:   c3                      retq
  c7:   89 de                   mov    %ebx,%esi
  c9:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
  d0:   31 c0                   xor    %eax,%eax
  d2:   e8 00 00 00 00          callq  d7 <__cpu_to_node+0x37>
  d7:   e8 00 00 00 00          callq  dc <__cpu_to_node+0x3c>
  dc:   48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # e3 <__cpu_to_node+0x43>
  e3:   8b 04 98                mov    (%rax,%rbx,4),%eax
  e6:   eb dc                   jmp    c4 <__cpu_to_node+0x24>
  e8:   0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
  ef:   00

And the b2 offset matches up fairly nicely, although the rest of the
decode appears to be crap.

Still no hints though. However, the calling convention puts the first
argument in EDI (see the movslq %edi,%rbx at ad), and at b2 EDI should
still contain the original value, however your RDI value is complete
nonsense again :/

Of course, the cpu argument being complete crap is a good reason for
this to happen.

Which would make thread_info::cpu of the task in question be complete
crap.. and I'm not sure I can explain that either.

la-la-la..
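
As a cross-check on the "cpu argument is crap" theory, the register dump can be replayed against the two instructions involved. This is a sketch under assumptions: it takes the dumped RDI, RBX and CR2 at face value, and assumes the faulting instruction is the mov decoded at offset 2b, whose 32-bit displacement bytes are 00 ef 0a a3. The helper names are made up for illustration.

```python
MASK64 = (1 << 64) - 1

def sign_extend32(value):
    """Interpret the low 32 bits of value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

def to_u64(value):
    """Wrap a Python integer to an unsigned 64-bit register value."""
    return value & MASK64

# Values copied from the oops register dump.
rdi = 0x00000000E791A028
rbx_dump = 0xFFFFFFFFE791A028
cr2 = 0xFFFFFFFEDF97F040

# Displacement of the faulting mov, little-endian bytes 00 ef 0a a3.
disp32 = int.from_bytes(bytes.fromhex("00ef0aa3"), "little")

# movslq %edi,%rbx: sign-extend the 32-bit cpu argument into RBX.
rbx = sign_extend32(rdi)

# mov disp32(,%rbx,8),%rdx: the address is disp32 (sign-extended) + 8 * rbx.
fault = to_u64(sign_extend32(disp32) + 8 * rbx)

print(hex(to_u64(rbx)))  # 0xffffffffe791a028
print(hex(fault))        # 0xfffffffedf97f040
```

Both values agree with the dump: RBX is the sign-extended low half of RDI, and the computed address equals CR2. So the splat is at least self-consistent with __cpu_to_node being called with cpu = 0xe791a028; it says nothing about where that value came from.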

