I took a look at this crash. The oops is in the following code, using
debug symbols (ddeb) from the affected kernel:
(gdb) l *(do_writepages+0x16)
0xffffffff81117146 is in do_writepages (/build/buildd/linux-2.6.38/mm/page-writeback.c:1057).
1052	{
1053		int ret;
1054
1055		if (wbc->nr_to_write <= 0)
1056			return 0;
1057		if (mapping->a_ops->writepages)
1058			ret = mapping->a_ops->writepages(mapping, wbc);
1059		else
1060			ret = generic_writepages(mapping, wbc);
1061		return ret;
So the crash seems to be in the dereference of either mapping->a_ops or
a_ops->writepages. To check that this is indeed the case, let's look at the
offsets of the pointer dereferences in the nearby code and compare them with
the decodecode output:
(gdb) p &((struct address_space *)0)->a_ops
$1 = (const struct address_space_operations **) 0x58
(gdb) p &((struct address_space_operations *)0)->writepages
$2 = (int (**)(struct address_space *, struct writeback_control *)) 0x18
(gdb) p &((struct writeback_control *)0)->nr_to_write
$3 = (long int *) 0x18
Relevant decodecode:
All code
========
0: e4 48 in $0x48,%al
2: c7 05 e9 a3 91 00 00 movl $0x400,0x91a3e9(%rip) # 0x91a3f5
9: 04 00 00
c: 48 83 c4 08 add $0x8,%rsp
10: 5b pop %rbx
11: c9 leaveq
12: c3 retq
13: 66 90 xchg %ax,%ax
15: 55 push %rbp
16: 48 89 e5 mov %rsp,%rbp
19: 66 66 66 66 90 data32 data32 data32 xchg %ax,%ax
1e: 31 c0 xor %eax,%eax
20: 48 83 7e 18 00 cmpq $0x0,0x18(%rsi)
25: 7e 0f jle 0x36
27: 48 8b 47 58 mov 0x58(%rdi),%rax
2b:* 48 8b 40 18 mov 0x18(%rax),%rax <-- trapping instruction
2f: 48 85 c0 test %rax,%rax
32: 74 09 je 0x3d
34: ff d0 callq *%rax
mov 0x58(%rdi),%rax == mapping->a_ops
mov 0x18(%rax),%rax == ...a_ops->writepages
So indeed the crash happens on a_ops->writepages dereference, which
means a_ops has an invalid value.
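The match between the instruction bytes and the struct offsets can also be checked mechanically. The following is only a minimal sketch, assuming the two loads use the plain "REX.W 8B /r" encoding with an 8-bit displacement and no SIB byte, as the dump suggests; the helper name decode_mov_load is made up for this illustration and is not part of any tool:

```python
# Low eight 64-bit registers, indexed by their 3-bit ModRM encoding.
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]

def decode_mov_load(code: bytes):
    """Decode a 4-byte 'mov disp8(%base),%dest' (hypothetical helper).

    Returns (base, displacement, dest). Only handles REX.W + 0x8B with
    ModRM mod=01 and no SIB byte, which is the form seen in the dump.
    """
    rex, opcode, modrm, disp = code
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("not a REX.W 'mov r/m64 -> r64' instruction")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b01 or rm == 0b100:
        raise ValueError("only base + disp8 without SIB is handled")
    if disp >= 0x80:          # disp8 is sign-extended
        disp -= 0x100
    return REGS[rm], disp, REGS[reg]

# Offsets reported by gdb above.
A_OPS_OFFSET = 0x58        # &((struct address_space *)0)->a_ops
WRITEPAGES_OFFSET = 0x18   # &((struct address_space_operations *)0)->writepages

first = decode_mov_load(bytes.fromhex("488b4758"))   # bytes at offset 0x27
second = decode_mov_load(bytes.fromhex("488b4018"))  # bytes at 0x2b (trapping)
print(first, second)
```

The first load reads from %rdi (the mapping argument) at displacement 0x58, and the trapping load reads from %rax at displacement 0x18, which agrees with the gdb offsets for a_ops and writepages.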
This is strange. Looking at the dmesg, we can see that ext4 is being
used, so i_mapping->a_ops was probably set by ext4. Looking at
the ext4 code, I found something interesting:
- ext4 sets a_ops, in this case probably to ext4_da_aops
- Using the same debug symbols, we can get the address that ext4_da_aops
has in the running kernel. Here I get:
(gdb) p &ext4_da_aops
$11 = (const struct address_space_operations *) 0xffffffff81627fc0
(gdb) p/x 0xffffffff81627fc0 + 0x18
$12 = 0xffffffff81627fd8
Now look at the last value, which is the address read when dereferencing
a_ops->writepages. It's *very similar* to the address in the invalid
dereference that triggers the oops:
ffffffff81627fd8 --> valid address of ext4_da_aops.writepages (&ext4_da_aops + 0x18)
ffffffef81627fd8 --> invalid address which triggers the oops
Note that exactly one bit is flipped (bit 36): "ef" instead of "ff" in the middle.
So either some code corrupted the a_ops pointer, or the machine has a
hardware problem (maybe a memory issue) that flipped the bit.
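The single-bit difference can be confirmed by XORing the two addresses. This is just a quick sketch using the values quoted above; the function name flipped_bits is illustrative only:

```python
EXPECTED = 0xffffffff81627fd8   # &ext4_da_aops + 0x18, from the gdb session
FAULTING = 0xffffffef81627fd8   # address reported in the oops
WRITEPAGES_OFFSET = 0x18        # offset of writepages within a_ops

def flipped_bits(a: int, b: int):
    """Return the positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [i for i in range(64) if (diff >> i) & 1]

bits = flipped_bits(EXPECTED, FAULTING)
print(hex(EXPECTED ^ FAULTING), bits)

# The a_ops value the kernel must have loaded to fault at that address.
corrupted_a_ops = FAULTING - WRITEPAGES_OFFSET
print(hex(corrupted_a_ops))
```

If the addresses differ in exactly one position, the list has a single entry, which is what one would expect from a single-bit flip rather than from an arbitrary overwrite of the pointer.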
--
https://bugs.launchpad.net/bugs/766334
Title:
BUG: unable to handle kernel paging request at ffffffef81627fd8