Because the debug symbols for Precise 3.2.0-79 are missing from
ddebs.ubuntu.com, I had to compile a 3.2.0-79 kernel in a PPA and hope
that its symbols would be close to those of the original build.
That led me to a wrong initial analysis, which I document here for
historical purposes:
> 178 2 18 ffff881f716bdc00 RU 0.0 0 0 [khungtaskd]
> 3680 2808 38 ffff881f71a5c500 RU 1.6 6629520 4303188 java
> 50279 49370 31 ffff883f0c7e8000 RU 0.0 4121160 111120 java
> 50757 50322 23 ffff881ef27eae00 RU 0.3 4149720 870892 java
crash> bt ffff881ef27eae00
PID: 50757 TASK: ffff881ef27eae00 CPU: 23 COMMAND: "java"
#0 [ffff881fbfba6ee0] crash_nmi_callback at ffffffff81031ac9
#1 [ffff881fbfba6ef0] default_do_nmi at ffffffff81666079
#2 [ffff881fbfba6f30] do_nmi at ffffffff816662b0
#3 [ffff881fbfba6f50] nmi at ffffffff81665620
[exception RIP: next_tgid+40]
RIP: ffffffff811df248 RSP: ffff881cd4b67da8 RFLAGS: 00000202
RAX: 0000000000000000 RBX: ffffffff81c281a0 RCX: 0000000000000000
RDX: ffff881f72830000 RSI: 0000000000000074 RDI: ffffffff81c281a0
RBP: ffff881cd4b67df8 R8: 000000000000a88d R9: 0000000000000004
R10: ffff883f70922540 R11: 0001f8f579768213 R12: 0000000000000074
R13: ffff881f00000073 R14: ffffffff81c281a0 R15: ffff881f73138000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <DOUBLEFAULT exception stack> ---
#4 [ffff881cd4b67da8] next_tgid at ffffffff811df248
#5 [ffff881cd4b67e00] proc_pid_readdir at ffffffff811e117e
#6 [ffff881cd4b67eb0] proc_root_readdir at ffffffff811dbe0a
#7 [ffff881cd4b67ee0] vfs_readdir at ffffffff8118e1d0
#8 [ffff881cd4b67f30] sys_getdents at ffffffff8118e4a9
#9 [ffff881cd4b67f80] system_call_fastpath at ffffffff8166d2c2
RIP: 00007f855e94d605 RSP: 00007f853a457bf0 RFLAGS: 00000283
RAX: 000000000000004e RBX: ffffffff8166d2c2 RCX: 0000000000000010
RDX: 0000000000008000 RSI: 00007f8548057980 RDI: 00000000000000fb
RBP: 00007f854846b140 R8: 00007f8548057980 R9: 0000000000000008
R10: 0000000000000010 R11: 0000000000000246 R12: 0000000000000016
R13: ffffffffffffffa0 R14: 00007f853a45a4d0 R15: 00007f8548057950
ORIG_RAX: 000000000000004e CS: 0033 SS: 002b
1) khungtaskd is complaining about a hung task
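2) task ffff881ef27eae00 is in next_tgid(), which is procfs code reached
through the VFS readdir path (sys_getdents -> vfs_readdir ->
proc_root_readdir -> proc_pid_readdir).
We are probably stuck here: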
        rcu_read_lock();
retry:
        iter.task = NULL;
        pid = find_ge_pid(iter.tgid, ns);
        if (pid) {
                iter.tgid = pid_nr_ns(pid, ns);
                iter.task = pid_task(pid, PIDTYPE_PID);
                /* What we to know is if the pid we have find is the
                 * pid of a thread_group_leader. Testing for task
                 * being a thread_group_leader is the obvious thing
                 * todo but there is a window when it fails, due to
                 * the pid transfer logic in de_thread.
                 *
                 * So we perform the straight forward test of seeing
                 * if the pid we have found is the pid of a thread
                 * group leader, and don't worry if the task we have
                 * found doesn't happen to be a thread group leader.
                 * As we don't care in the case of readdir.
                 */
                if (!iter.task || !has_group_leader_pid(iter.task)) {
                        iter.tgid += 1;
                        goto retry;
                }
                get_task_struct(iter.task);
        }
        rcu_read_unlock();
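
To reason about the control flow of that retry loop, here is a minimal
user-space sketch of it. It is only a model: struct model_pid,
model_find_ge_pid() and model_next_tgid() are hypothetical names made up
for illustration, the pid table is a plain sorted array, and there is no
RCU or reference counting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pid entry; NOT a kernel structure. */
struct model_pid {
    int nr;          /* numeric pid value */
    bool has_task;   /* models pid_task() returning non-NULL */
    bool is_leader;  /* models has_group_leader_pid() returning true */
};

/* Models find_ge_pid(): first entry with nr >= tgid in a table sorted by nr. */
static const struct model_pid *
model_find_ge_pid(const struct model_pid *tbl, size_t n, int tgid)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].nr >= tgid)
            return &tbl[i];
    return NULL;
}

/*
 * Models the retry loop: returns the tgid of the next thread group leader
 * with pid >= start, or -1 when the table is exhausted.
 */
int model_next_tgid(const struct model_pid *tbl, size_t n, int start)
{
    int tgid = start;

    assert(tbl != NULL || n == 0);
    for (;;) {
        const struct model_pid *pid = model_find_ge_pid(tbl, n, tgid);

        if (!pid)
            return -1;          /* no pid left: the loop ends */
        tgid = pid->nr;
        if (pid->has_task && pid->is_leader)
            return tgid;        /* found a group leader */
        tgid += 1;              /* same as "iter.tgid += 1; goto retry;" */
    }
}
```

In this model every retry strictly increases tgid, so the loop ends as
soon as the lookup returns nothing. The number of iterations is bounded
by the number of pids, not by any wait on another task.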
The JVM process is reading the procfs root directory, and this loop is
searching for the next thread group leader to return to it.
I'm still analysing the code.
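
For reference, the user-space side of that call chain can be exercised
with a plain directory scan of /proc. The sketch below is an assumption
about the kind of call the JVM makes, not its actual code:
count_numeric_entries() is a hypothetical helper, and it relies on
readdir(3), which glibc implements with a getdents-family syscall.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stddef.h>

/*
 * Counts directory entries whose names are all digits (the per-process
 * directories when path is "/proc"). Returns -1 if path cannot be opened.
 */
long count_numeric_entries(const char *path)
{
    DIR *dir;
    struct dirent *ent;
    long count = 0;

    assert(path != NULL);
    dir = opendir(path);
    if (!dir)
        return -1;

    /* On procfs, each readdir() batch is served by proc_root_readdir(). */
    while ((ent = readdir(dir)) != NULL) {
        const char *p = ent->d_name;

        if (*p == '\0')
            continue;
        while (isdigit((unsigned char)*p))
            p++;
        if (*p == '\0')
            count++;            /* name was entirely numeric: a tgid */
    }
    closedir(dir);
    return count;
}
```

Calling count_numeric_entries("/proc") on a running Linux system walks
the same readdir path shown in frames #5 to #9 of the backtrace.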
--
https://bugs.launchpad.net/bugs/1534413
Title:
Precise: lockup during fadvise syscall with POSIX_FADV_DONTNEED