[Bug 1531768] Re: [arm64] multithreaded processes get locked up in futexes

2016-02-16 Thread Martin Pitt
For the record, this "auto-destruct" behaviour with the xenial kernel
happens all by itself: reboot the instance, let it sit there for 15 or
60 minutes, and then the kernel spew starts and the instance locks up,
losing network/ssh access. There was no actual payload running on
these instances.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1531768

Title:
  [arm64] multithreaded processes get locked up in futexes

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1531768/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1531768] Re: [arm64] multithreaded processes get locked up in futexes

2016-02-02 Thread Martin Pitt
** Summary changed:

- lxd and other commands get stuck on arm64 kernel and multiple CPUs
+ [arm64] multithreaded processes get locked up in futexes



[Bug 1531768] Re: [arm64] multithreaded processes get locked up in futexes

2016-02-02 Thread Martin Pitt
Darn, I now get the "instance kills itself after some time" behaviour on
the 4x CPU instance as well. nova console-log shows the blurb below, and
the ssh and lxd ports are dead (so console-log is the only thing I can
still learn anything from on that box).

Ubuntu Xenial Xerus (development branch) lxd-armhf2 ttyAMA0

lxd-armhf2 login: [  954.144506] INFO: rcu_sched detected stalls on CPUs/tasks:
[  954.145743]  1-...: (79 GPs behind) idle=284/0/0 softirq=407182/407182 fqs=1 
[  954.147202]  (detected by 3, t=15002 jiffies, g=21817, c=21816, q=1563)
[  954.148590] Call trace:
[  954.149123] rcu_sched kthread starved for 15002 jiffies! g21817 c21816 f0x0 s3 ->state=0x1
[ 3000.217089] INFO: task systemd:1 blocked for more than 120 seconds.
[ 3000.218529]   Not tainted 4.4.0-2-generic #16-Ubuntu
[ 3000.219628] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3000.221310] Call trace:
[ 3000.222562] INFO: task kworker/0:2:12463 blocked for more than 120 seconds.
[ 3000.223985]   Not tainted 4.4.0-2-generic #16-Ubuntu
[ 3000.225146] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3000.226741] Call trace:
[ 3000.227306] INFO: task (d-logind):15441 blocked for more than 120 seconds.
[ 3000.228685]   Not tainted 4.4.0-2-generic #16-Ubuntu
[ 3000.229834] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3000.231469] Call trace:
[ 3120.231067] INFO: task systemd:1 blocked for more than 120 seconds.
[ 3120.232501]   Not tainted 4.4.0-2-generic #16-Ubuntu
[ 3120.233629] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3120.235393] Call trace:
[ 3120.236702] INFO: task kworker/0:2:12463 blocked for more than 120 seconds.
[ 3120.238188]   Not tainted 4.4.0-2-generic #16-Ubuntu
[ 3120.239398] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3120.241140] Call trace:
[ 3120.241716] INFO: task (d-logind):15441 blocked for more than 120 seconds.
[ 3120.243223]   Not tainted 4.4.0-2-generic #16-Ubuntu
[ 3120.244366] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3120.245945] Call trace:
[ 3240.244955] INFO: task systemd:1 blocked for more than 120 seconds.
[ 3240.246398]   Not tainted 4.4.0-2-generic #16-Ubuntu
[ 3240.247526] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3240.249272] Call trace:
[ 3240.250568] INFO: task kworker/0:2:12463 blocked for more than 120 seconds.
[ 3240.252060]   Not tainted 4.4.0-2-generic #16-Ubuntu
[ 3240.253280] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3240.254966] Call trace:
[ 3240.255549] INFO: task (d-logind):15441 blocked for more than 120 seconds.
[ 3240.257073]   Not tainted 4.4.0-2-generic #16-Ubuntu
[ 3240.258259] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3240.259906] Call trace:
[ 3360.258901] INFO: task systemd:1 blocked for more than 120 seconds.
[ 3360.260349]   Not tainted 4.4.0-2-generic #16-Ubuntu
[ 3360.261475] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3360.263224] Call trace:
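
Since the same three tasks keep being reported every 120 seconds, it can
help to condense a console log like the one above before comparing runs.
The following is only a rough sketch, assuming the output of nova
console-log has been saved into a string; the function name
summarize_hung_tasks and the regular expression are made up for
illustration and only cover the "blocked for more than N seconds" lines
shown here.

```python
import re
from collections import defaultdict

# Matches hung-task watchdog lines such as:
# "[ 3000.217089] INFO: task systemd:1 blocked for more than 120 seconds."
# The task name itself may contain colons (e.g. "kworker/0:2"), so the
# PID is taken to be the digits after the last colon before " blocked".
HUNG_TASK = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<name>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def summarize_hung_tasks(console_log):
    """Map (task name, pid) to the list of uptime timestamps it was reported at."""
    reports = defaultdict(list)
    for line in console_log.splitlines():
        match = HUNG_TASK.search(line)
        if match:
            key = (match.group("name"), int(match.group("pid")))
            reports[key].append(float(match.group("ts")))
    return dict(reports)
```

Fed the log above, this would group the reports under systemd (PID 1),
kworker/0:2 (PID 12463) and (d-logind) (PID 15441); the rcu_sched stall
lines are deliberately ignored.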



[Bug 1531768] Re: [arm64] multithreaded processes get locked up in futexes

2016-02-02 Thread Martin Pitt
Some good news: With bug 1534545 fixed I was now able to upgrade to the
Xenial 4.4 kernel. On the 4x CPU instance two parallel adt-run loops
have now run for about two hours without any dmesg spew. Stéphane has
run "lxc-test-concurrent -j 16 -i 10" twice on the 8x CPU instance
successfully too.

Bad news: I rebooted the 8x CPU instance (also xenial du jour with the 4.4
kernel) and didn't do anything on it. After just sitting idle for an
hour or two, ssh stopped responding and nova console-log shows
http://paste.ubuntu.com/14857144/ (only a hard reboot helped). So it wet
its pants without being given any work at all.

So it appears it's not fully fixed yet, but much better. I'll do some
more smoke testing, and if 4x CPU instances work, this is good enough to
put into production. I'll keep the old Calxeda instances alive as a
fallback for a while, of course.
