Public bug reported:

Running 2.6.32-37-server on Ubuntu 10.04. The issue is a bit hard to
explain, so bear with me.

- I've got a DRBD setup running, but my DRBD device got messed up. I'm not
entirely sure how, but it happened.
- The DRBD resource is configured with "verify-alg sha1" and "csums-alg sha1".
- Now when drbdsetup runs to initialize the device, the command doesn't exit, 
and there's some crash information in syslog.
- I can also see the command "/sbin/modprobe -q -- sha1_all" running. It never
seems to exit either, whereas on the DRBD peer node (identical setup) it exits
immediately.
- I also see errors about VIA Padlock devices not existing.

My assumption is that because there's some bad data on the block device
underlying the DRBD resource, DRBD tries to check the device, tries to
use SHA1 as configured, but somehow loading the SHA1 module deadlocks
with whatever DRBD is trying to do. If I remove the kernel module
"padlock-sha" and hard reboot, everything works as expected.
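
A less drastic variant of that workaround might be to blacklist the module
instead of removing it. The following is only a sketch and has not been
verified on the system above. It assumes padlock-sha is being pulled in
through a module alias (such as the sha1_all request), since a modprobe
"blacklist" entry only suppresses alias-based loading. The function name and
the file name are made up for illustration.

```shell
#!/bin/sh
# Sketch: write a modprobe.d blacklist entry for padlock-sha.
# The function name and file name are illustrative, not an existing tool.
blacklist_padlock_sha() {
    # Directory to write to; defaults to the usual modprobe config dir.
    conf_dir="${1:-/etc/modprobe.d}"
    # "blacklist" makes modprobe ignore the module's internal aliases;
    # an explicit "modprobe padlock-sha" would still load it.
    printf 'blacklist padlock-sha\n' > "$conf_dir/blacklist-padlock-sha.conf"
}
```

Calling blacklist_padlock_sha as root with no argument would write to
/etc/modprobe.d. A reboot would presumably still be needed to clear the
hung modprobe.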

This is what's in syslog:

Jan 16 13:29:21 htz0 kernel: [  178.247889] BUG: soft lockup - CPU#7 stuck for 61s! [kstop/7:1657]
Jan 16 13:29:21 htz0 kernel: [  178.248206] Modules linked in: padlock_sha(-) sha1_generic drbd ipt_REJECT ipt_LOG xt_limit xt_tcpudp ipt_addrtype xt_state ip6table_filter ip6_tables nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp nf_conntrack iptable_filter ip_tables x_tables fbcon tileblit font lp bitblit softcursor vga16fb video parport xhci vgastate output multipath linear aacraid 3w_9xxx 3w_xxxx raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx raid1 raid0 sata_nv r8169 ahci mii sata_sil sata_via
Jan 16 13:29:21 htz0 kernel: [  178.248245] CPU 7:
Jan 16 13:29:21 htz0 kernel: [  178.248247] Modules linked in: padlock_sha(-) sha1_generic drbd ipt_REJECT ipt_LOG xt_limit xt_tcpudp ipt_addrtype xt_state ip6table_filter ip6_tables nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp nf_conntrack iptable_filter ip_tables x_tables fbcon tileblit font lp bitblit softcursor vga16fb video parport xhci vgastate output multipath linear aacraid 3w_9xxx 3w_xxxx raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx raid1 raid0 sata_nv r8169 ahci mii sata_sil sata_via
Jan 16 13:29:21 htz0 kernel: [  178.248282] Pid: 1657, comm: kstop/7 Not tainted 2.6.32-37-server #81-Ubuntu System Product Name
Jan 16 13:29:21 htz0 kernel: [  178.248284] RIP: 0010:[<ffffffff810b7ca5>]  [<ffffffff810b7ca5>] stop_cpu+0x85/0xf0
Jan 16 13:29:21 htz0 kernel: [  178.248291] RSP: 0018:ffff8804170e3df0  EFLAGS: 00000293
Jan 16 13:29:21 htz0 kernel: [  178.248293] RAX: 0000000000000001 RBX: ffff8804170e3e00 RCX: 0000000000000000
Jan 16 13:29:21 htz0 kernel: [  178.248296] RDX: ffffffff81869248 RSI: 0000000000000100 RDI: ffffffff81869240
Jan 16 13:29:21 htz0 kernel: [  178.248298] RBP: ffffffff81013c6e R08: ffff8804170e2000 R09: 0000000000000000
Jan 16 13:29:21 htz0 kernel: [  178.248300] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
Jan 16 13:29:21 htz0 kernel: [  178.248302] R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000400
Jan 16 13:29:21 htz0 kernel: [  178.248305] FS:  0000000000000000(0000) GS:ffff88000ffc0000(0000) knlGS:0000000000000000
Jan 16 13:29:21 htz0 kernel: [  178.248308] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Jan 16 13:29:21 htz0 kernel: [  178.248310] CR2: 00007fa7f1c65beb CR3: 0000000001001000 CR4: 00000000000406e0
Jan 16 13:29:21 htz0 kernel: [  178.248312] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 16 13:29:21 htz0 kernel: [  178.248314] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jan 16 13:29:21 htz0 kernel: [  178.248316] Call Trace:
Jan 16 13:29:21 htz0 kernel: [  178.248323]  [<ffffffff81081597>] ? run_workqueue+0xc7/0x1a0
Jan 16 13:29:21 htz0 kernel: [  178.248328]  [<ffffffff81081713>] ? worker_thread+0xa3/0x110
Jan 16 13:29:21 htz0 kernel: [  178.248332]  [<ffffffff81086140>] ? autoremove_wake_function+0x0/0x40
Jan 16 13:29:21 htz0 kernel: [  178.248337]  [<ffffffff81081670>] ? worker_thread+0x0/0x110
Jan 16 13:29:21 htz0 kernel: [  178.248340]  [<ffffffff81085dc6>] ? kthread+0x96/0xa0
Jan 16 13:29:21 htz0 kernel: [  178.248344]  [<ffffffff810141aa>] ? child_rip+0xa/0x20
Jan 16 13:29:21 htz0 kernel: [  178.248348]  [<ffffffff81085d30>] ? kthread+0x0/0xa0
Jan 16 13:29:21 htz0 kernel: [  178.248351]  [<ffffffff810141a0>] ? child_rip+0x0/0x20

Hope that helps; feel free to ask for more information.

Jens

** Affects: drbd8 (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to drbd8 in Ubuntu.
https://bugs.launchpad.net/bugs/917134

Title:
  drbd8 kernel module and padlock-sha kernel module in deadlock

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/drbd8/+bug/917134/+subscriptions
