On Mon, 27 Jul 2015 15:54:01 +0300 Nikolay Borisov <[email protected]> wrote:
> Hello, > > I have a machine running kernel 3.13.3, in fact I have 3 such machines 3.13.3 is a very old kernel. A lot has changed. Also there's newer stable kernels of that release too (3.13.11). > which are connected via corosync in a cluster. On one of the machines > I observed the following lock-up: > > Jul 26 06:01:14 shrek kernel: ------------[ cut here ]------------ > Jul 26 06:01:14 shrek kernel: NMI backtrace for cpu 8 > Jul 26 06:01:14 shrek kernel: CPU: 8 PID: 29979 Comm: sg-queue Tainted: > W O 3.13.3-sg1 #6 > Jul 26 06:01:14 shrek kernel: Hardware name: Supermicro > X9DRD-7LN4F(-JBOD)/X9DRD-EF/X9DRD-7LN4F, BIOS 3.0a 12/05/2013 > Jul 26 06:01:14 shrek kernel: task: ffff880622796e40 ti: ffff880622797398 > task.ti: ffff880622797398 > Jul 26 06:01:14 shrek kernel: RIP: 0010:[<ffffffff815ab0bd>] > [<ffffffff815ab0bd>] ffffffff815ab0bd > Jul 26 06:01:14 shrek kernel: RSP: 0018:ffff88067e6a1e68 EFLAGS: 00000097 > Jul 26 06:01:14 shrek kernel: RAX: 000000000000002e RBX: ffff880622796e40 > RCX: ffff880622796e40 > Jul 26 06:01:14 shrek kernel: RDX: 0000000000000000 RSI: ffff88067e6a1eb8 > RDI: ffff880621f44a08 > Jul 26 06:01:14 shrek kernel: RBP: ffff88067e6a1e68 R08: ffff88067e6a1eb8 > R09: 000000000000751a > Jul 26 06:01:14 shrek kernel: R10: 0000000000000008 R11: 0000000000000246 > R12: ffff88067e6a1e98 > Jul 26 06:01:14 shrek kernel: R13: ffff88067e6a1eb8 R14: 0000000002d7c680 > R15: 0000000002d7ccb0 > Jul 26 06:01:14 shrek kernel: FS: 0000037e10229700(0000) > GS:ffff88087fc40000(0000) knlGS:0000000000000000 > Jul 26 06:01:14 shrek kernel: CS: 0010 DS: 0000 ES: 0000 CR0: > 0000000080050033 > Jul 26 06:01:14 shrek kernel: CR2: 0000037e0f8978d0 CR3: 00000000015c8000 > CR4: 00000000001607f0 > Jul 26 06:01:14 shrek kernel: Stack: > Jul 26 06:01:14 shrek kernel: ffff88067e6a1e88 ffffffff81053620 > 0000000000000000 0000000000000002 > Jul 26 06:01:14 shrek kernel: ffff88067e6a1ea8 ffffffff8105369a > 0000000000000000 0000000002d7ccb0 > Jul 26 06:01:14 shrek kernel: ffff88067e6a1ef8 ffffffff81056fcb > 0000000000000000 ffffffffffff4111 > Jul 26 06:01:14 shrek kernel: Call Trace: > Jul 26 06:01:14 shrek kernel: [<ffffffff81053620>] > __set_current_blocked+0x30/0x60 > Jul 26 06:01:14 shrek kernel: [<ffffffff8105369a>] sigprocmask+0x4a/0x80 > Jul 26 06:01:14 shrek kernel: [<ffffffff81056fcb>] > SyS_rt_sigprocmask+0xbb/0xd0 > Jul 26 06:01:14 shrek kernel: [<ffffffff815ac028>] > system_call_fastpath+0x16/0x1b > Jul 26 06:01:14 shrek kernel: [<ffffffff815ab6bd>] ? retint_swapgs+0xe/0x11 > Jul 26 06:01:14 shrek kernel: Code: 48 83 c4 08 b8 01 00 00 00 5b c9 c3 66 90 > 55 48 89 e5 fa b8 00 01 00 00 f0 66 0f c1 07 89 c2 66 c1 ea 08 38 c2 75 04 c9 > c3 f3 90 <0f> b6 07 38 c2 75 f7 c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 > My investigation showed up that the CPUs are in the following state: > CPUS 1, 2, 3, 6, 7, 8, 9, 10, 11, 18, 21, 22 are locked on the spinlock > inside __lock_task_sighand, in a rcu_read_lock() section. Whereas Except CPU 8 is blocked in sigprocmask() not in __lock_task_sighand(). > CPUS: 4, 5, 13, 14, 15, 16, 17, 19, 20, 23 are staying idle. > > >From the logs I gather that 2 different subsystems detected that: > the NMI watchdog and the rcu stall detector (cpu's 8 and 18). > While looking at the code and googling around I discovered the > following commit from PeterZ: https://lkml.org/lkml/2008/11/13/157 > > Based on that I think could be happening is that the sighand itself is > being freed while we are in the grace period inside __lock_task_sighand > but the slab page itself is not freed as per the semantics of > SLAB_DESTROY_BY_RCU. I looked up the source of this function in the > latest kernels and saw that Oleg had put a comment clarifying the > semantics but I'm still not convinced that it is safe. What if > we are trying to lock the spinlock before this particular slab is > initialised with sighand_ctor? Looking at the code and reading > Peter's explanation of SLAB_RCU I cannot identify which portion This confused me a while back but it is safe. See this thread: https://lkml.org/lkml/2014/6/3/759 The ctor isn't called by new allocations, but when the slab is created. But as I said, your kernel is very old and lots of things have changed. Unless you can reproduce on a newer kernel, most wont spend time looking at this. -- Steve > of __lock_task_sighand implements the following logic: > > if (!try_get_ref(obj)) // might fail for free objects > goto again; > > > Also what's more peculiar was the fact that when "Shrek" experienced > this issue, resources were migrated to the 2nd machine and it > also experienced the same behavior - CPUs locked on the sighand > spinlock. > > > That's the first time I've seen this particular issue. And I'm not > really able to come up with a way where a task's sighand could be > corrupted in a such a way as to prevent acquiring the lock. > > Regards, > Nikolay -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

