On Mon, 27 Jul 2015 15:54:01 +0300
Nikolay Borisov <[email protected]> wrote:

> Hello, 
> 
> I have a machine running kernel 3.13.3, in fact I have 3 such machines 

3.13.3 is a very old kernel, and a lot has changed since. There are also
newer stable kernels in that series (3.13.11).

> which are connected via corosync in a cluster. On one of the machines 
> I observed the following lock-up: 
> 
> Jul 26 06:01:14 shrek kernel: ------------[ cut here ]------------


> Jul 26 06:01:14 shrek kernel: NMI backtrace for cpu 8
> Jul 26 06:01:14 shrek kernel: CPU: 8 PID: 29979 Comm: sg-queue Tainted:       
>    W  O 3.13.3-sg1 #6
> Jul 26 06:01:14 shrek kernel: Hardware name: Supermicro 
> X9DRD-7LN4F(-JBOD)/X9DRD-EF/X9DRD-7LN4F, BIOS 3.0a 12/05/2013
> Jul 26 06:01:14 shrek kernel: task: ffff880622796e40 ti: ffff880622797398 
> task.ti: ffff880622797398
> Jul 26 06:01:14 shrek kernel: RIP: 0010:[<ffffffff815ab0bd>]  
> [<ffffffff815ab0bd>] ffffffff815ab0bd
> Jul 26 06:01:14 shrek kernel: RSP: 0018:ffff88067e6a1e68  EFLAGS: 00000097
> Jul 26 06:01:14 shrek kernel: RAX: 000000000000002e RBX: ffff880622796e40 
> RCX: ffff880622796e40
> Jul 26 06:01:14 shrek kernel: RDX: 0000000000000000 RSI: ffff88067e6a1eb8 
> RDI: ffff880621f44a08
> Jul 26 06:01:14 shrek kernel: RBP: ffff88067e6a1e68 R08: ffff88067e6a1eb8 
> R09: 000000000000751a
> Jul 26 06:01:14 shrek kernel: R10: 0000000000000008 R11: 0000000000000246 
> R12: ffff88067e6a1e98
> Jul 26 06:01:14 shrek kernel: R13: ffff88067e6a1eb8 R14: 0000000002d7c680 
> R15: 0000000002d7ccb0
> Jul 26 06:01:14 shrek kernel: FS:  0000037e10229700(0000) 
> GS:ffff88087fc40000(0000) knlGS:0000000000000000
> Jul 26 06:01:14 shrek kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 
> 0000000080050033
> Jul 26 06:01:14 shrek kernel: CR2: 0000037e0f8978d0 CR3: 00000000015c8000 
> CR4: 00000000001607f0
> Jul 26 06:01:14 shrek kernel: Stack:
> Jul 26 06:01:14 shrek kernel: ffff88067e6a1e88 ffffffff81053620 
> 0000000000000000 0000000000000002
> Jul 26 06:01:14 shrek kernel: ffff88067e6a1ea8 ffffffff8105369a 
> 0000000000000000 0000000002d7ccb0
> Jul 26 06:01:14 shrek kernel: ffff88067e6a1ef8 ffffffff81056fcb 
> 0000000000000000 ffffffffffff4111
> Jul 26 06:01:14 shrek kernel: Call Trace:
> Jul 26 06:01:14 shrek kernel: [<ffffffff81053620>] 
> __set_current_blocked+0x30/0x60
> Jul 26 06:01:14 shrek kernel: [<ffffffff8105369a>] sigprocmask+0x4a/0x80
> Jul 26 06:01:14 shrek kernel: [<ffffffff81056fcb>] 
> SyS_rt_sigprocmask+0xbb/0xd0
> Jul 26 06:01:14 shrek kernel: [<ffffffff815ac028>] 
> system_call_fastpath+0x16/0x1b
> Jul 26 06:01:14 shrek kernel: [<ffffffff815ab6bd>] ? retint_swapgs+0xe/0x11
> Jul 26 06:01:14 shrek kernel: Code: 48 83 c4 08 b8 01 00 00 00 5b c9 c3 66 90 
> 55 48 89 e5 fa b8 00 01 00 00 f0 66 0f c1 07 89 c2 66 c1 ea 08 38 c2 75 04 c9 
> c3 f3 90 <0f> b6 07 38 c2 75 f7 c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48



> My investigation showed that the CPUs are in the following state: 
> CPUS 1, 2, 3, 6, 7, 8, 9, 10, 11, 18, 21, 22 are locked on the spinlock 
> inside __lock_task_sighand, in a rcu_read_lock() section. Whereas 

Except that CPU 8 is blocked in sigprocmask(), not in __lock_task_sighand().

> CPUS: 4, 5, 13, 14, 15, 16, 17, 19, 20, 23 are staying idle.
> 
> From the logs I gather that 2 different subsystems detected that:
> the NMI watchdog and the rcu stall detector (cpu's 8 and 18). 
> While looking at the code and googling around I discovered the 
> following commit from PeterZ: https://lkml.org/lkml/2008/11/13/157 
> 
> Based on that, I think what could be happening is that the sighand itself is 
> being freed while we are in the grace period inside __lock_task_sighand 
> but the slab page itself is not freed as per the semantics of 
> SLAB_DESTROY_BY_RCU. I looked up the source of this function in the 
> latest kernels and saw that Oleg had put a comment clarifying the 
> semantics but I'm still not convinced that it is safe. What if 
> we are trying to lock the spinlock before this particular slab is 
> initialised with sighand_ctor? Looking at the code and reading
> Peter's explanation of SLAB_RCU I cannot identify which portion

This confused me a while back but it is safe.

See this thread:
 https://lkml.org/lkml/2014/6/3/759

The ctor isn't called on each new allocation, only when the slab itself is
created, so the spinlock stays initialized across free and reuse.
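
To make that concrete, here is a rough userspace sketch of the idea. It is
not kernel code: the toy slab, the atomic_flag spinlock and every name in it
(slab_create, slab_alloc, slab_free, toy_lock_task_sighand, ...) are made up
for illustration, and it ignores RCU and interrupt handling entirely. It only
shows why locking a freed-and-reused object is harmless as long as the lock
is initialized once by the ctor and the pointer is rechecked after locking.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
struct sighand {
	atomic_flag siglock;		/* toy spinlock */
};

struct task {
	struct sighand *sighand;	/* NULL once the task has exited */
};

#define SLAB_OBJS 4
static struct sighand slab[SLAB_OBJS];	/* memory never changes type */
static int slab_used[SLAB_OBJS];
static int ctor_calls;

/* Constructor: runs once per object when the slab is set up. */
static void sighand_ctor(struct sighand *sh)
{
	atomic_flag_clear(&sh->siglock);
	ctor_calls++;
}

static void slab_create(void)
{
	for (int i = 0; i < SLAB_OBJS; i++)
		sighand_ctor(&slab[i]);
}

/* Allocation hands out an already constructed object; no ctor call. */
static struct sighand *slab_alloc(void)
{
	for (int i = 0; i < SLAB_OBJS; i++) {
		if (!slab_used[i]) {
			slab_used[i] = 1;
			return &slab[i];
		}
	}
	return NULL;
}

/* Freeing leaves the lock word valid, so a late locker still finds a lock. */
static void slab_free(struct sighand *sh)
{
	ptrdiff_t i = sh - slab;

	assert(i >= 0 && i < SLAB_OBJS);
	slab_used[i] = 0;
}

static void toy_spin_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;	/* spin */
}

static void toy_spin_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/*
 * Lock, then recheck: if the task's sighand changed while we were taking
 * the lock, we locked a stale (possibly reused) object, so drop it and retry.
 */
static struct sighand *toy_lock_task_sighand(struct task *t)
{
	for (;;) {
		struct sighand *sh = t->sighand;

		if (!sh)
			return NULL;
		toy_spin_lock(&sh->siglock);
		if (sh == t->sighand)
			return sh;	/* still ours; returned locked */
		toy_spin_unlock(&sh->siglock);
	}
}
```

A caller would run slab_create() once, take objects with slab_alloc(), and
release the lock returned by toy_lock_task_sighand() with toy_spin_unlock().
The recheck after locking plays the role of the try_get_ref() step in the
quoted pseudocode.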

But as I said, your kernel is very old and lots of things have changed.
Unless you can reproduce on a newer kernel, most people won't spend time
looking at this.

-- Steve

 
> of __lock_task_sighand implements the following logic:
> 
> if (!try_get_ref(obj)) // might fail for free objects
>       goto again;
> 
> 
> Also what's more peculiar was the fact that when "Shrek" experienced
> this issue, resources were migrated to the 2nd machine and it 
> also experienced the same behavior - CPUs locked on the sighand
> spinlock. 
> 
> 
> That's the first time I've seen this particular issue. And I'm not
> really able to come up with a way where a task's sighand could be 
> corrupted in such a way as to prevent acquiring the lock. 
> 
> Regards, 
> Nikolay
