On Thu, May 19, 2016 at 10:34:05AM -0400, GUNA wrote:
> One of the card in my system is dead and rebooted to recover it.
> The system is running on Kernel 4.4.0 + some latest TIPC patches.
> Your earliest feedback of the issue is recommended.
>

At first I thought this might be a spinlock contention problem.
CPU2 is receiving TIPC traffic on a socket and is trying to grab a
spinlock in tipc_sk_rcv() context (probably sk->sk_lock.slock).
The first argument to spin_trylock_bh() is passed in RDI: ffffffffa01546cc

CPU3 is sending TIPC data; tipc_node_xmit()->tipc_sk_rcv() indicates
that it's traffic between sockets on the same machine.
I think this is the same socket as on CPU2, because we see the same
address in RDI: ffffffffa01546cc

But this made me unsure:
[686798.930348] ixgbe 0000:01:00.0 p19p2: initiating reset due to tx timeout

Is it contributing to the problem, or is it a side effect of the
spinlock contention? Driver (or HW) bugs _are_ fatal for a network
stack, but why would lock contention in a network stack cause NIC TX
timeouts?

Do all cards in your system have similar workloads?
Do you see this on multiple cards?

//E

_______________________________________________
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion