Hi,

This is likely related to TCP stack tuning. Possibly the OSS node does not have enough free sockets for new connections.
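One rough way to check socket usage on the OSS is to read /proc/net/sockstat while the errors are occurring. The snippet below is only a sketch: `parse_sockstat` is a hypothetical helper name, and it assumes the usual Linux layout of alternating name/value pairs on each line. High TCP "tw" or "orphan" counts under load may hint at socket pressure, but they do not by themselves confirm the cause of the HELLO errors.

```python
from pathlib import Path


def parse_sockstat(text):
    """Parse /proc/net/sockstat-style text into {section: {field: int}}."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        section, _, rest = line.partition(":")
        tokens = rest.split()
        # Fields are alternating name/value pairs, e.g. "inuse 8 orphan 0".
        stats[section.strip()] = {
            name: int(value) for name, value in zip(tokens[0::2], tokens[1::2])
        }
    return stats


if __name__ == "__main__":
    path = Path("/proc/net/sockstat")  # present on Linux only
    if path.exists():
        for section, fields in parse_sockstat(path.read_text()).items():
            print(section, fields)
```

Running this on the affected OSS nodes during the nightly traffic peaks and comparing against a quiet period would show whether socket counts actually climb when the errors appear.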

On Tue, 2009-11-24 at 09:35 +0100, Heiko Schröter wrote:
> Hello,
>
> on three of eight OSTs i can see sporadic messages like these:
>
> sadosrd21
> Nov 24 09:11:52 sadosrd21 LustreError: 5518:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.133
> Nov 24 09:12:01 sadosrd21 LustreError: 5516:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.19
>
> sadosrd24
> Nov 21 01:42:13 sadosrd24 LustreError: 9097:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.111
> Nov 21 01:42:13 sadosrd24 LustreError: 9098:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.114
> Nov 22 04:01:59 sadosrd24 LustreError: 9096:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.116
> Nov 23 01:42:16 sadosrd24 LustreError: 9099:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.34
> Nov 23 01:42:27 sadosrd24 LustreError: 9096:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -104 reading HELLO from 192.168.16.34
> Nov 23 01:42:59 sadosrd24 LustreError: 9096:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -104 reading HELLO from 192.168.16.116
>
> sadosrd25
> Nov 22 04:02:06 sadosrd25 LustreError: 5050:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.19
> Nov 23 04:00:53 sadosrd25 LustreError: 5050:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.114
> Nov 23 04:01:01 sadosrd25 LustreError: 5049:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.115
> Nov 23 04:01:02 sadosrd25 LustreError: 5048:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.109
> Nov 23 09:12:57 sadosrd25 LustreError: 5050:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.111
> Nov 24 01:41:40 sadosrd25 LustreError: 5048:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.110
> Nov 24 01:42:57 sadosrd25 LustreError: 5051:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.111
> Nov 24 01:43:03 sadosrd25 LustreError: 5049:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -104 reading HELLO from 192.168.16.110
> Nov 24 01:43:08 sadosrd25 LustreError: 5051:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.100
> Nov 24 01:43:11 sadosrd25 LustreError: 5050:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error -11 reading HELLO from 192.168.16.122
>
> Error Number:
> /usr/include/asm-generic/errno-base.h:#define EAGAIN 11 /* Try again */
> /usr/include/asm-generic/errno.h:#define ECONNRESET 104 /* Connection reset by peer */
>
> They seem to be related to heavy network traffic to and from this OST.
> Network driver e1000.

_______________________________________________
Lustre-discuss mailing list
http://lists.lustre.org/mailman/listinfo/lustre-discuss
