Re: [CentOS] Help needed with NFS issue

2012-04-19 Thread Giovanni Tirloni
Jumping late on this thread, pardon my ignorance of some details... On Wed, Apr 18, 2012 at 4:35 PM, Steve Thompson s...@vgersoft.com wrote: Interesting. It looks like some kind of RPC failure. During the hang, I cannot contact the nfs service via RPC: # rpcinfo -t server nfs rpcinfo: RPC:

Re: [CentOS] Help needed with NFS issue

2012-04-19 Thread Steve Thompson
On Thu, 19 Apr 2012, Giovanni Tirloni wrote: Did you run this command during the hang or is it constantly returning you that? It is returning the timeout only during the hang; the rest of the time it works normally. If the latter, are you blocking UDP on either the server or the client? No

Re: [CentOS] Help needed with NFS issue

2012-04-19 Thread Nataraj
Have you looked at the rpcd process with top or ps to see what state it is in? What about running strace? What about your DNS server or any other (reverse) client lookup services that you might have enabled? Nataraj ___ CentOS mailing list

Re: [CentOS] Help needed with NFS issue

2012-04-19 Thread Steve Thompson
All, Many thanks to everyone who commented on this issue. I believe that I have solved it. It turns out that the number of nfsd's that I was running (32) was way too low. I observed that adding more nfsd's when NFS was hung always caused the hang to go away immediately. Now I am in the tuning
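A rough way to check whether the nfsd thread pool is the bottleneck, as described above, is to read the "th" line of /proc/net/rpc/nfsd. This is only a sketch: it assumes a kernel of that era (such as the 2.6.18 kernel in CentOS 5), where the first number after "th" is the thread count and the second is how many times all threads were busy at once; later kernels no longer populate these counters. The function name nfsd_threads and the value 64 are illustrative and not taken from the thread.

```shell
#!/bin/sh
# Sketch: print the nfsd thread count and the "all threads busy" counter.
# The stats file path is an argument so the function can be tried on a copy.
nfsd_threads() {
    stats_file="${1:-/proc/net/rpc/nfsd}"
    # Line format: "th <threads> <times-all-busy> <histogram...>"
    awk '$1 == "th" { printf "threads=%s all_busy=%s\n", $2, $3 }' "$stats_file"
}

# Illustrative follow-up (run as root on the NFS server; 64 is an assumed value):
#   rpc.nfsd 64                      # raise the thread count at runtime
#   RPCNFSDCOUNT=64                  # in /etc/sysconfig/nfs, to persist it
```

If all_busy keeps climbing while clients hang, that would be consistent with too few nfsd threads; if it stays flat, the cause is probably elsewhere.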

Re: [CentOS] Help needed with NFS issue

2012-04-18 Thread Steve Thompson
Interesting. It looks like some kind of RPC failure. During the hang, I cannot contact the nfs service via RPC: # rpcinfo -t server nfs rpcinfo: RPC: Timed out program 13 version 0 is not available even though it is supposedly available: # rpcinfo -p server program vers proto port

Re: [CentOS] Help needed with NFS issue

2012-04-18 Thread Ross Walker
On Apr 18, 2012, at 3:35 PM, Steve Thompson s...@vgersoft.com wrote: Interesting. It looks like some kind of RPC failure. During the hang, I cannot contact the nfs service via RPC: # rpcinfo -t server nfs rpcinfo: RPC: Timed out program 13 version 0 is not available even though it

Re: [CentOS] Help needed with NFS issue

2012-04-18 Thread Steve Thompson
On Wed, 18 Apr 2012, Ross Walker wrote: Is iptables disabled? If not, problem with rules or RPC helper? Yes, iptables is not in use. What about selinux? Disabled. -Steve

Re: [CentOS] Help needed with NFS issue

2012-04-17 Thread Ross Walker
On Apr 17, 2012, at 5:40 PM, Steve Thompson s...@vgersoft.com wrote: I have four NFS servers running on Dell hardware (PE2900) under CentOS 5.7, x86_64. The number of NFS clients is about 170. A few days ago, one of the four, with no apparent changes, stopped responding to NFS requests

Re: [CentOS] Help needed with NFS issue

2012-04-17 Thread Ross Walker
On Apr 17, 2012, at 6:49 PM, Ross Walker rswwal...@gmail.com wrote: Just a shot in the dark here. Take a look at the NIC and switch port flow control status during an outage; they may be paused due to switch load. Is there anything else on the network switches that might flood them every

Re: [CentOS] Help needed with NFS issue

2012-04-17 Thread Steve Thompson
On Tue, 17 Apr 2012, Ross Walker wrote: Take a look at the NIC and switch port flow control status during an outage; they may be paused due to switch load. Is there anything else on the network switches that might flood them every half hour for a two minute duration? Unfortunately not. All

Re: [CentOS] Help needed with NFS issue

2012-04-17 Thread Fajar Priyanto
Also a shot in the dark from me. There may be some IP conflict in the network. ___ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos

Re: [CentOS] Help needed with NFS issue

2012-04-17 Thread Steve Thompson
On Tue, 17 Apr 2012, Ross Walker wrote: Let me also add that constant spanning tree convergence can cause this too. Make sure your choice of protocol and priority suit your topology and equipment. Gives me an idea! The switch is under control of different people. I did have a new VLAN

Re: [CentOS] Help needed with NFS issue

2012-04-17 Thread Steve Thompson
On Wed, 18 Apr 2012, Fajar Priyanto wrote: Also a shot in the dark from me. There may be some IP conflict in the network. Yes, I thought of that one too. I am in control of all IPs on the network, so I am sure that nothing changed around the time that the trouble started. I checked for that

Re: [CentOS] Help needed with NFS issue

2012-04-17 Thread Ross Walker
On Apr 17, 2012, at 6:57 PM, Steve Thompson wrote: On Tue, 17 Apr 2012, Ross Walker wrote: Let me also add that constant spanning tree convergence can cause this too. Make sure your choice of protocol and priority suit your topology and equipment. Gives me an idea! The switch is under