Our NFS service hangs periodically. The symptom is that all user desktops and shells freeze for between 2 and 17 seconds.
The following clients are connected to our Solaris NFS server:

QTY    DESCRIPTION
100+   Solaris clients, NFSv3 and NFSv4
50+    Linux clients, NFSv3

The hangs usually occur while the filesystems are being shared on the server. However, some hangs occur when no share operation is running. The problems also depend on our NFS load. All of the clients hang at the same time, and snoop shows that there are no NFS replies from the server. After a few seconds, the server recovers.

We recently switched from NIS+ to LDAP. After opening an SO and an escalation, we determined that the problem seemed to be related to slow response times from our LDAP servers. We upgraded our V100 to a SunBlade 2000 with 2x750MHz CPUs, and this has helped.

We also determined that the server hangs more frequently when the HA-NFS monitor decides that it needs to share our 180 filesystems again, which it does every time there is a mount/umount of any filesystem on the HA-NFS server. We have taken steps to limit the number of times a mount/umount is done on the server. We also replaced every netgroup in the dfstab with the list of machines that the netgroup represents. This helps, but we still have the hangs.

We are monitoring the hangs with a simple program that creates a randomly named 8k file and then unlinks it. If this takes longer than 1 second, we report the elapsed time.

Are there any ideas or suggestions about what might be going on here?

Thanks!

This message posted from opensolaris.org
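
For reference, the monitoring probe described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual monitoring program: the names `probe_once` and `monitor`, the `nfsprobe-` file prefix, the polling interval, and the use of `fsync` to force the write out to the server are all assumptions made for the sketch.

```python
import os
import time
import uuid

BLOCK_SIZE = 8 * 1024      # 8k payload, as described in the post
THRESHOLD_SECONDS = 1.0    # report any probe slower than this


def probe_once(directory):
    """Create a randomly named 8k file in `directory`, unlink it,
    and return the elapsed wall-clock time in seconds."""
    # Hypothetical naming scheme; any collision-free random name would do.
    path = os.path.join(directory, "nfsprobe-" + uuid.uuid4().hex)
    start = time.monotonic()
    # Mode "xb" fails if the file already exists, so we never clobber data.
    with open(path, "xb") as f:
        f.write(b"\0" * BLOCK_SIZE)
        f.flush()
        # Assumption: force the data to the server so the client cache
        # does not hide a stalled server.
        os.fsync(f.fileno())
    os.unlink(path)
    return time.monotonic() - start


def monitor(directory, interval=1.0, iterations=None):
    """Run the probe repeatedly and report probes that exceed the threshold.

    `iterations=None` means run until interrupted. Returns the list of
    slow probe durations that were reported.
    """
    slow = []
    count = 0
    while iterations is None or count < iterations:
        elapsed = probe_once(directory)
        if elapsed > THRESHOLD_SECONDS:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(f"{stamp} probe took {elapsed:.2f}s")
            slow.append(elapsed)
        count += 1
        time.sleep(interval)
    return slow
```

Calling `monitor()` with the path of an NFS-mounted directory on a client would log each hang along with its length. Running the same probe from several clients at once would show whether they all stall at the same moment.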