Five identical V440s running Solaris 10, with storage on a Netapp via NFS,
providing access to a mailstore (so 80% read, 20% write).

At random times during the day, one machine's load will start to climb from
the baseline of 2-3 up to 50-60.  Under heavy load, I've seen it go up to 300.
CPU time is split almost 50/50 between user and kernel, with no idle and
nothing in I/O wait (according to top).

I suspect NFS problems, but the Netapp and switch traffic graphs look clean
and consistent.  Nothing shows network errors: not nfsstat, not the switch
ports, not the Netapp.

The problem also moves: one day it is on machine 1, the next day on machine 4,
etc.  No real pattern.

How would I go about discovering what the kernel is doing when this is
happening?  The simple dtrace stuff I've tried has just shown me a lot of
lwp_parks (the main app is heavily multithreaded, so that figures).

Does anyone have key dtrace probes they look at for NFS or DNLC problems?
 
 
This message posted from opensolaris.org
