Hi -- We have a fairly heavily loaded PowerEdge 2900 server running the latest OpenSolaris build (snv_98). It handles things like mail scanning, some J2EE applications, Apache, Squid, ...
Seemingly at random, this machine "hangs": over the space of about 30-60 seconds the shell slows down and then stops responding entirely, as do all the daemon processes on the machine. We still get ping responses from its IP addresses, so the kernel must be at least partially alive, but userspace is completely hung. From vmstat and the logs, there appears to be plenty of spare swap, and I see no sign of a kernel panic in the logs or on the serial console.

How can we diagnose this? One complication is that the server is in a remote data center, and our only console access is through a console server attached to its ttya/COM1 serial port.

Thanks for any pointers,
-=- D. J.
-- This message posted from opensolaris.org