How were the I/O wait states at the time?  Can you try a netdump to a remote
syslog?
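
If sar's one-minute samples are too coarse to catch the spike, something like the rough sketch below could log iowait at shorter intervals. It assumes the aggregate "cpu" line in /proc/stat uses the field order documented in proc(5) (user, nice, system, idle, iowait, irq, softirq, steal); the function names are made up for illustration.

```python
import time


def cpu_times(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # Keep only the first eight counters; later fields (guest time)
            # are assumed to be already included in 'user'.
            return [int(v) for v in fields[1:9]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two snapshots."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    # Index 4 is iowait under the assumed field order.
    return 100.0 * deltas[4] / total if total else 0.0


def watch(interval=5, count=12):
    """Print iowait every `interval` seconds, `count` times (hypothetical helper)."""
    def snapshot():
        with open("/proc/stat") as f:
            return cpu_times(f.read())

    prev = snapshot()
    for _ in range(count):
        time.sleep(interval)
        cur = snapshot()
        print("%s iowait=%.1f%%" % (time.strftime("%H:%M:%S"),
                                    iowait_percent(prev, cur)))
        prev = cur
```

Calling watch(5, 12) would print one line every five seconds for a minute; redirecting that output to a file on another host may help the data survive a hang.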


On Thu, Apr 30, 2009 at 3:20 PM, Jason Edgecombe <ja...@rampaginggeek.com> wrote:

> Hi everyone,
>
> I administer Linux servers for a university. Two of our servers have
> become unresponsive three times (twice on one server) in the
> past week. These servers are general-purpose timesharing machines and
> were under a steady load of around 8. We have students running compute
> jobs for last-minute homework assignments. I know that some students are
> working on an intro to threading class. The most telling data is that
> ganglia shows a load spike of 50 before one of the outages.
>
> The servers are Dell PowerEdge 860s with 8GB of RAM and a single
> quad-core Xeon CPU. The OS is RHEL 5.2 64-bit Desktop.
>
> I have the following limits in place:
> core file size          (blocks, -c) 0
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 16367
> max locked memory       (kbytes, -l) 32
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 1024
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 10240
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 200
> virtual memory          (kbytes, -v) 2057564
> file locks                      (-x) unlimited
>
> I'm recording sar data once per minute. The only notable things are a peak
> in context switches before the outage and that the interrupts all go to core 0.
>
> How can I prevent the servers from becoming unresponsive even under heavy
> load?
>
> What can I do to troubleshoot further?
>
> Thanks,
> Jason
>
> _______________________________________________
> rhelv5-list mailing list
> rhelv5-list@redhat.com
> https://www.redhat.com/mailman/listinfo/rhelv5-list
>