You could run vmstat -z while the load test is running and check whether any zone reports allocation failures (a non-zero FAIL column).

On Sep 2, 2012 1:58 AM, "Ragnar Lonn" <[email protected]> wrote:
> Hi Andy,
>
> I work for an online load testing service (loadimpact.com), and what we
> see is that the most common reason a server crashes during a load test
> is that it runs out of some vital system resource. Usually that is system
> memory, but network connections (sockets/file descriptors) are also a
> likely cause.
>
> You should have gotten some kind of error message in the system log, but
> if the problem is easily repeatable I would set up monitoring of at least
> memory and file descriptors, and see if you are near the limits when the
> machine freezes.
>
> Regards,
>
> /Ragnar
>
>
> On 09/01/2012 10:14 PM, Andy Young wrote:
>
>> Last night one of our servers went offline while I was load testing it.
>> When I got to the datacenter to check on it, the server seemed perfectly
>> fine. Everything was running on it, and there were no panics or any other
>> sign of a hard crash. The only problem was that the network was
>> unreachable. I couldn't connect to the box even from a laptop directly
>> attached to the ethernet port. I couldn't connect to anything from the
>> box either. It was as if the network controller had seized up. I
>> restarted netif and it didn't make a difference. Rebooting the machine,
>> however, solved the issue and everything went back to working great. I
>> restarted the load testing and reproduced the problem twice more this
>> morning, so at least it's repeatable. It feels like a network controller
>> / driver issue to me for a couple of reasons. First, the problem affects
>> the entire system. We're running FreeBSD 9 with about a half dozen
>> jails. Most of the jails are running Apache, but the one I was load
>> testing was running Jetty. However, if it was my application code
>> crashing I would expect the problem to at least be isolated to the jail
>> that hosts it. Instead, the entire machine and all jails in it lose
>> access to the network.
>>
>> Apart from not being able to access the network, I don't see any other
>> signs of problems. This is the first major problem I've had to debug in
>> FreeBSD, so I'm not a debugging expert by any means. There are no error
>> messages in /var/log/messages or dmesg apart from syslogd not being able
>> to reach the network. If anyone has ideas on where I can look for more
>> evidence of what is going wrong, I would really appreciate it.
>>
>> We're running FreeBSD 9.0-RELEASE-p3. The network controller is an
>> Intel(R) PRO/1000 Network Connection version - 2.2.5 configured with 6
>> IPs using aliases, five of which are used for jails.
>>
>> Thank you for the help!!
>>
>> Andy

_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hardware
To unsubscribe, send any mail to "[email protected]"
