Hello LOPSA,

We have a number of virtual machines on a CentOS 5.5 box. The CentOS box is a dual-Xeon system with 24 GB of memory and about 7.5 TB of disk. Most of the disk (/data) is configured as an XFS filesystem.

In each VM, the OS is Ubuntu 8.04, running our software. The /data filesystem above is NFS-mounted in each VM.

Externally, there are a large number of Ubuntu 8.04 clients, each connected to its respective VM via an OpenVPN tunnel. They are also connected to the main server via OpenVPN, to their location in /data.

Each VM, and the main server, is monitored via Zabbix.

Randomly, about 1-4 times a week, one of the VMs locks up. The only symptom I can see is that the process count starts climbing about 1-5 minutes before the machine becomes completely unresponsive. It happens both in the middle of the night and in the middle of the day, so it doesn't appear to be load related.

When the VM dies, I have a Zabbix process which restarts the VM, so the downtime is only about 1-2 minutes.

I tried putting in a small script, called by cron once a minute, which would capture the output of "ps -ax" into a file, but that script stops running when the symptoms start.

Frankly, I'm stumped. We've tried adding memory to the VM and adding CPUs to the VM; nothing seems to help.

Does anyone have any ideas or suggestions? We would even entertain the idea of someone coming in for a day or so to help figure things out.

Thanks in advance.

JBB

--
Enhancing your business through Technology

Bayer Technology Group
http://www.BayerTechnologyGroup.com
Jonathan Bayer, CEO
mailto:[email protected]
Work: (609) 632-1200
Mobile: (609) 658-9408
292 Evanston Dr.
East Windsor, NJ 08520

_______________________________________________
Discuss mailing list
[email protected]
https://lists.lopsa.org/cgi-bin/mailman/listinfo/discuss
This list provided by the League of Professional System Administrators
http://lopsa.org/
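One possible reason the cron-driven "ps -ax" capture goes quiet is that cron has to start a new process for every run, which may fail once the process count is climbing. The following is a minimal sketch of an alternative, not a tested fix: a single long-running sampler that reads /proc directly, never forks, and syncs every line to disk so the last samples survive the restart. It assumes a Python 3 interpreter is available in the VM (the stock interpreter on that release may be older); the function names, the 5-second interval, and the log path are all hypothetical. The log should live on a local disk rather than on the NFS-mounted /data, in case the NFS mount itself is what hangs.

```python
#!/usr/bin/env python3
"""Hypothetical sampler: logs process counts by reading /proc, without forking."""
import os
import sys
import time
from collections import Counter


def read_proc_snapshot(proc_root="/proc"):
    """Return a list of (pid, name, state) for each process under proc_root."""
    snapshot = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                stat = f.read()
        except OSError:
            continue  # process exited between listdir() and open()
        # Layout is "pid (comm) state ..."; comm may itself contain spaces
        # or parentheses, so split on the first "(" and the last ")".
        left, right = stat.find("("), stat.rfind(")")
        if left == -1 or right == -1:
            continue
        fields = stat[right + 1:].split()
        state = fields[0] if fields else "?"
        snapshot.append((int(entry), stat[left + 1:right], state))
    return snapshot


def summarize(snapshot, top=5):
    """Format the total count, a per-state count, and the most common names."""
    names = Counter(name for _, name, _ in snapshot)
    states = Counter(state for _, _, state in snapshot)
    state_str = ",".join("%s=%d" % (s, c) for s, c in sorted(states.items()))
    top_str = ",".join("%s=%d" % (n, c) for n, c in names.most_common(top))
    return "total=%d states[%s] top[%s]" % (len(snapshot), state_str, top_str)


def run(log_path, interval=5.0, iterations=None, proc_root="/proc"):
    """Append one summary line per interval; iterations=None means forever."""
    with open(log_path, "a") as log:
        count = 0
        while iterations is None or count < iterations:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write("%s %s\n" % (stamp, summarize(read_proc_snapshot(proc_root))))
            log.flush()
            os.fsync(log.fileno())  # push the sample to disk before a lockup
            count += 1
            if iterations is None or count < iterations:
                time.sleep(interval)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: ./proc_sampler.py /var/tmp/proc_sampler.log  (hypothetical path)
    run(sys.argv[1])
```

If the last few lines before a lockup show the count of processes in state D (uninterruptible sleep) growing, that would point toward processes blocked on I/O, which is worth comparing against the NFS mount; if one process name dominates the "top" list, that narrows down what is being spawned. Either reading is only a starting point for further investigation.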
