On 02/14/2013 10:55 AM, Fábio Rabelo wrote:
Today my system became unresponsive ... and then all VMs went to
"unknown state".
After some digging, none of the NFS mounts shows any content, but there
are no errors in the log ?!?
Then I rebooted one node to find out if things would come back to life ...
Second problem: the system does not reboot !!
After sending the command via the web interface, the system returned a
message about stopping all VMs and containers, and stayed there seemingly forever !!!
After waiting 20 minutes, I decided to try a reboot via ssh. Again, I
received a message that the system was shutting down, and it stayed like that forever again !
After another 20 minutes, I tried to connect via ssh. The connection worked
and the "uptime" command returned 14 days of uptime !!!
Then I pressed the reset button. After the system came back online, the
storage did not connect, with this message in the log:
WARNING: mount error: mount.nfs: Unknown error 32768
Google returns nothing referring to this error ...
I am lost here... don't know where to go ...
Any Ideas ?!?!?
Fábio Rabelo
_______________________________________________
pve-user mailing list
[email protected]
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
Excuse the late reply. Maybe you or someone can try this next time:
If you can't restart the node, then check for a stuck rgmanager
process. It can get into an unkillable state.
So run:
ps afx
222025 ? Ss 0:00 /bin/sh /etc/init.d/rc 6
222028 ? S 0:00 \_ startpar -p 4 -t 20 -T 3 -M stop -P 2 -R 6
222044 ? S 0:00 \_ /bin/bash /etc/init.d/rgmanager stop
225986 ? S 0:00 \_ sleep 1
Here rgmanager is in an unkillable state.
kill -9 on the running rgmanager process did not work.
So kill the "/etc/init.d/rgmanager stop" process instead.
In the output above, that would be:
kill -9 222044
Here, that allowed the machine to proceed with the reboot.
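
If you'd rather not pick the PID out of the listing by eye, the lookup
can be scripted. This is only a minimal sketch: the function name
find_rgmanager_stop_pid is made up for illustration, and it assumes the
init script shows up in "ps afx" with the literal command line
"/etc/init.d/rgmanager stop", as in the listing above.

```shell
# Hypothetical helper (not part of Proxmox or rgmanager).
# Reads `ps afx`-style output on stdin and prints the PID (first column)
# of every line whose command contains "/etc/init.d/rgmanager stop".
find_rgmanager_stop_pid() {
    awk '/\/etc\/init\.d\/rgmanager stop/ { print $1 }'
}
```

You would use it as "ps afx | find_rgmanager_stop_pid", check that the
PID it prints is the one you expect, and only then run kill -9 on it.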