On 05.06.2012 20:16, Bernhard Froehlich wrote:
On 05.06.2012 19:05, Steve Tuts wrote:
On Mon, Jun 4, 2012 at 4:11 PM, Rusty Nejdl <[email protected]> wrote:

On 2012-06-02 12:16, Steve Tuts wrote:

Hi, we have a Dell PowerEdge server with a dozen interfaces. It hosts a few guests running web app and email servers under VirtualBox-4.0.14. The host and all guests run FreeBSD 9.0 64-bit. Each guest is bridged to a distinct interface. The host and all guests are on the 10.0.0.0 network, NAT'ed to a Cisco router.

This ran well for a couple of months, until we added a new guest recently. Now, every few hours, none of the guests can be reached. We can only connect to the host from outside the router. We can also get to the console of the guests (except the new guest), but from there we can no longer ping the gateway 10.0.0.1. The new guest just freezes.

Furthermore, on the host we can see a vboxheadless process for each guest, including the new guest, but we cannot kill the new guest's process, not even with "kill -9". We looked around the web and someone suggested we should use "kill -SIGCONT" first, since the "ps" output shows the "T" flag for that vboxheadless process, but that doesn't help. We also tried all the VBoxManage commands to poweroff/reset etc. that new guest, but they all failed, complaining that the VM is in the Aborted state. We also tried the VBoxManage commands to disconnect the network cable for that new guest; they didn't complain, but there was no effect.
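For readers unfamiliar with the "T" flag mentioned above: the first letter of the STAT column in "ps" output gives the process state. Below is a minimal Python sketch for decoding it, assuming the state letters documented in the FreeBSD ps(1) manual page. The helper name describe_ps_state is made up for illustration and is not part of any tool discussed in this thread.

```python
# Hypothetical helper: decode the first letter of a ps(1) STAT field.
# The letters follow the FreeBSD ps(1) manual page; other systems differ slightly.
PS_STATES = {
    "D": "uninterruptible wait (disk or other short-term I/O)",
    "I": "idle (sleeping longer than about 20 seconds)",
    "L": "waiting to acquire a lock",
    "R": "runnable",
    "S": "sleeping (less than about 20 seconds)",
    "T": "stopped",
    "W": "idle interrupt thread",
    "Z": "zombie (dead, not yet reaped by its parent)",
}


def describe_ps_state(stat: str) -> str:
    """Describe a STAT value such as 'T' or 'Ss+' by its first letter."""
    if not stat:
        raise ValueError("empty STAT field")
    # Trailing characters (s, +, <, ...) are modifiers and are ignored here.
    return PS_STATES.get(stat[0], "unknown state")
```

A stopped process normally resumes on SIGCONT. In general, a process that is blocked inside the kernel cannot act on SIGKILL until it returns to a point where signals are delivered, which is one common reason "kill -9" appears to do nothing. Whether that is what happened here is not established by this thread.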

A couple of times, on the host we disabled the interface bridged to that new guest, and then the vboxheadless process for that new guest disappeared (we had attempted to kill it before that). Immediately, all the other VMs regained their connections and went back to normal.

But one time even the above didn't help: the vboxheadless process for that new guest stubbornly remained, and we had to reboot the host.

This is already a production server, so we can't upgrade virtualbox to the
latest version until we obtain a test server.

Would you advise:

1. Is there any other way to kill that new guest instead of rebooting?
2. What might cause the problem?
3. What settings and tests can I use to analyze this problem?


I haven't seen any comments on this and don't want you to think you are being ignored. I haven't seen this problem myself, but the 4.0 branch was buggier for me than the 4.1 releases, so upgrading is probably what you are looking at.

Rusty Nejdl


Sorry, I just realized my reply yesterday didn't go to the list, so I am re-sending it with some updates.

Yes, we upgraded all ports and fortunately everything recovered; in particular, all VMs have run peacefully for two days now. So upgrading to the latest VirtualBox, 4.1.16, solved that problem.

But now we have a new problem with this new version of VirtualBox: whenever we try to VNC to any VM, that VM goes to the Aborted state immediately. Actually, merely running telnet from within the host to the VNC port of that VM will immediately abort it. This prevents us from adding new VMs. Also, when starting a VM with a VNC port, we get this message:

rfbListenOnTCP6Port: error in bind IPv6 socket: Address already in use

We found that someone else has provided a patch for this at
http://permalink.gmane.org/gmane.os.freebsd.devel.emulation/10237

So it looks like running multiple VMs on an IPv6 system (we have 64-bit FreeBSD 9.0) triggers this problem.
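The telnet check described above (opening a TCP connection to the VM's VNC port and sending nothing) can be scripted so the test is repeatable. This is only a sketch: the function name probe_tcp_port is made up for illustration, and the host and port you pass in depend on how the VM was started.

```python
import socket


def probe_tcp_port(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened.

    The connection is closed right away without sending any data,
    which is roughly what connecting with telnet and quitting does.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling it with the loopback address and the VM's VNC port, then checking the VM state with VBoxManage, would show whether a bare connect is enough to trigger the abort.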

Glad to hear that 4.1.16 helps with the networking problem. The VNC problem is also a known one, but the mentioned patch does not work for at least a few people. It seems the bug is somewhere in libvncserver, so downgrading net/libvncserver to an earlier version (and rebuilding VirtualBox) should help until we come up with a proper fix.

You are right about the "Address already in use" problem and the patch for it, so I will commit the fix in a few moments.
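As general background on the "Address already in use" message: it is the EADDRINUSE error that bind() reports when another socket already holds the same address and port. The Python sketch below reproduces that error in isolation. It uses the IPv4 loopback address so that it runs on hosts without IPv6, whereas the message in this thread came from the IPv6 listener. It does not show what the referenced patch changes, and the function name second_bind_errno is made up for illustration.

```python
import errno
import socket


def second_bind_errno() -> int:
    """Bind two TCP sockets to the same loopback port.

    Returns the errno raised by the second bind, or 0 if it succeeded.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 lets the kernel pick a free port for the first listener.
        first.bind(("127.0.0.1", 0))
        first.listen(1)
        port = first.getsockname()[1]
        try:
            # Same address and port while the first socket is still open.
            second.bind(("127.0.0.1", port))
        except OSError as exc:
            return exc.errno
        return 0
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    code = second_bind_errno()
    print(errno.errorcode.get(code, "no error"))
```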

I have also tried to reproduce the VNC crash but I couldn't, probably because my system is IPv6 enabled. flo@ has seen the same crash and has no IPv6 in his kernel, which led him to find this commit in libvncserver:


commit 66282f58000c8863e104666c30cb67b1d5cbdee3
Author: Kyle J. McKay <[email protected]>
Date:   Fri May 18 00:30:11 2012 -0700

    libvncserver/sockets.c: do not segfault when listenSock/listen6Sock == -1

http://libvncserver.git.sourceforge.net/git/gitweb.cgi?p=libvncserver/libvncserver;a=commit;h=66282f5


It looks promising, so please test this patch if you can reproduce the crash.
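The commit subject suggests the crash comes from using a listening descriptor that was never opened (value -1), for example the IPv6 one on a kernel without IPv6. The commit's diff is not reproduced here. The following is only a Python analogy of that kind of guard, with made-up names (INVALID_SOCK, usable_listen_fds, wait_readable); the real fix is in C in libvncserver/sockets.c.

```python
import select
from typing import List

# Assumed sentinel, mirroring the "== -1" in the commit subject.
INVALID_SOCK = -1


def usable_listen_fds(listen_sock: int, listen6_sock: int) -> List[int]:
    """Return only the descriptors that are actually open."""
    return [fd for fd in (listen_sock, listen6_sock) if fd != INVALID_SOCK]


def wait_readable(listen_sock: int, listen6_sock: int, timeout: float) -> List[int]:
    """Wait for activity, skipping any listener that is set to -1."""
    fds = usable_listen_fds(listen_sock, listen6_sock)
    if not fds:
        # Nothing to wait on; avoid handing an invalid descriptor to select().
        return []
    readable, _, _ = select.select(fds, [], [], timeout)
    return readable
```

In Python, passing -1 to select.select() raises an exception rather than crashing. In C, using -1 with the fd_set macros is undefined behaviour, which would be consistent with the segfault named in the commit subject.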

--
Bernhard Froehlich
http://www.bluelife.at/
_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-emulation
To unsubscribe, send any mail to "[email protected]"
