Further to my earlier report on this list of NAT adapters stopping working with svn and 3.1.0 beta versions, I believe I have identified the problem.
r23462 introduced the bug, by introducing a new thread to process "urgent" packets, the communication of which with the existing thread isn't properly synchronised. Basically, it's the classic bug of sending a signal to a thread (in this case, the existing, non-urgent thread) without using a critical section to ensure that the target thread is waiting for the signal when it's sent. So occasionally the non-urgent thread misses the signal, and thus (for all practical purposes) never wakes up, because it's waiting indefinitely. There *is* a lock taken inside RTSemEventSignal and RTSemEventWait, but that doesn't prevent this because it only covers the waiting and signalling itself, and not the real work. This (locking but making the critical section too small) is also a classic bug. I haven't tried to fix this because I'm not familiar with the code, but this is my best guess at what's going wrong. -- Robin 2009/11/26 Robin Green <[email protected]> > I think I've found something important. The responses stop appearing in the > pcap log file immediately after VirtualBox sends the first ICMP type 4 > packet (wireshark says this is "Source quench (flow control)"). The packet > is also malformed in the .pcap file, according to wireshark, which says it > has a wrong header checksum - not for the overall packet, but in a nested IP > header inside the ICMP packet. > > -- > Robin > > 2009/11/26 Robin Green <[email protected]> > > OK, this time I fortunately managed to reproduce the bug very quickly, >> *and* I got a host log file as well using wireshark. This time I used >> r24972. >> >> I had a quick look at both .pcap files in wireshark. If you look for >> example at my attempt to ping 137.44.1.27, you'll see that the ping is sent >> and the reply is received on the host, but the reply does *not* appear in >> VirtualBox's pcap log file. I was trying to ping this IP address (one of our >> DNS servers) because I could not access twitter.com in firefox in the >> guest. >> >> Thanks, >> -- >> Robin >> >> 2009/11/21 Vasily Levchenko <[email protected]> >> >>> >>> On Nov 20, 2009, at 7:43 PM, Robin Green wrote: >>> >>> I've attached the log file. What do you mean by host network file? >>> >>> >>> It's actually output of wireshark or any other traffic dumping tool >>> working on host. Rather useful to check how nat send/modify received from >>> the guest packets to remote host. >>> >>> 2009/11/19 Vasily Levchenko <[email protected]> >>> >>>> Hello Robin, >>>> Thanks for feedback. >>>> Could you please attach pcap file NAT<->guest (please see >>>> http://www.virtualbox.org/wiki/Network_tips) and host network and log >>>> file? >>>> On Nov 19, 2009, at 7:23 PM, Robin Green wrote: >>>> >>>> Bisection reveals that this bug was introduced at some point between >>>> r23223 and r23530 inclusive (I can't be more specific because all of those >>>> revisions except r23530 either failed to build or failed to boot). >>>> -- >>>> Robin >>>> >>>> 2009/11/18 Robin Green <[email protected]> >>>> >>>>> I tried 3.1 beta 1 and the NAT interfaces on both my VMs very quickly >>>>> stopped working. So I tried building from SVN, and with this (r24735) I >>>>> can't connect to, or even ping, anything other than the host. Guest is >>>>> Fedora rawhide (F13), but with Fedora 12 kernel (due to an uninteresting >>>>> bug >>>>> in the rawhide kernel). >>>>> >>>>> -- >>>>> Robin >>>>> >>>> >>>> _______________________________________________ >>>> vbox-dev mailing list >>>> [email protected] >>>> http://vbox.innotek.de/mailman/listinfo/vbox-dev >>>> >>>> >>>> >>> <VBox.log> >>> >>> >>> >> >
_______________________________________________ vbox-dev mailing list [email protected] http://vbox.innotek.de/mailman/listinfo/vbox-dev
