if you have time to reproduce this, I would be interested in the device forwarding table info.
if not I think we'll be looking at the same area in the next few days,
although we're more in a test mode than a fix mode right now.

you can get the forwarding table using the different iwpriv commands
documented here:
http://wiki.laptop.org/go/Wireless_Driver_README

it's all mac based at this level, so we'd also need the mac addresses of
the devices involved.

best regards,
Bill

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Polychronis Ypodimatopoulos
Sent: Tuesday, June 10, 2008 3:20 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; OLPC Developer's List
Subject: Re: [OLPC Networking] TCP is broken in mesh mode

nice report.

Benjamin M. Schwartz wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Dear Networking experts,
>
> I have been fighting for several months with the fact that invitations
> often seem not to work, when running on a serverless mesh. The
> symptoms are quite strange. If an invitation works once between two
> laptops, it continues to work between them reliably. If it fails
> once, it continues to fail between them consistently. Sometimes, in
> the same place, invitations will work on one mesh channel and not on
> another. The same two XOs may be reliably successful in a particular
> high-noise environment, and consistently fail in an area of virtual
> radio silence, as well as the reverse.
>
> Even when invitations fail, other presence information continues to
> flow correctly. Even activity sharing continues to work beautifully.
>
> With some help from Daf, we managed to get a tcpdump trace from two
> XOs exhibiting this behavior at 1CC. The dumps are attached to ticket #6463.
> What we saw is bizarre, but also consistent with the behavior in the UI.
> The invitations are unicast, implemented using TCP. When machine A
> sends an invitation to B, we see the following exchange:
>
> 1. A broadcasts an ARP request for B
> 2. B sees the ARP request and replies to A
> 3. A receives the ARP reply from B and sends a TCP SYN to B
> 4. B does not see the SYN packet (it does not appear in B's dump)
> 5. A retries a total of three times, but none of the SYN packets are
>    seen by B.
> 3b. In parallel, A broadcasts a presence-info update with mDNS,
>     indicating that it has shared the activity.
> 4b. B receives this broadcast, updates its presence-info cache, and
>     even assigns B's XO icon a new location in the mesh view
>
> This behavior is fairly frightening. I have seen it occur in
> low-noise network environments with a total of 3 XOs, so I suspect a
> serious bug somewhere in the lowest levels of the network stack. Once
> this failure occurs, it is extremely reproducible. All subsequent
> invitations will continue to fail. I therefore suspect that the bug
> involves the driver or firmware reaching an invalid state and becoming
> stuck there.
>

You have to keep in mind that the driver/firmware may very well have
bugs, but:

1) the driver does not differentiate between different TCP/IP packets
(but may wrongly differentiate between unicast and broadcast/multicast).
Try establishing a separate TCP/IP connection when invitations
reproducibly don't work.

2) the firmware (in terms of a route existing or not) does not
differentiate between frames. Try pinging the other node when
invitations reproducibly don't work.

> Given the variety of critical services that run over TCP, including
> the much-emphasized Read activity, I hope that people familiar with
> the driver and firmware will take a look at this bug.
>
> - --Ben Schwartz
>
> P.S. All this info is present at ticket #6463. I am writing about it
> here in an attempt to increase awareness.
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.9 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iEYEARECAAYFAkhOz+oACgkQUJT6e6HFtqSVBQCeKPWmqeoKOzVv55JS/HTAgf1r
> bUYAoKCG+z1bBA+isc7Mun0VlQNGDars
> =4w83
> -----END PGP SIGNATURE-----
> _______________________________________________
> Networking mailing list
> [EMAIL PROTECTED]
> http://lists.laptop.org/listinfo/networking
>

--
Polychronis Ypodimatopoulos
Graduate student
Viral Communications
MIT Media Lab
Tel: +1 (617) 459-6058
http://www.mit.edu/~ypod/

_______________________________________________
Devel mailing list
[email protected]
http://lists.laptop.org/listinfo/devel
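
For anyone reproducing this: the check suggested in the reply above (open a separate TCP connection while invitations are failing) could be scripted roughly as below. This is only a sketch, not something from the ticket. `probe_tcp` is a hypothetical helper name, and the peer address and port in the usage comment are placeholders that have to be replaced with the other XO's address and a port it is actually listening on. The idea is that a "refused" result still means the SYN reached the peer, while "timeout" is what one would expect if SYNs are being lost as in the trace.

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Attempt one TCP connection and classify the outcome.

    Hypothetical diagnostic helper, not part of any OLPC tool.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"   # full handshake completed
    except ConnectionRefusedError:
        return "refused"         # peer answered with RST, so the SYN arrived
    except socket.timeout:
        return "timeout"         # no answer at all, consistent with lost SYNs
    except OSError as exc:
        return "error: %s" % exc  # e.g. no route to host


# Example use (address and port are placeholders):
# print(probe_tcp("169.254.1.2", 22))
```

Running it once while invitations work and once while they fail, alongside a ping of the same peer, should show whether the failure is specific to the invitation traffic or affects all unicast TCP between the two machines.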
