I have finally discovered the cause of all my dropped packet and duplicate sequence issues. It's really very simple, and I feel a little dumb: use a switch instead of a hub.
I normally use switches, but was using a hub so I could monitor communications from my computer with Wireshark. Despite practically no other traffic, it still caused collisions, I guess. To make matters worse, Wireshark wasn't displaying bad packets; this wasn't a big surprise to me, but I underestimated how much of an impact this had on my troubleshooting. This really caused some confusion, and I was chasing after the wrong things.

After a little research here and some general Internet searches, it appears that full duplex and hubs don't mix. I paid no attention to the duplex of the ethernet controller, but my guess is it's in full duplex right now, and if I switch modes to half duplex, a hub can be used. I'm not sure how a computer does this; apparently it's automatic, seeing as how it doesn't seem to care if it's connected to a hub or switch.

Anyway, my audio streaming works great now! It's amazing...now I'm finally able to move on to other areas of need and leave behind my ethernet problems.

--- On Fri, 8/21/09, Kieran Mansley <[email protected]> wrote:

From: Kieran Mansley <[email protected]>
Subject: Re: [lwip-users] Delayed ACK behavior
To: "Mailing list for lwIP users" <[email protected]>
Date: Friday, August 21, 2009, 10:59 AM

On Fri, 2009-08-21 at 05:15 -0700, JM wrote:
> SYN-SENT: ackno 6558 pcb->snd_nxt 6558 unacked 6557
> tcp_receive: ACK for 6662, unacked->seqno 6558:6662
> tcp_receive: removing 6558:6662 from pcb->unacked

All good so far.

> tcp_input: packet discarded due to failing checksum 0xe4db

This must be the root of your (wider) problems: this packet (frame 6 or 7 I think) in the packet capture has got a bad checksum, so has most likely been corrupted by the driver. I would look into this in much more detail and work out why it has got the wrong checksum. If you can set a breakpoint here do so and examine the packet buffer to compare to the packet capture and see how they differ.
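For anyone following the checksum advice here, this is a rough Python sketch of the RFC 1071 Internet checksum, which can be run on a PC over bytes dumped from the packet buffer at the breakpoint. The names inet_checksum and checksum_ok are made up for illustration, and the header bytes are a generic IPv4 example, not data from this thread. Note that the TCP checksum also covers a pseudo-header (addresses, protocol, TCP length), so for a TCP segment those bytes would need to be prepended before calling it; that step is not shown.

```python
def inet_checksum(data: bytes) -> int:
    """RFC 1071 checksum: one's complement of the one's-complement sum
    of the data taken as 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad an odd trailing byte with zero
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def checksum_ok(data: bytes) -> bool:
    """True if data, with its checksum field already filled in, verifies."""
    return inet_checksum(data) == 0


# Illustrative 20-byte IPv4 header with a correct checksum field (0xb861).
good = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
# Same header with the last byte altered, standing in for driver corruption.
bad = bytes.fromhex("45000073000040004011b861c0a80001c0a800c8")

print(checksum_ok(good))
print(checksum_ok(bad))
```

If the bytes taken from the packet buffer fail this check while the same frame taken from the capture passes, the corruption most likely happened between the wire and the stack.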
> tcp_receive: duplicate seqno 2971771236

This is where it gets weird. This is most likely referring to frame 8 in the capture, the retransmission of frame 7 after it was dropped. We sent no ACK for the bad packet and this causes the other end to send the retransmission in 8. However, the stack is now apparently reporting that this is a duplicate, which it can't be, because we dropped the first one.

> tcp_input: packet discarded due to failing checksum 0x3b18
> tcp_receive: duplicate seqno 2971772867

same again.

Here's a hypothesis as to how this could be: Suppose that your driver sometimes got the payloads of the packets mixed up and attached them to the wrong packet headers. To do this it must at some point treat headers and payloads separately, e.g. with DMAs, or putting them in separate buffers, or something like that, and then associate them back together incorrectly. If it sometimes got the header of the Nth received frame with the payload of the (N+1)th received frame, it could plausibly explain the behaviour seen. When receiving the Nth frame (frame 6 in this case) it would have the wrong payload (from frame 7), and so fail the checksum and produce that message. Let us assume that frame 7 was then received properly. Then when frame 8 came in it would look like a duplicate.

The only evidence to the contrary is that I'd expect the lwIP stack to send an ACK between frames 7 and 8 if this happened, but it doesn't. Perhaps there is another problem that produces that behaviour, or perhaps my hypothesis is wrong - it doesn't fit with the second retransmission so well where we ACK frames 10 and 11 as good. Even if it is wrong, I'll bet the problem is something like that. One pattern (although not very reliable on such a small sample) is that there are 3 received packets between each retransmission.

Another possibility would be if your driver is just duplicating a packet and passing it to the stack, but that wouldn't explain the bad checksums.
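One way to test the mixed-up payload hypothesis is to compare the bytes the stack was handed against the bytes of the same frame in the capture and see where they first diverge. The Python sketch below is only an illustration: first_mismatch, HEADER_LEN and the frame contents are invented, and the 4-byte "header" is a stand-in for the real Ethernet/IP/TCP headers. If the first difference falls at or after the header/payload boundary and the rest matches the following frame, that would support the hypothesis.

```python
from typing import Optional


def first_mismatch(stack_buf: bytes, wire_buf: bytes) -> Optional[int]:
    """Offset of the first differing byte, or None if the buffers match."""
    for i, (a, b) in enumerate(zip(stack_buf, wire_buf)):
        if a != b:
            return i
    if len(stack_buf) != len(wire_buf):
        # One buffer is a strict prefix of the other.
        return min(len(stack_buf), len(wire_buf))
    return None


HEADER_LEN = 4  # hypothetical header size, for illustration only

# Two made-up frames as they appeared on the wire.
frame_n = b"HDR6" + b"AAAAAAAA"
frame_n1 = b"HDR7" + b"BBBBBBBB"

# Hypothesised driver bug: header of frame N attached to payload of frame N+1.
mixed = frame_n[:HEADER_LEN] + frame_n1[HEADER_LEN:]

offset = first_mismatch(mixed, frame_n)
print(offset)
# The payload of the mixed buffer matches the next frame's payload.
print(mixed[HEADER_LEN:] == frame_n1[HEADER_LEN:])
```

With real traffic the payloads of consecutive frames may share leading bytes, so the first mismatch can land some way past the header boundary rather than exactly on it.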
I think the key to your problem is working out where that first failed checksum comes from, and why. I'm guessing that if you look at the packet given to the stack, it will have (some) data from another frame instead of the correct payload.

Kieran

_______________________________________________
lwip-users mailing list
[email protected]
http://lists.nongnu.org/mailman/listinfo/lwip-users
