Found the problem. Thanks to you and Kieran for the replies. It's kind of a strange effect. So just in case anyone else runs into a similar situation in the future, a quick description of what was happening.

My driver is written so that when a new packet arrives or a transmitted packet finishes transmitting, the NIC generates an interrupt. The interrupt handler merely sets a flag, masks off further interrupts and returns (the primary purpose of the interrupt is simply to wake up the CPU from a 'wait' instruction). The application main loop, having been awakened, immediately calls the driver's poll routine. The poll routine cleans the Tx queue (i.e. frees up pbufs that are finished transmitting). Then it consumes each received packet directly from the Rx queue. Finally, it allocates new empty buffers for each consumed Rx descriptor.

As Rx packets (ACKs) are being processed, they're passed up the stack and new data are generated (via the sent callback). As the last step of processing the ACK, two new data packets are created by tcp and passed to the driver's linkoutput function (two because the remote ftp client only generates an ACK on every other data packet). Those new data packets in turn will cause a new ACK packet to come back. Because of a very low latency to the remote client (and because lwIP is relatively slow in processing the packets [*]), the new ACK packet can be added to the Rx queue while the driver is *still in the original Rx queue processing loop*. Because of the way that loop figures out when to quit, it can also process the newly arrived ACK packets in the same call to the poll function until packets are no longer arriving or until all 256 available receive descriptors are used up.

So I think the sequence is something like this:

1. Enqueue up to 8 packets for transmit (limited by SND_BUF).
2. First few packets finish transmitting and first one or two ack packets are received, causing an interrupt.
3. Small number of Tx queue entries are cleaned (no more than the original 8).
4. Process ack packet from Rx queue, passing it to lwIP; the ack results in two new outgoing packets in Tx queue (meanwhile new ack packets continue to arrive and are added to Rx queue).

Repeat (4) until about half of the Rx queue entries have been processed, whereupon the Tx queue becomes full; after this, no new ACK packets will be triggered on the remote side, so the Rx processing loop soon finishes. But the last several outgoing data packets will have been dropped on the floor due to the Tx queue full condition.

Several fixes are possible. One easy one -- cleaning the Tx queue anew at the beginning of the driver's linkoutput function -- seems to work, but I think a more complete solution will be needed.

(Anyway, now the entire 10MB file is transmitted in less than a third of a second instead of ~40 seconds. Woot!)

Thanks again,
Jeff

[* BTW this is not intended to be a criticism of lwIP; far from it: lwIP rocks! One reason lwIP is slow in processing received packets here is because I never bothered to figure out how to enable the NIC's hardware checksum verification. Performance is not really very important for my application. Robustness is far more important.]

On Thu, Oct 22, 2009 at 11:29 AM, [email protected] <[email protected]> wrote:
> Jeff Barber wrote:
>>
>> Now 256 happens to correlate to the size of the Tx and Rx ring buffers
>> in my driver so that was the obvious place to look. I notice that
>> sometimes my driver's linkoutput function is being called when its Tx
>> queue is full. I'm thinking that's the proximate cause of the
>> problem. However, shouldn't the TCP_SND_QUEUELEN (32) and TCP_SND_BUF
>> (8 * MSS) values be a limit to the maximum number of "outstanding"
>> pbufs? I did turn on TCP_QLEN_DEBUG and according to that info, I
>> never get a queue len > 8. So why do I end up with so many packets
>> "in flight"?
>>
>
> That's a good question. The queue len doesn't grow because SND_BUF limits it
> (assuming you transmitted 8 full mss-sized segments).
>
> When your driver runs out of buffers, did you check these are only packets
> from one tcp connection? Maybe there are other packets transmitted by your
> board?
>>
>> And if I'm misunderstanding, what is the intended feedback mechanism
>> from the driver? tcp_output always seems to ignore the return value
>> of ip_output. Hence, if I understand it correctly, that means an
>> attempt to send while the Tx queue is full is treated exactly the same
>> as if the packet was dropped on the wire: i.e. it will rely on the
>> retransmission process to recover.
>>
>
> Unfortunately, this really is the case currently. A solution for this is to
> just delay the lwip sending task until there's room in your MAC's Tx queue.
> Unfortunately, this also suppresses processing Rx packets while waiting, but
> at least other tasks can run. The problem is at one point you *have* to slow
> down the CPU if it tries to send faster than the wire supports. Any input on
> how this can best be solved in the lwIP code is of course welcome!
>
> Simon
>
>
> _______________________________________________
> lwip-users mailing list
> [email protected]
> http://lists.nongnu.org/mailman/listinfo/lwip-users
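
For anyone finding this thread later: the quick fix mentioned above (cleaning the Tx queue anew at the beginning of the driver's linkoutput function) might look roughly like the sketch below. This is a simplified model and not the actual driver code. The `tx_ring` and `tx_desc` structures, the `tx_clean` and `nic_complete` helpers, and the plain `int` return value are all hypothetical names made up for illustration; only the ring size of 256 comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TX_RING_SIZE 256  /* ring size mentioned in the post */

/* Hypothetical, simplified Tx descriptor; a real NIC layout will differ. */
struct tx_desc {
    void *buf;   /* packet buffer held until the transmit completes */
    bool  done;  /* set by the (simulated) NIC when the packet has gone out */
};

struct tx_ring {
    struct tx_desc desc[TX_RING_SIZE];
    size_t head;  /* next slot to fill */
    size_t tail;  /* oldest slot not yet reclaimed */
    size_t used;  /* number of occupied slots */
};

/* Reclaim every descriptor the NIC has finished with, oldest first.
 * A real driver would free the associated pbuf here.
 * Returns the number of descriptors reclaimed. */
static size_t tx_clean(struct tx_ring *r)
{
    size_t freed = 0;
    while (r->used > 0 && r->desc[r->tail].done) {
        r->desc[r->tail].buf = NULL;
        r->desc[r->tail].done = false;
        r->tail = (r->tail + 1) % TX_RING_SIZE;
        r->used--;
        freed++;
    }
    return freed;
}

/* Simplified stand-in for the driver's linkoutput function.
 * Returns 0 on success, -1 if the ring is still full after cleaning. */
static int linkoutput(struct tx_ring *r, void *pkt)
{
    assert(r != NULL && pkt != NULL);

    tx_clean(r);  /* the quick fix: reclaim finished slots before checking */

    if (r->used == TX_RING_SIZE)
        return -1;  /* genuinely full: the caller has to cope with the drop */

    r->desc[r->head].buf = pkt;
    r->desc[r->head].done = false;
    r->head = (r->head + 1) % TX_RING_SIZE;
    r->used++;
    return 0;
}

/* Simulation helper: pretend the NIC finished sending the n oldest packets. */
static void nic_complete(struct tx_ring *r, size_t n)
{
    size_t idx = r->tail;
    for (size_t i = 0; i < n && i < r->used; i++) {
        r->desc[idx].done = true;
        idx = (idx + 1) % TX_RING_SIZE;
    }
}
```

Without the `tx_clean` call inside `linkoutput`, slots that the NIC has already finished with would stay occupied until the next poll, which is the situation that led to the dropped packets. As noted in the thread, this only narrows the window; when the ring really is full the packet is still lost and TCP has to recover by retransmission.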
