What is your TCP_SND_QUEUELEN? If you tie up all of your pbufs to send queued packets, you won't have any pbufs left to support receiving packets.
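For reference, the lwipopts.h fragment below collects the options that bear on that question. It is only a sketch under stated assumptions: the TCP_MSS and TCP_SND_BUF values are hypothetical, sized to the 128-byte segments mentioned in the thread, and are not taken from the original poster's configuration. The option names themselves are standard lwIP options.

```c
/* lwipopts.h fragment: a sketch with assumed values, not tested settings. */

/* Assumed: segment size matching the 128-byte payloads in the capture. */
#define TCP_MSS             128

/* Assumed: room for two full segments of unsent/unacked data. */
#define TCP_SND_BUF         (2 * TCP_MSS)

/* Upper bound on the pbufs one connection may hold in its send queues.
   opt.h documents a minimum of 2 * TCP_SND_BUF / TCP_MSS. */
#define TCP_SND_QUEUELEN    (2 * TCP_SND_BUF / TCP_MSS)   /* = 4 */

/* Each queued segment also needs a tcp_seg descriptor. */
#define MEMP_NUM_TCP_SEG    TCP_SND_QUEUELEN

/* Suggested earlier in the thread: do not hold out-of-sequence segments. */
#define TCP_QUEUE_OOSEQ     0

/* Suggested earlier in the thread: expose the allocation error counters. */
#define LWIP_STATS          1
#define LWIP_STATS_DISPLAY  1
```

Whether queued send pbufs actually compete with the receive pool depends on how the application and driver allocate them (PBUF_RAM from the heap versus PBUF_POOL), so the "err" counters printed by stats_display() remain the thing to check after any change.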
Bill

> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On
> Behalf Of john bougs
> Sent: Tuesday, January 27, 2009 5:06 PM
> To: Mailing list for lwIP users
> Subject: Re: [lwip-users] lwip does not ack retransmissions
>
> --- On Mon, 1/26/09, Jonathan Larmour <[email protected]> wrote:
>
> > From: Jonathan Larmour <[email protected]>
> > Subject: Re: [lwip-users] lwip does not ack retransmissions
> > To: "Mailing list for lwIP users" <[email protected]>
> > Date: Monday, January 26, 2009, 5:06 PM
> >
> > john bougs wrote:
> > > I have been having problems with lwip receiving retransmitted
> > > packets.
> > >
> > > The attached wireshark capture is typical of the problem.
> > >
> > > .107 is lwip, .102 is a telnet session on Windows where I have
> > > cut and pasted a file. 40K file. A large portion of the file is
> > > transferred successfully before the capture.
> > >
> > > @ packet 37 relative sequence 192 packet is transmitted with len = 128
> > > @ packet 45 we start getting duplicate acks = 192
> > > @ packet 47 sequence 192 is resent ... this time only one byte
> > > @ packet 52 we get ack = 193
> > > @ packet 53 we start getting duplicate ack = 193
> > > @ packet 56 seq 193 is transmitted len = 1
> > > @ packet 62 ack=194
> > > @ packet 63 we start getting duplicate ack = 194
> > >
> > > Up to this point things have been pretty consistent. After this
> > > point specifics may change slightly from run to run, but in all
> > > cases all we get are duplicate acks.
> > >
> > > I am using lwip 1.3 on NXP LPC2368 with the following settings:
> > >
> > > #define LWIP_DHCP 0
> > > #define MEM_ALIGNMENT 4
> > > #define NO_SYS 1
> > > #define LWIP_SOCKET 0
> > > #define LWIP_NETCONN 0
> > > #define TCP_WND 512
> > > #define PBUF_POOL_BUFSIZE (128+56)
> > > #define PBUF_POOL_SIZE 8
> > > #define IP_REASS_MAX_PBUFS 6
> > >
> > > If I return PBUF_POOL_BUFSIZE, PBUF_POOL_SIZE and
> > > IP_REASS_MAX_PBUFS to the defaults, things recover after a few
> > > seconds rather than going completely dead.
> > >
> > > Does anyone have any suggestion on what the problem may be?
> >
> > I doubt IP_REASS_MAX_PBUFS is relevant - there's no sign of
> > fragmentation.
> >
> > The trigger for the problem does seem to be a 128 byte TCP segment.
> > This is contained in a packet of, overall, 182 bytes (including
> > frame and IP header) so it should fit in one pbuf (184 bytes). As
> > you've given your window as 512 bytes that's fine for the remote
> > end to do, and given your pbufs, it seems to me this should by
> > rights be ok.
> >
> > Given the delay between the Push at 18:22:18.367728 and the ack at
> > 18:22:18.408937 it seems like there are two possibilities:
> > 1) your target might have gone off and done some processing at that
> > point. Which should be ok, but caused more data to queue up in the
> > meantime.
> > 2) Possibly in addition, your application is still holding on to
> > the pbufs - perhaps it hasn't finished processing earlier data and
> > is waiting for more before it can. While it holds those pbufs, it
> > reduces the available pool for more data.
> > 3) Or you ran out of memory to send the ACK. For example, if your
> > app has been allocating memory (PBUF_RAMs) to generate data for
> > replies. For this you'd need to check the value of MEM_SIZE. But
> > given what you say about an improvement with more/larger pbufs,
> > maybe that isn't so relevant.
> >
> > My guess would be a combination of 1 and 2, although you might have
> > to look at your application to get an insight. Another option you
> > should look at is TCP_QUEUE_OOSEQ. You should probably disable this
> > to reject out of sequence TCP packets. Leaving it enabled can
> > result in a bunch of packets queued up, and you haven't got many to
> > do that with, so if you run out once, it's likely to exacerbate the
> > problem as it holds some in a queue for a while, and there's no
> > chance of more arriving as that may be all your pbufs!
> >
> > I think the most important thing for you to try is enabling
> > LWIP_STATS and possibly LWIP_STATS_DISPLAY and look at the error
> > fields for allocations, to find out if (as is very likely), you're
> > running out of something, and if so what. From there, it may be
> > easier to see why it's running out and where it's gone.
> >
> > With LWIP_STATS_DISPLAY on you can call this:
> > #include "lwip/stats.h"
> > void stats_display(void);
> >
>
> Thanks for the help.
>
> 1) Yes I would guess that my application is occasionally going off and
> erasing a flash sector (not sure how long but < 400ms for whole
> device). So that causes some queueing.
>
> 2) I checked and added a bunch of code to monitor my releasing of
> pbufs, and I am doing that correctly. I am not holding any pbufs, but
> something else grabs them all when everything goes haywire.
>
> 3) Yes I added the LWIP_STATS_DISPLAY and all the pbufs are being used,
> so this looks like it is the cause of the problem.
>
> LINK
> xmit: 130
> rexmit: 0
> recv: 125
> fw: 0
> drop: 9
> .
> .
> MEM PBUF_POOL
> avail: 8
> used: 8
> max: 8
> err: 9
>
> 4) I disabled TCP_QUEUE_OOSEQ and that seems to resolve the problem.
> (or does it just hide it?) Shouldn't the TCP code know that it's out
> of pbufs and free some of them? Or is that something that is missing
> to keep the product light weight?

_______________________________________________
lwip-users mailing list
[email protected]
http://lists.nongnu.org/mailman/listinfo/lwip-users
