> On Dec 30, 2016, at 10:43 AM, Daniel Pauli <[email protected]> wrote:
>
> I'm a little confused about the use of select in your application. Are you
> using it with blocking sockets?
>
> I tested with both blocking and non-blocking send. I observed that
> non-blocking send (MSG_DONTWAIT flag set) on sockets determined as
> write-ready by select() sometimes returned ENOMEM when "stale sockets" are
> around. After applying the patch from
> http://lwip.100.n7.nabble.com/bug-49684-lwip-netconn-do-writemore-non-blocking-ERR-MEM-treated-as-failure-td27860.html,
> I got EWOULDBLOCK errors instead.
>
Thanks for including this information. The ENOMEM gives me a good clue as to what’s most likely going on. My guess is that you’re experiencing memory pool exhaustion: the stale socket has claimed memory from a pool for the segments that are queued for transmit. Since those segments are not being ACKed in the half-open state, the claimed memory won’t be available until the segments are freed (which happens on transmission timeout or when the socket is aborted).

Further, since it sounds like you initially had sockets configured in blocking mode, when the new socket tries to transmit, it will block trying to allocate TCP segments due to the exhausted memory pool. The blocking will continue until SO_SNDTIMEO is reached or the memory exhaustion is resolved.

If you have LwIP stats enabled, you can check the memory pools for errors to figure out which one is failing.

You should be able to resolve this by sizing your memory pools to handle the number of supported connections. For example, if you only support 5 simultaneous TCP connections, then your pools should be big enough to allocate 5 send buffers’ worth of segments. This is how I configure my products, which typically have plenty of RAM. I’m not sure what the recommendation is for very RAM-constrained products.

> Calling close() will initiate a graceful synchronized closure of the
> connection. This means continuing to send any queued data until it is ACKed,
> the send times out, or we received a RST. Then a FIN is sent indicating the
> sending pathway is closed.
>
> So there's no direct way for the application to tell LWIP to just give up on
> one socket without further trying to send data? Can the application specify a
> send timeout?

Yes there is: with SO_LINGER you can perform an abortive closure rather than a graceful one by setting the timeout to 0. Typically this is a bad idea.
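For reference, the abortive close could look roughly like the sketch below. This is only an illustration: abortive_close() is a made-up helper name, and the includes are the plain POSIX ones so it builds on a host. On LwIP you would pull in lwip/sockets.h instead, and I’m assuming SO_LINGER support is compiled into your build (I believe that is the LWIP_SO_LINGER option, but check your opt.h).

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not an LwIP or POSIX function): close a socket
 * abortively instead of gracefully. Returns 0 on success, -1 on error
 * with errno set by the failing call. */
int abortive_close(int fd)
{
    struct linger lg;

    lg.l_onoff = 1;   /* enable SO_LINGER processing on close() */
    lg.l_linger = 0;  /* zero timeout: drop queued data, do not wait */

    if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg)) != 0) {
        return -1;    /* option not applied; the socket is left open */
    }
    return close(fd);
}
```

On a connected TCP socket this should make close() reset the connection and discard whatever is still queued, rather than retransmitting it until timeout. The peer sees a RST instead of a FIN, which is one reason it’s usually a bad idea.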
There’s a decent discussion here on stackoverflow: http://stackoverflow.com/questions/3757289/tcp-option-so-linger-zero-when-its-required

> Lastly, what version of LwIP are you using?
>
> I'm using 2.0.0 RC1

Joel

> On Wed, Dec 28, 2016 at 4:23 PM, Joel Cunningham <[email protected]> wrote:
>
> On Dec 28, 2016, at 06:45 AM, Daniel Pauli <[email protected]> wrote:
>
>> Am I understanding the description correctly that sending on the stale
>> connection eventually blocks once the remote side has crashed and this
>> prevents sending on the new socket (only because the thread is blocked)?
>>
>> If so, then the socket buffer on the stale socket has filled up (most
>> likely) and is now blocking. This is blocking I/O operating as expected
>> when data is not being acknowledged. You should use non-blocking sockets
>> and select if your server is servicing multiple sockets on a single thread.
>>
>> Joel
>>
>> Attempting to send on the stale socket blocks, which is okay on its own. But
>> I'm already using select() and observed that
>>
>> these stale sockets still somehow seem to block communication over new
>> sockets,
>
> If this is actually happening as described, that would be unexpected/faulty
> behavior. One TCP socket in the half-open state should not have any effect
> on the other TCP connections.
>
>> even when no stale sockets are included in the write set of select().
>
> I'm a little confused about the use of select in your application. Are you
> using it with blocking sockets? Select returning write-ability doesn't
> guarantee the send call won't block.
> If you have a blocking socket and the
> size in the send call can't fit in the amount of available buffer space, the
> call will block
>
>> I even close() (successfully, according to the return value) those stale
>> sockets after they failed to be write-ready after 10 seconds, but I can see
>> in Wireshark that LWIP still sends retransmissions from the port number of
>> the closed socket.
>>
>> Could it be that close() cannot send FIN because the output buffer is full,
>> so the socket still remains active? Is there a way from the API to just drop
>> the connection without involving any more communication?
>
> Calling close() will initiate a graceful synchronized closure of the
> connection. This means continuing to send any queued data until it is ACKed,
> the send times out, or we received a RST. Then a FIN is sent indicating the
> sending pathway is closed.
>
> Lastly, what version of LwIP are you using?
>
> Joel
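On the quoted point that select() reporting write-ability doesn’t guarantee send() won’t block: one way to keep a single-threaded server from stalling on a stale socket is to send with MSG_DONTWAIT and cap how long you are willing to make no progress. The following is a rough sketch under my own assumptions, not a tested LwIP recipe. send_bounded(), timeout_ms and max_stalls are hypothetical names, the includes are the host POSIX ones (use lwip/sockets.h on target), and treating ENOMEM as "try again later" is a guess based on the behavior described in this thread.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: send up to len bytes without blocking inside send().
 * Each wait in select() lasts at most timeout_ms. Every iteration that makes
 * no progress (select timeout or a transient send error) counts as a stall;
 * after max_stalls stalls the function gives up.
 * Returns the number of bytes sent (possibly less than len), or -1 on a
 * hard error. */
ssize_t send_bounded(int fd, const char *buf, size_t len,
                     int timeout_ms, int max_stalls)
{
    size_t sent = 0;
    int stalls = 0;

    while (sent < len && stalls < max_stalls) {
        fd_set wfds;
        struct timeval tv;
        int ready;
        ssize_t n;

        FD_ZERO(&wfds);
        FD_SET(fd, &wfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        ready = select(fd + 1, NULL, &wfds, NULL, &tv);
        if (ready < 0) {
            if (errno == EINTR) {
                stalls++;
                continue;
            }
            return -1;
        }
        if (ready == 0) {
            stalls++;       /* not writable within the timeout */
            continue;
        }

        /* Writable according to select(); still never block here. */
        n = send(fd, buf + sent, len - sent, MSG_DONTWAIT);
        if (n > 0) {
            sent += (size_t)n;
            continue;
        }
        if (n < 0 && errno != EWOULDBLOCK && errno != EAGAIN &&
            errno != ENOMEM && errno != EINTR) {
            return -1;      /* hard error, e.g. connection reset */
        }
        stalls++;           /* transient failure: retry, but bounded */
    }
    return (ssize_t)sent;
}
```

A caller that gets back fewer bytes than it asked for can then decide whether the socket is stale and should be dropped, instead of sitting in a blocking send while other connections wait.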
_______________________________________________ lwip-users mailing list [email protected] https://lists.nongnu.org/mailman/listinfo/lwip-users
