> On Dec 30, 2016, at 12:59 PM, Daniel Pauli <[email protected]> wrote:
>
> Many thanks for your advice! I'll try to check the memory pools when I
> reproduce the issue next time.
>
>> Further, since it sounds like you initially had sockets configured in
>> blocking mode, when the new socket tries to transmit, it will block trying to
>> allocate TCP segments due to the exhausted memory pool. The blocking will
>> continue until SO_SNDTIMEO is reached or the memory exhaustion is resolved.
>
> To clarify, I ran two tests: in the first, all sockets used the MSG_DONTWAIT
> flag for send() (non-blocking); in the second, no socket used the flag
> (blocking). So there should be no mixing of blocking/non-blocking from my
> point of view. I'm not sure I understand what you mean by "initially
> configured in blocking mode". Does this mean that send() may still block
> under certain circumstances (exhausted memory pool) even with the MSG_DONTWAIT
> flag set, so I should initially set the O_NONBLOCK option on the socket to
> ensure that send() never blocks?
>
I was referring to your original posting, where you described seeing blocking
when sending on a new socket after a stale socket was left in a half-open
state with submitted data. I was attempting to explain why that was happening,
so that we know this is not erroneous behavior in LwIP.

My understanding of BSD socket semantics is that using MSG_DONTWAIT should be
equivalent to setting O_NONBLOCK, though you’ll need to include the flag on
each call rather than set the mode once.

>
> On Fri, Dec 30, 2016 at 6:30 PM, Joel Cunningham <[email protected]> wrote:
>
>> On Dec 30, 2016, at 10:43 AM, Daniel Pauli <[email protected]> wrote:
>>
>> I'm a little confused about the use of select in your application. Are you
>> using it with blocking sockets?
>>
>> I tested with both blocking and non-blocking send. I observed that
>> non-blocking send (MSG_DONTWAIT flag set) on sockets determined as
>> write-ready by select() sometimes returned ENOMEM when "stale sockets" are
>> around. After applying the patch from
>> http://lwip.100.n7.nabble.com/bug-49684-lwip-netconn-do-writemore-non-blocking-ERR-MEM-treated-as-failure-td27860.html,
>> I got EWOULDBLOCK errors instead.
>>
>
> Thanks for including this information. The ENOMEM gives me a good clue of
> what’s most likely going on. My guess is that you’re experiencing a memory
> pool exhaustion and the stale socket has claimed memory from a pool for the
> segments which are queued for transmit.
> Since those segments are not being ACKed in the half-open state, the claimed
> memory won’t be available until the segments are freed (this happens on
> transmission timeout or when the socket is aborted).
>
> Further, since it sounds like you initially had sockets configured in
> blocking mode, when the new socket tries to transmit, it will block trying to
> allocate TCP segments due to the exhausted memory pool. The blocking will
> continue until SO_SNDTIMEO is reached or the memory exhaustion is resolved.
>
> If you have LwIP stats enabled, you can check the memory pools for errors to
> figure out which one is failing. You should be able to resolve this by
> sizing your memory pools to handle the number of supported connections. For
> example, if you only support 5 simultaneous TCP connections, then your pools
> should be big enough to allocate 5 send buffers' worth of segments. This is
> how I configure my products, which typically have plenty of RAM. Not sure
> what the recommendation is for very constrained RAM products.
>
>>
>> Calling close() will initiate a graceful synchronized closure of the
>> connection. This means continuing to send any queued data until it is
>> ACKed, the send times out, or we receive a RST. Then a FIN is sent
>> indicating the sending pathway is closed.
>>
>> So there's no direct way for the application to tell LWIP to just give up on
>> one socket without further trying to send data? Can the application specify
>> a send timeout?
>
> Yes, there is: with SO_LINGER you can perform an abortive closure rather than
> a graceful one by setting the timeout to 0. Typically this is a bad idea.
> There’s a decent discussion here on Stack Overflow:
>
> http://stackoverflow.com/questions/3757289/tcp-option-so-linger-zero-when-its-required
>
>>
>> Lastly, what version of LwIP are you using?
>>
>> I'm using 2.0.0 RC1
>
> Joel
>
>> On Wed, Dec 28, 2016 at 4:23 PM, Joel Cunningham <[email protected]> wrote:
>>
>> On Dec 28, 2016, at 06:45 AM, Daniel Pauli <[email protected]> wrote:
>>
>>> Am I understanding the description correctly that sending on the stale
>>> connection eventually blocks once the remote side has crashed and this
>>> prevents sending on the new socket (only because the thread is blocked)?
>>>
>>> If so, then the socket buffer on the stale socket has filled up (most
>>> likely) and is now blocking. This is blocking I/O operating as expected
>>> when data is not being acknowledged. You should use non-blocking sockets
>>> and select if your server is servicing multiple sockets on a single thread.
>>>
>>> Joel
>>>
>>> Attempting to send on the stale socket blocks, which is okay on its own.
>>> But I'm already using select() and observed that
>>
>>>
>>> these stale sockets still somehow seem to block communication over new
>>> sockets,
>>
>> If this is actually happening as described, that would be unexpected/faulty
>> behavior. One TCP socket in the half-open state should not have any effect
>> on the other TCP connections.
>>
>>>
>>> even when no stale sockets are included in the write set of select().
>>
>> I'm a little confused about the use of select in your application. Are you
>> using it with blocking sockets? select() reporting a socket as writable
>> doesn't guarantee the send call won't block. If you have a blocking socket
>> and the size in the send call can't fit in the amount of available buffer
>> space, the call will block.
>>
>>>
>>> I even close() (successfully, according to the return value) those stale
>>> sockets after they failed to be write-ready after 10 seconds, but I can see
>>> in Wireshark that LWIP still sends retransmissions from the port number of
>>> the closed socket.
>>>
>>> Could it be that close() cannot send FIN because the output buffer is full,
>>> so the socket still remains active? Is there a way from the API to just
>>> drop the connection without involving any more communication?
>>
>> Calling close() will initiate a graceful synchronized closure of the
>> connection. This means continuing to send any queued data until it is
>> ACKed, the send times out, or we receive a RST. Then a FIN is sent
>> indicating the sending pathway is closed.
>>
>> Lastly, what version of LwIP are you using?
>>
>> Joel
>>
>> _______________________________________________
>> lwip-users mailing list
>> [email protected]
>> https://lists.nongnu.org/mailman/listinfo/lwip-users
>

Joel
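
For reference, here is a minimal sketch of the two ways to get a non-blocking
send() discussed in this thread: passing MSG_DONTWAIT on each call, or setting
O_NONBLOCK once on the socket. It is written against the plain POSIX/BSD calls
(send, fcntl) on the assumption that the LwIP socket layer is used through
compatible names. The helper names send_dontwait, set_nonblocking and
send_should_retry are hypothetical, and treating ENOMEM as "try again later"
is only an assumption based on the behavior reported before the patch was
applied.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Per-call variant: the flag must be passed on every send(). */
static ssize_t send_dontwait(int fd, const void *buf, size_t len)
{
    return send(fd, buf, len, MSG_DONTWAIT);
}

/* Per-socket variant: set O_NONBLOCK once, then use plain send().
 * Returns 0 on success, -1 on failure (errno set by fcntl). */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Returns 1 if a failed send should simply be retried after the next
 * select() round rather than treated as a fatal error.
 * ENOMEM is included as an assumption (see the note above). */
static int send_should_retry(ssize_t rc, int err)
{
    return rc < 0 &&
           (err == EWOULDBLOCK || err == EAGAIN || err == ENOMEM);
}
```

With either variant, a socket that select() reported as writable can still
fail a send with one of these errors, so the caller should keep the unsent
data and try again later instead of dropping the connection.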
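
As an illustration of the SO_LINGER suggestion quoted in this thread, the
following is a minimal sketch of an abortive close using the standard BSD
socket calls. The helper name abortive_close is hypothetical. I believe LwIP
only honors this option when LWIP_SO_LINGER is enabled in lwipopts.h, so
please treat that as an assumption to verify against your version.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Close fd abortively: with l_onoff = 1 and l_linger = 0, a TCP stack
 * discards any unsent data and resets the connection instead of
 * performing the graceful FIN handshake.
 * Returns 0 on success, -1 on failure. If setsockopt() fails, the
 * socket is left open so the caller can decide what to do with it. */
static int abortive_close(int fd)
{
    struct linger lg;

    lg.l_onoff = 1;   /* enable linger handling */
    lg.l_linger = 0;  /* zero timeout: abort rather than linger */

    if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg) != 0) {
        return -1;
    }
    return close(fd);
}
```

This would be called on a stale socket in place of a plain close(), keeping in
mind the caveat above that an abortive close is typically a bad idea for
connections that are still healthy.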
