PetteriAimonen opened a new issue #3647:
URL: https://github.com/apache/incubator-nuttx/issues/3647


   I have observed the following sequence of events. It happens only with a 
specific software combination, but with that combination it occurs within 30 
minutes or so:
   
   1. The host opens a TCP connection to NuttX and sends an HTTP request.
   2. My code on NuttX responds with the initial headers and starts preparing 
the response payload.
   3. My code calls `send()` at the same moment the host closes the TCP connection.
   4. TCP communication stops working, although ping still works. With a debugger I 
observe that `psock_tcp_send()` is stuck in `net_timedwait()`. The connection has 
`tcpstateflags = 9 (TCP_LAST_ACK)`, `unacked = 0` and `psock.s_flags = 0x83 
(_SF_CLOSED | _SF_SEND)`.
   
   My theory is that there is a race window in `psock_tcp_send()` between the 
check that the socket is connected and the call to `net_lock()`. If a network 
device interrupt occurs in that window, it can move the socket to a closed 
state. After that, neither the disconnect event nor the data event is ever 
delivered, so the callback is never called and `send()` stays blocked.
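   
   To illustrate the suspected race, here is a minimal single-threaded model of the check-then-lock pattern and of one possible fix (checking the state only after taking the lock). Everything in it is a hypothetical stand-in: `fake_conn`, `fake_net_lock()`, `wait_for_send()` and the `event` hook are not NuttX code, and the fixed variant is not necessarily what the attached patch does. The `event` hook models a device interrupt that fires just before the lock is taken.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the NuttX types or API. */

enum send_result { SEND_OK, SEND_NOTCONN, SEND_WOULD_HANG };

struct fake_conn
{
  bool connected;
};

/* Models an event (e.g. a device interrupt) arriving before the lock. */

typedef void (*window_event_t)(struct fake_conn *conn);

static void fake_net_lock(void) {}
static void fake_net_unlock(void) {}

/* Models the peer closing the connection. */

void remote_close(struct fake_conn *conn)
{
  conn->connected = false;
}

/* Waiting on an already-closed connection: in this model no further event
 * arrives to wake the sender, so report that it would block forever.
 */

static enum send_result wait_for_send(const struct fake_conn *conn)
{
  return conn->connected ? SEND_OK : SEND_WOULD_HANG;
}

/* Check first, lock afterwards: the event can slip in between. */

enum send_result send_check_then_lock(struct fake_conn *conn,
                                      window_event_t event)
{
  enum send_result ret;

  if (!conn->connected)
    {
      return SEND_NOTCONN;
    }

  if (event != NULL)
    {
      event(conn);  /* Race window: state changes after the check */
    }

  fake_net_lock();
  ret = wait_for_send(conn);
  fake_net_unlock();
  return ret;
}

/* Lock first, then check: the same event is seen by the check. */

enum send_result send_lock_then_check(struct fake_conn *conn,
                                      window_event_t event)
{
  enum send_result ret;

  if (event != NULL)
    {
      event(conn);  /* Event still happens, but before the check */
    }

  fake_net_lock();
  ret = conn->connected ? wait_for_send(conn) : SEND_NOTCONN;
  fake_net_unlock();
  return ret;
}
```

   In this model, passing `remote_close` as the event makes `send_check_then_lock()` return `SEND_WOULD_HANG`, which corresponds to the stuck `net_timedwait()` described above, while `send_lock_then_check()` returns `SEND_NOTCONN` for the same event.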
   
   I have attached a preliminary patch that appears to fix the problem for me. 
However, it would be great if someone could review whether it makes sense.
   
   I am unable to reproduce the problem on the newest NuttX git version because 
of the specific timing conditions needed, but from reviewing the code, the race 
condition appears to be present there as well.
   
   
[nuttx_psock_tcp_send_patch.txt](https://github.com/apache/incubator-nuttx/files/6413704/nuttx_psock_tcp_send_patch.txt)
   



