An alternative is to allow the 'connect()' to return '0' in this particular case. The socket state is already 'DISCONNECTING', so any further write will get '-EPIPE' but the socket is still able to read the messages existing in its receive queue. This may seem a bit contrary to the previous patch but they are not mutually exclusive, the previous one would only deal with the case the socket state is 'DISCONNECTING' but the 'sk->err' != 0. How does this look correct from a functional standpoint?
BR/Tuong -----Original Message----- From: Jon Maloy <jma...@redhat.com> Sent: Tuesday, January 21, 2020 7:11 PM To: ying....@windriver.com Cc: tipc-discussion@lists.sourceforge.net Subject: [tipc-discussion] Fwd: [net v2] tipc: fix wrong connect() return code Hi Ying, Do you have any ideas regarding this? My best proposal is to att a time stamp to the new server socket at creation, and then delay a 'close' if the socket is < 5ms old. But I am not quite happy with this solution, so any better idea would be welcome. ///jon > > > ----- Forwarded Message ----- From: Tuong Lien Tong > <tuong.t.l...@dektech.com.au>To: 'Ying Xue' <ying....@windriver.com>; > "jma...@redhat.com" <jma...@redhat.com>; "ma...@donjonn.com" > <ma...@donjonn.com>Cc: "tipc-...@dektech.com.au" > <tipc-...@dektech.com.au>Sent: Monday, January 13, 2020, 04:57:58 AM > GMT-5Subject: FW: [net v2] tipc: fix wrong connect() return code > > Hi Ying, Jon, > > > > The last change has exposed a different issue with our “lightweight” > connection setup mechanism, > > Please see the scenario below: > > > > > > server process client process > > ----------------------- > ------------------------------ > > create()| | > > listen()*tskL | > > bind()| |create() > > +- accept()* tskC* > > | | |connect() ~~~~~~~~~~~~~+ > > wait_for_rcv()| |<---------[SYN]-----------| | > > | | |//tskC: CONNECTING > | | | *wait_for_conn() > > +--------->*tskS | > |->add to wait queue > > |//tskS: ESTABLISHED | | > > |----------[ACK]---------->|//tskC: ESTABLISHED > | ->wakeup(): SLEEPING -> run queue > > send()|----------[DATA]--------->|\ > ->wakeup(): already woken > > send()|----------[DATA]--------->| | > ->wakeup(): already woken > > . . . . |-> receive queue . > > . . . . | . > > send()|----------[DATA]--------->|/ > ->wakeup(): already woken > > close()|----------[FIN]---------->|//tskC: DISCONNECTING | > > | | | > > | > ~~~~~~~~~~~~~~~~~~~~~~~~~~>schedule(): > | run queue -> RUNNING > > > |->add > > |back > > |to > > |wait > > |queue > > |(state > > |!= > > |ESTABLISHED) > > . > > . > > | > > > *timed > > out > > > > > > When the server side receives ‘SYN’ from client, the ‘accept()’ call ends up > with the new socket in the ‘ESTABLISHED’ state. Shortly thereafter, the > server starts sending a number of messages before calling the socket > ‘close()’ without waiting for any response from the client! > > In this scenario, it may happen on the other side that the client process > waiting for the connection to be ‘ESTABLISHED’ cannot be awakened in time. > Although the ‘wakeup()’ call is still done when the ‘ACK’ from the server > arrives, but it just moves the process into the run queue (i.e. runnable) to > be scheduled to run later. The kernel (i.e. RX process) is not preemptable > when it processes the messages from the server, including the ‘ACK ‘, > ‘DATA’, … and finally the ‘FIN’ which will terminate the connection. When > the wait process is really run back, the client socket state is now > ‘DISCONNECTING’ (i.e. != ‘ESTABLISHED’), so it continues to wait until the > timer expires. The client will finally get ‘-ETIMEDOUT’ and forced to > release the socket whereas there remains the messages from the server in the > socket’s receive queue. > > > > Notes: > > - The issue is found by the tipcutils/ptts tool. > - This is a timing issue, sometimes it happens sometimes not, depending on > the small enough number of the data messages (e.g. < 100) from the server > before it sends the ‘FIN’, as well as the network traffic, RX process, > scheduler, etc. on the client side. > - Without the previous change, the issue is _not detected_ since the > ‘connect()’ call will return 0 when the wait process is woken & running > back but the socket state is ‘DISCONNECTING’, there is no socket error > (sk->sk_err == 0) in this case… However, this is the original issue and > unexpected because if the client tries to send something after that, it > will get ‘-EPIPE’ immediately. > > > > We need to find a solution to this problem, > > - It seems to me that in this mechanism the server is “too hasty” to set > the new socket to ‘ESTABLISHED’ at the ‘accept()’ when it has just > received a ‘SYN’ from client. That allows it to go ahead with sending the > messages & closing the socket in one step… Should we consider to include > something like the TCP ‘3-way handshake’ here, so the server needs to get > a ‘ACK’ from the client before the connection is fully established? I > guess ‘no’ since it should be kept lightweight…? > - An alternative is we can introduce a bit delay e.g. 2 or 5ms to the > ‘accept()’ or ‘close()’ before completing it, which will give a chance > for its client to be really established first? This solution looks simple > and should have less impact to user space… > - Or …? > > > > What is your opinion, please share? > > Many thanks! > > > > BR/Tuong > > > > -----Original Message----- > From: Xue, Ying <ying....@windriver.com> > Sent: Thursday, January 2, 2020 4:10 PM > To: Tuong Lien <tuong.t.l...@dektech.com.au>; > tipc-discussion@lists.sourceforge.net; jma...@redhat.com; ma...@donjonn.com > Subject: RE: [net v2] tipc: fix wrong connect() return code > > > > Great! > > > > Acked-by: Ying Xue <ying....@windriver.com> > > > > -----Original Message----- > > From: Tuong Lien [mailto:tuong.t.l...@dektech.com.au] > > Sent: Monday, December 30, 2019 4:56 PM > > To: tipc-discussion@lists.sourceforge.net; jma...@redhat.com; > ma...@donjonn.com; Xue, Ying > > Subject: [net v2] tipc: fix wrong connect() return code > > > > The current 'tipc_wait_for_connect()' function makes a loop and waits > > for the condition 'sk->sk_state != TIPC_CONNECTING' to conclude if the > > connecting has done. However, when the condition is met, it always > > returns '0' even in the case the connecting was actually failed (e.g. > > refused because the server socket has closed...) and the socket state > > was set to 'TIPC_DISCONNECTING'. > > This results in a wrong return code for the 'connect()' call from user, > > making it believe that the connection is established and goes ahead > > with more actions e.g. building & sending a message but then finally > > gets an unexpected result (e.g. '-EPIPE'). > > > > This commit fixes the issue by instead setting the wait condition to > > 'tipc_sk_connected(sk)', so that the function will return '0' only when > > the connection is really established. Otherwise, either the socket > > error code if any or '-ETIMEDOUT'/'-EINTR' will be returned > > correspondingly. > > > > --------- > > v2: changed after discussing with Ying > > --------- > > > > Signed-off-by: Tuong Lien <tuong.t.l...@dektech.com.au> > > --- > > net/tipc/socket.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/net/tipc/socket.c b/net/tipc/socket.c > > index 6552f986774c..2f5679f84060 100644 > > --- a/net/tipc/socket.c > > +++ b/net/tipc/socket.c > > @@ -2432,8 +2432,8 @@ static int tipc_wait_for_connect(struct socket *sock, > long *timeo_p) > > return > sock_intr_errno(*timeo_p); > > > > add_wait_queue(sk_sleep(sk), &wait); > > - done = sk_wait_event(sk, timeo_p, > > - > > sk->sk_state > != TIPC_CONNECTING, &wait); > > + done = sk_wait_event(sk, timeo_p, > tipc_sk_connected(sk), > > + &wait); > > remove_wait_queue(sk_sleep(sk), &wait); > > } while (!done); > > return 0; > > -- > > 2.13.7 > > > _______________________________________________ tipc-discussion mailing list tipc-discussion@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/tipc-discussion _______________________________________________ tipc-discussion mailing list tipc-discussion@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/tipc-discussion