An alternative is to allow the 'connect()' to return '0' in this particular 
case. The socket state is already 'DISCONNECTING', so any further write will 
get '-EPIPE' but the socket is still able to read the messages existing in its 
receive queue. This may seem a bit contrary to the previous patch but they are 
not mutually exclusive, the previous one would only deal with the case the 
socket state is 'DISCONNECTING' but the 'sk->err' != 0.
How does this look correct from a functional standpoint?

BR/Tuong

-----Original Message-----
From: Jon Maloy <jma...@redhat.com> 
Sent: Tuesday, January 21, 2020 7:11 PM
To: ying....@windriver.com
Cc: tipc-discussion@lists.sourceforge.net
Subject: [tipc-discussion] Fwd: [net v2] tipc: fix wrong connect() return code

Hi Ying,
Do you have any ideas regarding this? My best proposal is to att a time stamp 
to the new server socket at creation,
and then delay a 'close' if the socket is < 5ms old. 
But I am not quite happy with this solution, so any better idea would be 
welcome.

///jon


> 
> 
>    ----- Forwarded Message ----- From: Tuong Lien Tong
>    <tuong.t.l...@dektech.com.au>To: 'Ying Xue' <ying....@windriver.com>;
>    "jma...@redhat.com" <jma...@redhat.com>; "ma...@donjonn.com"
>    <ma...@donjonn.com>Cc: "tipc-...@dektech.com.au"
>    <tipc-...@dektech.com.au>Sent: Monday, January 13, 2020, 04:57:58 AM
>    GMT-5Subject: FW: [net v2] tipc: fix wrong connect() return code
>  
> Hi Ying, Jon,
> 
>   
> 
> The last change has exposed a different issue with our “lightweight”
> connection setup mechanism,
> 
> Please see the scenario below:
> 
>   
> 
>   
> 
>            server process                                 client process
> 
>       -----------------------
>                           ------------------------------
> 
>                  create()|                          |
> 
>                  listen()*tskL                      |
> 
>                    bind()|                          |create()
> 
>               +- accept()*                      tskC*
> 
>               |          |                          |connect() ~~~~~~~~~~~~~+
> 
> wait_for_rcv()|          |<---------[SYN]-----------|                       |
> 
>               |          |                          |//tskC: CONNECTING
>               |          |                          |     *wait_for_conn()
> 
>               +--------->*tskS                      |
>                                      |->add to wait queue
> 
>                          |//tskS: ESTABLISHED       |                       |
> 
>                          |----------[ACK]---------->|//tskC: ESTABLISHED
>                          |   ->wakeup(): SLEEPING -> run queue
> 
>                    send()|----------[DATA]--------->|\
>                                         ->wakeup(): already woken
> 
>                    send()|----------[DATA]--------->| |
>                                        ->wakeup(): already woken
> 
>                      .   .            .             . |-> receive queue     .
> 
>                      .   .            .             . |                     .
> 
>                    send()|----------[DATA]--------->|/
>                                         ->wakeup(): already woken
> 
>                   close()|----------[FIN]---------->|//tskC: DISCONNECTING  |
> 
>                          |                          |                       |
> 
>                          |                        
> ~~~~~~~~~~~~~~~~~~~~~~~~~~>schedule():
>                          |                        run queue -> RUNNING
> 
>                                                                             
> |->add
>                                                                             
> |back
>                                                                             
> |to
>                                                                             
> |wait
>                                                                             
> |queue
>                                                                             
> |(state
>                                                                             
> |!=
>                                                                             
> |ESTABLISHED)
> 
>                                                                             .
> 
>                                                                             .
> 
>                                                                             |
> 
>                                                                             
> *timed
>                                                                             
> out
> 
>   
> 
>   
> 
> When the server side receives ‘SYN’ from client, the ‘accept()’ call ends up
> with the new socket in the ‘ESTABLISHED’ state. Shortly thereafter, the
> server starts sending a number of messages before calling the socket
> ‘close()’ without waiting for any response from the client!
> 
> In this scenario, it may happen on the other side that the client process
> waiting for the connection to be ‘ESTABLISHED’ cannot be awakened in time.
> Although the ‘wakeup()’ call is still done when the ‘ACK’ from the server
> arrives, but it just moves the process into the run queue (i.e. runnable) to
> be scheduled to run later. The kernel (i.e. RX process) is not preemptable
> when it processes the messages from the server, including the ‘ACK ‘,
> ‘DATA’, … and finally the ‘FIN’ which will terminate the connection. When
> the wait process is really run back, the client socket state is now
> ‘DISCONNECTING’ (i.e. != ‘ESTABLISHED’), so it continues to wait until the
> timer expires. The client will finally get ‘-ETIMEDOUT’ and forced to
> release the socket whereas there remains the messages from the server in the
> socket’s receive queue.
> 
>   
> 
> Notes:
>    
>    - The issue is found by the tipcutils/ptts tool.
>    - This is a timing issue, sometimes it happens sometimes not, depending on
>    the small enough number of the data messages (e.g. < 100) from the server
>    before it sends the ‘FIN’, as well as the network traffic, RX process,
>    scheduler, etc. on the client side.
>    - Without the previous change, the issue is _not detected_ since the
>    ‘connect()’ call will return 0 when the wait process is woken & running
>    back but the socket state is ‘DISCONNECTING’, there is no socket error
>    (sk->sk_err == 0) in this case… However, this is the original issue and
>    unexpected because if the client tries to send something after that, it
>    will get ‘-EPIPE’ immediately.
> 
>   
> 
> We need to find a solution to this problem,
>    
>    - It seems to me that in this mechanism the server is “too hasty” to set
>    the new socket to ‘ESTABLISHED’ at the ‘accept()’ when it has just
>    received a ‘SYN’ from client. That allows it to go ahead with sending the
>    messages & closing the socket in one step… Should we consider to include
>    something like the TCP ‘3-way handshake’ here, so the server needs to get
>    a ‘ACK’ from the client before the connection is fully established? I
>    guess ‘no’ since it should be kept lightweight…?
>    - An alternative is we can introduce a bit delay e.g. 2 or 5ms to the
>    ‘accept()’ or ‘close()’ before completing it, which will give a chance
>    for its client to be really established first? This solution looks simple
>    and should have less impact to user space…
>    - Or …?
> 
>   
> 
> What is your opinion, please share?
> 
> Many thanks!
> 
>   
> 
> BR/Tuong
> 
>   
> 
> -----Original Message-----
> From: Xue, Ying <ying....@windriver.com>
> Sent: Thursday, January 2, 2020 4:10 PM
> To: Tuong Lien <tuong.t.l...@dektech.com.au>;
> tipc-discussion@lists.sourceforge.net; jma...@redhat.com; ma...@donjonn.com
> Subject: RE: [net v2] tipc: fix wrong connect() return code
> 
>   
> 
> Great!
> 
>   
> 
> Acked-by: Ying Xue <ying....@windriver.com>
> 
>   
> 
> -----Original Message-----
> 
> From: Tuong Lien [mailto:tuong.t.l...@dektech.com.au]
> 
> Sent: Monday, December 30, 2019 4:56 PM
> 
> To: tipc-discussion@lists.sourceforge.net; jma...@redhat.com;
> ma...@donjonn.com; Xue, Ying
> 
> Subject: [net v2] tipc: fix wrong connect() return code
> 
>   
> 
> The current 'tipc_wait_for_connect()' function makes a loop and waits
> 
> for the condition 'sk->sk_state != TIPC_CONNECTING' to conclude if the
> 
> connecting has done. However, when the condition is met, it always
> 
> returns '0' even in the case the connecting was actually failed (e.g.
> 
> refused because the server socket has closed...) and the socket state
> 
> was set to 'TIPC_DISCONNECTING'.
> 
> This results in a wrong return code for the 'connect()' call from user,
> 
> making it believe that the connection is established and goes ahead
> 
> with more actions e.g. building & sending a message but then finally
> 
> gets an unexpected result (e.g. '-EPIPE').
> 
>   
> 
> This commit fixes the issue by instead setting the wait condition to
> 
> 'tipc_sk_connected(sk)', so that the function will return '0' only when
> 
> the connection is really established. Otherwise, either the socket
> 
> error code if any or '-ETIMEDOUT'/'-EINTR' will be returned
> 
> correspondingly.
> 
>   
> 
> ---------
> 
> v2: changed after discussing with Ying
> 
> ---------
> 
>   
> 
> Signed-off-by: Tuong Lien <tuong.t.l...@dektech.com.au>
> 
> ---
> 
>  net/tipc/socket.c | 4 ++--
> 
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
>   
> 
> diff --git a/net/tipc/socket.c b/net/tipc/socket.c
> 
> index 6552f986774c..2f5679f84060 100644
> 
> --- a/net/tipc/socket.c
> 
> +++ b/net/tipc/socket.c
> 
> @@ -2432,8 +2432,8 @@ static int tipc_wait_for_connect(struct socket *sock,
> long *timeo_p)
> 
>                                                 return
>                                                 sock_intr_errno(*timeo_p);
> 
>  
> 
>                                 add_wait_queue(sk_sleep(sk), &wait);
> 
> -                              done = sk_wait_event(sk, timeo_p,
> 
> -
>                                                                    
> sk->sk_state
> != TIPC_CONNECTING, &wait);
> 
> +                             done = sk_wait_event(sk, timeo_p,
> tipc_sk_connected(sk),
> 
> +                                                                  &wait);
> 
>                                 remove_wait_queue(sk_sleep(sk), &wait);
> 
>                 } while (!done);
> 
>                 return 0;
> 
> --
> 
> 2.13.7
> 
>   
> 

_______________________________________________
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion



_______________________________________________
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion

Reply via email to