Re: [PATCH 1/2] SUNRPC: accept() may return sockets that are still in SYN_RECV

2016-07-28 Thread Fields Bruce James
On Wed, Jul 27, 2016 at 07:11:23PM +, Trond Myklebust wrote:
> Hi Eric,
> 
> > On Jul 27, 2016, at 14:59, Eric Dumazet  wrote:
> > 
> > On Wed, 2016-07-27 at 14:48 -0400, Fields Bruce James wrote:
> >> On Tue, Jul 26, 2016 at 04:08:29PM +, Trond Myklebust wrote:
> >>> 
>  On Jul 26, 2016, at 11:43, J. Bruce Fields  wrote:
>  
>  On Tue, Jul 26, 2016 at 09:51:19AM -0400, Trond Myklebust wrote:
> > We're seeing traces of the following form:
> > 
> > [10952.396347] svc: transport 88042ba4a 000 dequeued, inuse=2
> > [10952.396351] svc: tcp_accept 88042ba4 a000 sock 88042a6e4c80
> > [10952.396362] nfsd: connect from 10.2.6.1, port=187
> > [10952.396364] svc: svc_setup_socket 8800b99bcf00
> > [10952.396368] setting up TCP socket for reading
> > [10952.396370] svc: svc_setup_socket created 8803eb10a000 (inet 
> > 88042b75b800)
> > [10952.396373] svc: transport 8803eb10a000 put into queue
> > [10952.396375] svc: transport 88042ba4a000 put into queue
> > [10952.396377] svc: server 8800bb0ec000 waiting for data (to = 
> > 360)
> > [10952.396380] svc: transport 8803eb10a000 dequeued, inuse=2
> > [10952.396381] svc_recv: found XPT_CLOSE
> > [10952.396397] svc: svc_delete_xprt(8803eb10a000)
> > [10952.396398] svc: svc_tcp_sock_detach(8803eb10a000)
> > [10952.396399] svc: svc_sock_detach(8803eb10a000)
> > [10952.396412] svc: svc_sock_free(8803eb10a000)
> > 
> > i.e. an immediate close of the socket after initialisation.
>  
>  Interesting, thanks!
>  
>  So the one thing I don't understand is why this is correct behavior for
>  accept--I thought it wasn't supposed to return a socket until it was
>  fully established.
> >>> 
> >>> inet_accept() appears to allow SYN_RECV:
> >> 
> >> OK.  Cc'ing netdev just to make sure we didn't overlook anything.
> >> 
> > 
> > SYN_RECV after accept() is a TCP Fast Open property I think.
> > 
> > Maybe you are playing with some global TCP Fast Open settings ?
> > 
> 
> The Linux kernel client should not be using TCP fast open, but it is possible 
> that some of the other NFSv3 clients we’re using are.
> Would a standard knfsd listener respond to a TCP fast open request, or would 
> the default behaviour be to ignore it?
> 
> If the default behaviour for the server is to allow fast open, then we do 
> need these patches, IMO.

Even if it's not a default, if there's a configuration that allows
accept to return a socket in SYN_RECV state, then knfsd should handle it
gracefully, especially as long as it's this easy.

It'd still be useful to understand why this is happening, though

--b.


Re: [PATCH 1/2] SUNRPC: accept() may return sockets that are still in SYN_RECV

2016-07-27 Thread Trond Myklebust
Hi Eric,

> On Jul 27, 2016, at 14:59, Eric Dumazet  wrote:
> 
> On Wed, 2016-07-27 at 14:48 -0400, Fields Bruce James wrote:
>> On Tue, Jul 26, 2016 at 04:08:29PM +, Trond Myklebust wrote:
>>> 
 On Jul 26, 2016, at 11:43, J. Bruce Fields  wrote:
 
 On Tue, Jul 26, 2016 at 09:51:19AM -0400, Trond Myklebust wrote:
> We're seeing traces of the following form:
> 
> [10952.396347] svc: transport 88042ba4a 000 dequeued, inuse=2
> [10952.396351] svc: tcp_accept 88042ba4 a000 sock 88042a6e4c80
> [10952.396362] nfsd: connect from 10.2.6.1, port=187
> [10952.396364] svc: svc_setup_socket 8800b99bcf00
> [10952.396368] setting up TCP socket for reading
> [10952.396370] svc: svc_setup_socket created 8803eb10a000 (inet 
> 88042b75b800)
> [10952.396373] svc: transport 8803eb10a000 put into queue
> [10952.396375] svc: transport 88042ba4a000 put into queue
> [10952.396377] svc: server 8800bb0ec000 waiting for data (to = 
> 360)
> [10952.396380] svc: transport 8803eb10a000 dequeued, inuse=2
> [10952.396381] svc_recv: found XPT_CLOSE
> [10952.396397] svc: svc_delete_xprt(8803eb10a000)
> [10952.396398] svc: svc_tcp_sock_detach(8803eb10a000)
> [10952.396399] svc: svc_sock_detach(8803eb10a000)
> [10952.396412] svc: svc_sock_free(8803eb10a000)
> 
> i.e. an immediate close of the socket after initialisation.
 
 Interesting, thanks!
 
 So the one thing I don't understand is why this is correct behavior for
 accept--I thought it wasn't supposed to return a socket until it was
 fully established.
>>> 
>>> inet_accept() appears to allow SYN_RECV:
>> 
>> OK.  Cc'ing netdev just to make sure we didn't overlook anything.
>> 
> 
> SYN_RECV after accept() is a TCP Fast Open property I think.
> 
> Maybe you are playing with some global TCP Fast Open settings ?
> 

The Linux kernel client should not be using TCP fast open, but it is possible 
that some of the other NFSv3 clients we’re using are.
Would a standard knfsd listener respond to a TCP fast open request, or would 
the default behaviour be to ignore it?

If the default behaviour for the server is to allow fast open, then we do need 
these patches, IMO.

> 
>> (Also: what were user-visible symptoms?  Mounts failing, or unexpected
>> delays?)
>> 

Connection retry storms on the server.

>> --b.
>> 
>>> 
>>> int inet_accept(struct socket *sock, struct socket *newsock, int flags)
>>> {
>>>struct sock *sk1 = sock->sk;
>>>int err = -EINVAL;
>>>struct sock *sk2 = sk1->sk_prot->accept(sk1, flags, );
>>> 
>>>if (!sk2)
>>>goto do_err;
>>> 
>>>lock_sock(sk2);
>>> 
>>>sock_rps_record_flow(sk2);
>>>WARN_ON(!((1 << sk2->sk_state) &
>>>  (TCPF_ESTABLISHED | TCPF_SYN_RECV |
>>>  TCPF_CLOSE_WAIT | TCPF_CLOSE)));
>>> 
>>>sock_graft(sk2, newsock);
>>> 
>>>newsock->state = SS_CONNECTED;
>>>err = 0;
>>>release_sock(sk2);
>>> do_err:
>>>return err;
>>> }
> 
> 



Re: [PATCH 1/2] SUNRPC: accept() may return sockets that are still in SYN_RECV

2016-07-27 Thread Eric Dumazet
On Wed, 2016-07-27 at 14:48 -0400, Fields Bruce James wrote:
> On Tue, Jul 26, 2016 at 04:08:29PM +, Trond Myklebust wrote:
> > 
> > > On Jul 26, 2016, at 11:43, J. Bruce Fields  wrote:
> > > 
> > > On Tue, Jul 26, 2016 at 09:51:19AM -0400, Trond Myklebust wrote:
> > >> We're seeing traces of the following form:
> > >> 
> > >> [10952.396347] svc: transport 88042ba4a 000 dequeued, inuse=2
> > >> [10952.396351] svc: tcp_accept 88042ba4 a000 sock 88042a6e4c80
> > >> [10952.396362] nfsd: connect from 10.2.6.1, port=187
> > >> [10952.396364] svc: svc_setup_socket 8800b99bcf00
> > >> [10952.396368] setting up TCP socket for reading
> > >> [10952.396370] svc: svc_setup_socket created 8803eb10a000 (inet 
> > >> 88042b75b800)
> > >> [10952.396373] svc: transport 8803eb10a000 put into queue
> > >> [10952.396375] svc: transport 88042ba4a000 put into queue
> > >> [10952.396377] svc: server 8800bb0ec000 waiting for data (to = 
> > >> 360)
> > >> [10952.396380] svc: transport 8803eb10a000 dequeued, inuse=2
> > >> [10952.396381] svc_recv: found XPT_CLOSE
> > >> [10952.396397] svc: svc_delete_xprt(8803eb10a000)
> > >> [10952.396398] svc: svc_tcp_sock_detach(8803eb10a000)
> > >> [10952.396399] svc: svc_sock_detach(8803eb10a000)
> > >> [10952.396412] svc: svc_sock_free(8803eb10a000)
> > >> 
> > >> i.e. an immediate close of the socket after initialisation.
> > > 
> > > Interesting, thanks!
> > > 
> > > So the one thing I don't understand is why this is correct behavior for
> > > accept--I thought it wasn't supposed to return a socket until it was
> > > fully established.
> > 
> > inet_accept() appears to allow SYN_RECV:
> 
> OK.  Cc'ing netdev just to make sure we didn't overlook anything.
> 

SYN_RECV after accept() is a TCP Fast Open property I think.

Maybe you are playing with some global TCP Fast Open settings ?


> (Also: what were user-visible symptoms?  Mounts failing, or unexpected
> delays?)
> 
> --b.
> 
> > 
> > int inet_accept(struct socket *sock, struct socket *newsock, int flags)
> > {
> > struct sock *sk1 = sock->sk;
> > int err = -EINVAL;
> > struct sock *sk2 = sk1->sk_prot->accept(sk1, flags, );
> > 
> > if (!sk2)
> > goto do_err;
> > 
> > lock_sock(sk2);
> > 
> > sock_rps_record_flow(sk2);
> > WARN_ON(!((1 << sk2->sk_state) &
> >   (TCPF_ESTABLISHED | TCPF_SYN_RECV |
> >   TCPF_CLOSE_WAIT | TCPF_CLOSE)));
> > 
> > sock_graft(sk2, newsock);
> > 
> > newsock->state = SS_CONNECTED;
> > err = 0;
> > release_sock(sk2);
> > do_err:
> > return err;
> > }




Re: [PATCH 1/2] SUNRPC: accept() may return sockets that are still in SYN_RECV

2016-07-27 Thread Fields Bruce James
On Tue, Jul 26, 2016 at 04:08:29PM +, Trond Myklebust wrote:
> 
> > On Jul 26, 2016, at 11:43, J. Bruce Fields  wrote:
> > 
> > On Tue, Jul 26, 2016 at 09:51:19AM -0400, Trond Myklebust wrote:
> >> We're seeing traces of the following form:
> >> 
> >> [10952.396347] svc: transport 88042ba4a 000 dequeued, inuse=2
> >> [10952.396351] svc: tcp_accept 88042ba4 a000 sock 88042a6e4c80
> >> [10952.396362] nfsd: connect from 10.2.6.1, port=187
> >> [10952.396364] svc: svc_setup_socket 8800b99bcf00
> >> [10952.396368] setting up TCP socket for reading
> >> [10952.396370] svc: svc_setup_socket created 8803eb10a000 (inet 
> >> 88042b75b800)
> >> [10952.396373] svc: transport 8803eb10a000 put into queue
> >> [10952.396375] svc: transport 88042ba4a000 put into queue
> >> [10952.396377] svc: server 8800bb0ec000 waiting for data (to = 360)
> >> [10952.396380] svc: transport 8803eb10a000 dequeued, inuse=2
> >> [10952.396381] svc_recv: found XPT_CLOSE
> >> [10952.396397] svc: svc_delete_xprt(8803eb10a000)
> >> [10952.396398] svc: svc_tcp_sock_detach(8803eb10a000)
> >> [10952.396399] svc: svc_sock_detach(8803eb10a000)
> >> [10952.396412] svc: svc_sock_free(8803eb10a000)
> >> 
> >> i.e. an immediate close of the socket after initialisation.
> > 
> > Interesting, thanks!
> > 
> > So the one thing I don't understand is why this is correct behavior for
> > accept--I thought it wasn't supposed to return a socket until it was
> > fully established.
> 
> inet_accept() appears to allow SYN_RECV:

OK.  Cc'ing netdev just to make sure we didn't overlook anything.

(Also: what were user-visible symptoms?  Mounts failing, or unexpected
delays?)

--b.

> 
> int inet_accept(struct socket *sock, struct socket *newsock, int flags)
> {
> struct sock *sk1 = sock->sk;
> int err = -EINVAL;
> struct sock *sk2 = sk1->sk_prot->accept(sk1, flags, );
> 
> if (!sk2)
> goto do_err;
> 
> lock_sock(sk2);
> 
> sock_rps_record_flow(sk2);
> WARN_ON(!((1 << sk2->sk_state) &
>   (TCPF_ESTABLISHED | TCPF_SYN_RECV |
>   TCPF_CLOSE_WAIT | TCPF_CLOSE)));
> 
> sock_graft(sk2, newsock);
> 
> newsock->state = SS_CONNECTED;
> err = 0;
> release_sock(sk2);
> do_err:
> return err;
> }