> -----Original Message-----
> From: Ying Xue [mailto:[email protected]]
> Sent: Tuesday, December 05, 2017 06:53
> To: Jon Maloy <[email protected]>; [email protected]
> Cc: [email protected]
> Subject: Re: [net-next PATCH 1/2] tipc: prevent swork item from queuing
> workqueue after connection is closed
> 
> On 12/05/2017 01:45 AM, Jon Maloy wrote:
> >> diff --git a/net/tipc/server.c b/net/tipc/server.c
> >> index acaef80..571e137 100644
> >> --- a/net/tipc/server.c
> >> +++ b/net/tipc/server.c
> >> @@ -449,16 +449,12 @@ int tipc_conn_sendmsg(struct tipc_server *s, int conid,
> >>  {
> >>    struct outqueue_entry *e;
> >>    struct tipc_conn *con;
> >> +  struct sock *sk;
> >>
> >>    con = tipc_conn_lookup(s, conid);
> >>    if (!con)
> >>            return -EINVAL;
> >>
> >> -  if (!test_bit(CF_CONNECTED, &con->flags)) {
> >> -          conn_put(con);
> >> -          return 0;
> >> -  }
> >> -
> >>    e = tipc_alloc_entry(data, len);
> >>    if (!e) {
> >>            conn_put(con);
> >> @@ -472,8 +468,17 @@ int tipc_conn_sendmsg(struct tipc_server *s, int conid,
> >
> > You cannot be sure that con is valid here. You need to add a
> > kref_get_unless_zero(&con->kref) check in the lookup function, as I had to do.
> > This was prompted by observed crashes.
> >
> 
> Thanks! That is a very good catch! I was not aware of that corner case.
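
For reference, a lookup guarded in that way might look roughly like this (a
sketch based on the conn_idr/idr_lock fields in net/tipc/server.c, not the
actual fix):

static struct tipc_conn *tipc_conn_lookup(struct tipc_server *s, int conid)
{
        struct tipc_conn *con;

        spin_lock_bh(&s->idr_lock);
        con = idr_find(&s->conn_idr, conid);
        if (con) {
                /* Only hand out a reference if the connection is still
                 * alive; a zero refcount means release is already under way.
                 */
                if (!test_bit(CF_CONNECTED, &con->flags) ||
                    !kref_get_unless_zero(&con->kref))
                        con = NULL;
        }
        spin_unlock_bh(&s->idr_lock);
        return con;
}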
> 
> >
> >>    list_add_tail(&e->list, &con->outqueue);
> >>    spin_unlock_bh(&con->outqueue_lock);
> >>
> >> -  if (!queue_work(s->send_wq, &con->swork))
> >> +  sk = con->sock->sk;
> >> +
> > You don't know if the socket still exists here. We may be in the middle of a
> > tipc_close_conn().
> > Furthermore, in the case of a kernel subscriber, there *is* no socket.
> > But I am not sure this patch is needed at all?
> 
> Sorry, I missed the case where con->sock is NULL. But the change is really
> necessary for another case, where con->sock is not NULL. If we do not hold the
> sk_callback_lock read lock and check whether tipc_unregister_callbacks() has
> already been called, but instead queue the swork item to the send workqueue
> directly, we open another race window: a new tx work item might be queued
> through tipc_conn_sendmsg() after all pending tx/rx work items have been
> flushed in tipc_close_conn(), as done in patch #2. As a result, the con
> instance might no longer be freed in the context of tipc_server_stop(), which
> means the nametbl_lock soft lockup at module exit could occur again.
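
For clarity, the guard being described here would be roughly of this shape
(a sketch of the tail of tipc_conn_sendmsg(), not the exact patch):

        /* Only queue the send work while the socket callbacks are still
         * registered; sk_user_data is cleared under sk_callback_lock in
         * tipc_unregister_callbacks(). (As noted above, con->sock can be
         * NULL for kernel subscribers and would still need its own check.)
         */
        sk = con->sock->sk;
        read_lock_bh(&sk->sk_callback_lock);
        if (!sk->sk_user_data || !queue_work(s->send_wq, &con->swork))
                conn_put(con);  /* work not queued: drop the lookup reference */
        read_unlock_bh(&sk->sk_callback_lock);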

If you remove the connection item from the idr list in tipc_conn_kref_release()
*before* you flush the queues and release the socket, there is no risk that this
will happen.
If tipc_conn_sendmsg() obtains a pointer, we can be certain that the queues have
not been flushed yet, and that they will not be until conn_put() is called.
And you won't need to take the callback lock or touch the socket at all here.
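
Roughly, that ordering in tipc_conn_kref_release() would look like this (a
sketch only, with the field and helper names assumed from net/tipc/server.c;
not a tested patch):

static void tipc_conn_kref_release(struct kref *kref)
{
        struct tipc_conn *con = container_of(kref, struct tipc_conn, kref);
        struct tipc_server *s = con->server;

        /* Unpublish the connection id first, so tipc_conn_lookup() can no
         * longer hand out new references to this connection.
         */
        spin_lock_bh(&s->idr_lock);
        idr_remove(&s->conn_idr, con->conid);
        s->idr_in_use--;
        spin_unlock_bh(&s->idr_lock);

        /* Only now release the socket and drain the outqueue; no new work
         * can be queued for this connection any more.
         */
        if (con->sock)
                sock_release(con->sock);

        tipc_clean_outqueues(con);
        kfree(con);
}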

Regards
///jon
