Commit 333f796235a527 ("tipc: fix a race condition leading to
subscriber refcnt bug") reveals a soft lockup while acquiring
nametbl_lock.
Before commit 333f796235a527, we call tipc_conn_shutdown() from
tipc_close_conn() in the context of tipc_topsrv_stop(). In that
context, we are allowed to grab the nametbl_lock.
In commit 333f796235a527, i moved the tipc_conn_release (renamed from
tipc_conn_shutdown) to the connection refcount cleanup. This allows
either tipc_nametbl_withdraw() or tipc_topsrv_stop() to perform
tipc_sock_release().
Since tipc_exit_net() first calls tipc_topsrv_stop() and then
tipc_nametble_withdraw() increases the chances for the later to
perform the connection cleanup.
The soft lockup occurs in the call chain of tipc_nametbl_withdraw(),
when it performs the tipc_conn_kref_release() as it tries to grab
nametbl_lock again while holding it already.
tipc_nametbl_withdraw() grabs nametbl_lock
tipc_nametbl_remove_publ()
tipc_subscrp_report_overlap()
tipc_subscrp_send_event()
tipc_conn_sendmsg()
<< if (con->flags != CF_CONNECTED) we do conn_put(),
triggering the cleanup as refcount=0. >>
tipc_conn_kref_release
tipc_sock_release
tipc_conn_release
tipc_subscrb_delete
tipc_subscrp_delete
tipc_nametbl_unsubscribe << Soft Lockup >>
Until now, tipc_server_stop() grabs and releases the idr_lock twice
for every connection. Once to find the connection and second to unset
connection flag, remove it from list and decrement the refcount.
The above lockup occurs when tipc_nametbl_withdraw() grabs the
connection in between the two, thus owning the connection
and triggering the cleanup while decrementing the refcount later.
In this commit, we perform:
- call tipc_nametbl_withdraw() before tipc_topsrv_stop() to avoid:
1. soft lockup
2. its too late to actually notify the subscribers, as the topology
server might already have started shutting down.
- In tipc_server_stop(), we remove all the connections from connection
list in the scope of the idr_lock to prevent any other thread finding
a connection which has unset CF_CONNECTED in its flags.
Fixes: 333f796235a52727 ("tipc: fix a race condition leading to subscriber
refcnt bug")
Signed-off-by: Parthasarathy Bhuvaragan <[email protected]>
---
v2: commit message update.
cleanup tipc_server_stop() as per Ying Xue.
---
net/tipc/core.c | 4 ++++
net/tipc/net.c | 2 --
net/tipc/server.c | 13 ++++++++-----
3 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/net/tipc/core.c b/net/tipc/core.c
index 236b043a4156..3ea1452e2f06 100644
--- a/net/tipc/core.c
+++ b/net/tipc/core.c
@@ -93,6 +93,10 @@ static int __net_init tipc_init_net(struct net *net)
static void __net_exit tipc_exit_net(struct net *net)
{
+ struct tipc_net *tn = net_generic(net, tipc_net_id);
+
+ tipc_nametbl_withdraw(net, TIPC_CFG_SRV, tn->own_addr, 0,
+ tn->own_addr);
tipc_topsrv_stop(net);
tipc_net_stop(net);
tipc_bcast_stop(net);
diff --git a/net/tipc/net.c b/net/tipc/net.c
index 28bf4feeb81c..92696cc6e763 100644
--- a/net/tipc/net.c
+++ b/net/tipc/net.c
@@ -130,8 +130,6 @@ void tipc_net_stop(struct net *net)
if (!tn->own_addr)
return;
- tipc_nametbl_withdraw(net, TIPC_CFG_SRV, tn->own_addr, 0,
- tn->own_addr);
rtnl_lock();
tipc_bearer_stop(net);
tipc_node_stop(net);
diff --git a/net/tipc/server.c b/net/tipc/server.c
index 215849ce453d..54c23ecccd26 100644
--- a/net/tipc/server.c
+++ b/net/tipc/server.c
@@ -623,16 +623,19 @@ int tipc_server_start(struct tipc_server *s)
void tipc_server_stop(struct tipc_server *s)
{
struct tipc_conn *con;
- int total = 0;
int id;
spin_lock_bh(&s->idr_lock);
- for (id = 0; total < s->idr_in_use; id++) {
+ for (id = 0; s->idr_in_use; id++) {
con = idr_find(&s->conn_idr, id);
- if (con) {
- total++;
+ if (con && test_and_clear_bit(CF_CONNECTED, &con->flags)) {
+ idr_remove(&s->conn_idr, con->conid);
+ s->idr_in_use--;
+ /* release spinlock as kernel_sock_shutdown can sleep.
+ */
spin_unlock_bh(&s->idr_lock);
- tipc_close_conn(con);
+ kernel_sock_shutdown(con->sock, SHUT_RDWR);
+ conn_put(con);
spin_lock_bh(&s->idr_lock);
}
}
--
2.1.4
------------------------------------------------------------------------------
_______________________________________________
tipc-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/tipc-discussion