Hi Ying, These are good design changes, that definitely should go in asap. However, I feel deeply uncomfortable with such a big change going into 'net', especially since our previous, exceptionally large, contribution now has turned out to have problems. I wonder if we could not get away with something simpler for 'net'.
Looking closer at your series, it seems to me that only patches ## 1, 4, and the lock removal part of #5 are really needed to solve the problem we have at hand now. Why not merge those into one patch and deliver this to 'net', while reference count redesign parts can go into net-next ? Regards ///jon > -----Original Message----- > From: Ying Xue [mailto:ying....@windriver.com] > Sent: Monday, February 20, 2017 06:39 AM > To: Jon Maloy <jon.ma...@ericsson.com>; Parthasarathy Bhuvaragan > <parthasarathy.bhuvara...@ericsson.com>; thompa....@gmail.com > Cc: tipc-discussion@lists.sourceforge.net > Subject: [net 0/5] solve two deadlock issues > > Commit d094c4d5f5 ("tipc: add subscription refcount to avoid invalid > delete") accidently introduce the following deadlock scenarios: > > CPU1: CPU2: > ---------- ---------------- > tipc_nametbl_publish > spin_lock_bh(&tn->nametbl_lock) > tipc_nametbl_insert_publ > tipc_nameseq_insert_publ > tipc_subscrp_report_overlap > tipc_subscrp_get > tipc_subscrp_send_event > tipc_close_conn > tipc_subscrb_release_cb > tipc_subscrb_delete > tipc_subscrp_put > tipc_subscrp_put > tipc_subscrp_kref_release > tipc_nametbl_unsubscribe > spin_lock_bh(&tn->nametbl_lock) > <<grab nametbl_lock again>> > > CPU1: CPU2: > ---------- ---------------- > tipc_nametbl_stop > spin_lock_bh(&tn->nametbl_lock) > tipc_purge_publications > tipc_nameseq_remove_publ > tipc_subscrp_report_overlap > tipc_subscrp_get > tipc_subscrp_send_event > tipc_close_conn > tipc_subscrb_release_cb > tipc_subscrb_delete > tipc_subscrp_put > tipc_subscrp_put > tipc_subscrp_kref_release > tipc_nametbl_unsubscribe > spin_lock_bh(&tn->nametbl_lock) > <<grab nametbl_lock again>> > > The root cause of two deadlocks is that we have to hold nametbl lock when > subscription is freed in tipc_subscrp_kref_release(). In order to eliminate > the > need of taking nametbl lock in tipc_subscrp_kref_release(), the functions > protected by nametbl lock in tipc_subscrp_kref_release() are moved to > other places step by step in the series. > > Ying Xue (5): > tipc: advance the time of deleting subscription from > subscriber->subscrp_list > tipc: adjust the policy of holding subscription kref > tipc: adjust policy that sub->timer holds subscription kref > tipc: advance the time of calling tipc_nametbl_unsubscribe > tipc: remove unnecessary increasement of subscription refcount > > net/tipc/name_table.c | 2 ++ > net/tipc/subscr.c | 32 ++++++++++++++------------------ > net/tipc/subscr.h | 3 +++ > 3 files changed, 19 insertions(+), 18 deletions(-) > > -- > 2.7.4 ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot _______________________________________________ tipc-discussion mailing list tipc-discussion@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/tipc-discussion