Hi John. Yes you are right. But I still would prefer a condensed patch where we don’t touch the refcounts when this goes into ‘net’. I am awaiting a comment from Ying.
///jon From: John Thompson [mailto:thompa....@gmail.com] Sent: Tuesday, February 21, 2017 04:18 PM To: Jon Maloy <jon.ma...@ericsson.com> Cc: Ying Xue <ying....@windriver.com>; Parthasarathy Bhuvaragan <parthasarathy.bhuvara...@ericsson.com>; tipc-discussion@lists.sourceforge.net Subject: Re: [net 0/5] solve two deadlock issues Patch #2 removes the tipc_subscrp_get() and _put() from tipc_subscrp_report_overlap(). This prevents the problem of the early returns. JT On Wed, Feb 22, 2017 at 1:42 AM, Jon Maloy <jon.ma...@ericsson.com<mailto:jon.ma...@ericsson.com>> wrote: I don't see that you remove the two premature 'return's in subcsrb_report_overlap() in your series. These are also genuine bugs that must be fixed. ///jon > -----Original Message----- > From: Jon Maloy [mailto:jon.ma...@ericsson.com<mailto:jon.ma...@ericsson.com>] > Sent: Tuesday, February 21, 2017 06:12 AM > To: Ying Xue <ying....@windriver.com<mailto:ying....@windriver.com>>; > Parthasarathy Bhuvaragan > <parthasarathy.bhuvara...@ericsson.com<mailto:parthasarathy.bhuvara...@ericsson.com>>; > thompa....@gmail.com<mailto:thompa....@gmail.com> > Cc: > tipc-discussion@lists.sourceforge.net<mailto:tipc-discussion@lists.sourceforge.net> > Subject: Re: [tipc-discussion] [net 0/5] solve two deadlock issues > > Hi Ying, > These are good design changes, that definitely should go in asap. However, I > feel deeply uncomfortable with such a big change going into 'net', especially > since our previous, exceptionally large, contribution now has turned out to > have problems. I wonder if we could not get away with something simpler > for 'net'. > > Looking closer at your series, it seems to me that only patches ## 1, 4, and > the lock removal part of #5 are really needed to solve the problem we have > at hand now. Why not merge those into one patch and deliver this to 'net', > while reference count redesign parts can go into net-next ? > > Regards > ///jon > > > > -----Original Message----- > > From: Ying Xue > > [mailto:ying....@windriver.com<mailto:ying....@windriver.com>] > > Sent: Monday, February 20, 2017 06:39 AM > > To: Jon Maloy <jon.ma...@ericsson.com<mailto:jon.ma...@ericsson.com>>; > > Parthasarathy Bhuvaragan > > <parthasarathy.bhuvara...@ericsson.com<mailto:parthasarathy.bhuvara...@ericsson.com>>; > > thompa....@gmail.com<mailto:thompa....@gmail.com> > > Cc: > > tipc-discussion@lists.sourceforge.net<mailto:tipc-discussion@lists.sourceforge.net> > > Subject: [net 0/5] solve two deadlock issues > > > > Commit d094c4d5f5 ("tipc: add subscription refcount to avoid invalid > > delete") accidently introduce the following deadlock scenarios: > > > > CPU1: CPU2: > > ---------- ---------------- > > tipc_nametbl_publish > > spin_lock_bh(&tn->nametbl_lock) > > tipc_nametbl_insert_publ > > tipc_nameseq_insert_publ > > tipc_subscrp_report_overlap > > tipc_subscrp_get > > tipc_subscrp_send_event > > tipc_close_conn > > tipc_subscrb_release_cb > > tipc_subscrb_delete > > tipc_subscrp_put > > tipc_subscrp_put > > tipc_subscrp_kref_release > > tipc_nametbl_unsubscribe > > spin_lock_bh(&tn->nametbl_lock) > > <<grab nametbl_lock again>> > > > > CPU1: CPU2: > > ---------- ---------------- > > tipc_nametbl_stop > > spin_lock_bh(&tn->nametbl_lock) > > tipc_purge_publications > > tipc_nameseq_remove_publ > > tipc_subscrp_report_overlap > > tipc_subscrp_get > > tipc_subscrp_send_event > > tipc_close_conn > > tipc_subscrb_release_cb > > tipc_subscrb_delete > > tipc_subscrp_put > > tipc_subscrp_put > > tipc_subscrp_kref_release > > tipc_nametbl_unsubscribe > > spin_lock_bh(&tn->nametbl_lock) > > <<grab nametbl_lock again>> > > > > The root cause of two deadlocks is that we have to hold nametbl lock > > when subscription is freed in tipc_subscrp_kref_release(). In order to > > eliminate the need of taking nametbl lock in > > tipc_subscrp_kref_release(), the functions protected by nametbl lock > > in tipc_subscrp_kref_release() are moved to other places step by step in > the series. > > > > Ying Xue (5): > > tipc: advance the time of deleting subscription from > > subscriber->subscrp_list > > tipc: adjust the policy of holding subscription kref > > tipc: adjust policy that sub->timer holds subscription kref > > tipc: advance the time of calling tipc_nametbl_unsubscribe > > tipc: remove unnecessary increasement of subscription refcount > > > > net/tipc/name_table.c | 2 ++ > > net/tipc/subscr.c | 32 ++++++++++++++------------------ > > net/tipc/subscr.h | 3 +++ > > 3 files changed, 19 insertions(+), 18 deletions(-) > > > > -- > > 2.7.4 > > > ------------------------------------------------------------------------------ > Check out the vibrant tech community on one of the world's most engaging > tech sites, SlashDot.org! http://sdm.link/slashdot > _______________________________________________ > tipc-discussion mailing list > tipc-discussion@lists.sourceforge.net<mailto:tipc-discussion@lists.sourceforge.net> > https://lists.sourceforge.net/lists/listinfo/tipc-discussion ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot _______________________________________________ tipc-discussion mailing list tipc-discussion@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/tipc-discussion