Hi John.
Yes you are right. But I still would prefer a condensed patch where we don’t 
touch the refcounts when this goes into ‘net’. I am awaiting a comment from 
Ying.

///jon


From: John Thompson [mailto:thompa....@gmail.com]
Sent: Tuesday, February 21, 2017 04:18 PM
To: Jon Maloy <jon.ma...@ericsson.com>
Cc: Ying Xue <ying....@windriver.com>; Parthasarathy Bhuvaragan 
<parthasarathy.bhuvara...@ericsson.com>; tipc-discussion@lists.sourceforge.net
Subject: Re: [net 0/5] solve two deadlock issues

Patch #2 removes the tipc_subscrp_get() and _put() from 
tipc_subscrp_report_overlap().
This prevents the problem of the early returns.
JT


On Wed, Feb 22, 2017 at 1:42 AM, Jon Maloy 
<jon.ma...@ericsson.com<mailto:jon.ma...@ericsson.com>> wrote:
I don't see that you remove the two premature 'return's in 
subcsrb_report_overlap() in your series. These are also genuine bugs that must 
be fixed.

///jon


> -----Original Message-----
> From: Jon Maloy [mailto:jon.ma...@ericsson.com<mailto:jon.ma...@ericsson.com>]
> Sent: Tuesday, February 21, 2017 06:12 AM
> To: Ying Xue <ying....@windriver.com<mailto:ying....@windriver.com>>; 
> Parthasarathy Bhuvaragan
> <parthasarathy.bhuvara...@ericsson.com<mailto:parthasarathy.bhuvara...@ericsson.com>>;
>  thompa....@gmail.com<mailto:thompa....@gmail.com>
> Cc: 
> tipc-discussion@lists.sourceforge.net<mailto:tipc-discussion@lists.sourceforge.net>
> Subject: Re: [tipc-discussion] [net 0/5] solve two deadlock issues
>
> Hi Ying,
> These are good design changes, that definitely should go in asap. However, I
> feel deeply uncomfortable with such a big change going into 'net', especially
> since our previous, exceptionally large, contribution now has turned out to
> have problems. I wonder if we could not get away with something simpler
> for 'net'.
>
> Looking closer at your series, it seems to me that only patches ## 1, 4, and
> the lock removal part of #5 are really needed to solve the problem we have
> at hand now. Why not merge those into one patch and deliver this to 'net',
> while reference count redesign parts can go into net-next ?
>
> Regards
> ///jon
>
>
> > -----Original Message-----
> > From: Ying Xue 
> > [mailto:ying....@windriver.com<mailto:ying....@windriver.com>]
> > Sent: Monday, February 20, 2017 06:39 AM
> > To: Jon Maloy <jon.ma...@ericsson.com<mailto:jon.ma...@ericsson.com>>; 
> > Parthasarathy Bhuvaragan
> > <parthasarathy.bhuvara...@ericsson.com<mailto:parthasarathy.bhuvara...@ericsson.com>>;
> >  thompa....@gmail.com<mailto:thompa....@gmail.com>
> > Cc: 
> > tipc-discussion@lists.sourceforge.net<mailto:tipc-discussion@lists.sourceforge.net>
> > Subject: [net 0/5] solve two deadlock issues
> >
> > Commit d094c4d5f5 ("tipc: add subscription refcount to avoid invalid
> > delete") accidently introduce the following deadlock scenarios:
> >
> >    CPU1:                             CPU2:
> > ----------                     ----------------
> > tipc_nametbl_publish
> > spin_lock_bh(&tn->nametbl_lock)
> > tipc_nametbl_insert_publ
> > tipc_nameseq_insert_publ
> > tipc_subscrp_report_overlap
> > tipc_subscrp_get
> > tipc_subscrp_send_event
> >                              tipc_close_conn
> >                              tipc_subscrb_release_cb
> >                              tipc_subscrb_delete
> >                              tipc_subscrp_put
> > tipc_subscrp_put
> > tipc_subscrp_kref_release
> > tipc_nametbl_unsubscribe
> > spin_lock_bh(&tn->nametbl_lock)
> > <<grab nametbl_lock again>>
> >
> >    CPU1:                              CPU2:
> > ----------                     ----------------
> > tipc_nametbl_stop
> > spin_lock_bh(&tn->nametbl_lock)
> > tipc_purge_publications
> > tipc_nameseq_remove_publ
> > tipc_subscrp_report_overlap
> > tipc_subscrp_get
> > tipc_subscrp_send_event
> >                              tipc_close_conn
> >                              tipc_subscrb_release_cb
> >                              tipc_subscrb_delete
> >                              tipc_subscrp_put
> > tipc_subscrp_put
> > tipc_subscrp_kref_release
> > tipc_nametbl_unsubscribe
> > spin_lock_bh(&tn->nametbl_lock)
> > <<grab nametbl_lock again>>
> >
> > The root cause of two deadlocks is that we have to hold nametbl lock
> > when subscription is freed in tipc_subscrp_kref_release(). In order to
> > eliminate the need of taking nametbl lock in
> > tipc_subscrp_kref_release(), the functions protected by nametbl lock
> > in tipc_subscrp_kref_release() are moved to other places step by step in
> the series.
> >
> > Ying Xue (5):
> >   tipc: advance the time of deleting subscription from
> >     subscriber->subscrp_list
> >   tipc: adjust the policy of holding subscription kref
> >   tipc: adjust policy that sub->timer holds subscription kref
> >   tipc: advance the time of calling tipc_nametbl_unsubscribe
> >   tipc: remove unnecessary increasement of subscription refcount
> >
> >  net/tipc/name_table.c |  2 ++
> >  net/tipc/subscr.c     | 32 ++++++++++++++------------------
> >  net/tipc/subscr.h     |  3 +++
> >  3 files changed, 19 insertions(+), 18 deletions(-)
> >
> > --
> > 2.7.4
>
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most engaging
> tech sites, SlashDot.org! http://sdm.link/slashdot
> _______________________________________________
> tipc-discussion mailing list
> tipc-discussion@lists.sourceforge.net<mailto:tipc-discussion@lists.sourceforge.net>
> https://lists.sourceforge.net/lists/listinfo/tipc-discussion

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
_______________________________________________
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion

Reply via email to