Hi Ying,

Thanks for the patches. I introduced these bugs and was sick for more than 2 
weeks, so you had to fix them.


I have few comments for the entire series.

[tipc-discussion] [net 2/5] tipc: adjust the policy of holding subscription 
kref<https://sourceforge.net/p/tipc/mailman/message/35676908/><https://sourceforge.net/p/tipc/mailman/message/35676908/>

The above commit looks good and should fix the issue reported by John.


[tipc-discussion] [net 1/5] tipc: advance the time of deleting subscription 
from 
subscriber->subscrp_list<https://sourceforge.net/p/tipc/mailman/message/35676906/>

In the above commit, we miss to delete sub->subscrp_list at subscription 
timeout. We placed this common code in refcount cleanup to avoid code 
duplication.

[tipc-discussion] [net 3/5] tipc: adjust policy that sub->timer holds 
subscription 
kref<https://sourceforge.net/p/tipc/mailman/message/35676904/><https://sourceforge.net/p/tipc/mailman/message/35676904/>

With this patch, we will never trigger a subscription cleanup at 
tipc_subscrp_timeout() as the refcount will be one when we exit 
tipc_subscrp_timeout().

Thus the subscription is still present in nametable and hence the subscriber 
will receive new events for this subscription after receiving the subscription 
timeout until the subscriber terminates.

We need to update our design policy that at subscription timeout, we remove all 
references to the subscription and it is no longer accessible. This conflicts 
with the change proposed by the above commit.


If we still need to include this commit, we need to adapt 
tipc_subscrp_timeout(), as:

1. tipc_nametbl_unsubscribe()
2. tipc_subscrp_put()
3. tipc_subscrp_put()

static void tipc_subscrp_delete(struct tipc_subscription *sub)
 {
        u32 timeout = htohl(sub->evt.s.timeout, sub->swap);

-       if (timeout == TIPC_WAIT_FOREVER || del_timer(&sub->timer))
+       if (timeout != TIPC_WAIT_FOREVER && del_timer(&sub->timer)) {
                tipc_subscrp_put(sub);
+               tipc_subscrp_put(sub);
        } else {
+               tipc_subscrp_put(sub);
        }
 }

But this makes the functions look odd as we adjust refcount twice in the same 
function. To get the subscription timeout cleanup working correctly, we need to 
sqash this commit together :[tipc-discussion] [net 4/5] tipc: advance the time 
of calling 
tipc_nametbl_unsubscribe<https://sourceforge.net/p/tipc/mailman/message/35676905/>

[tipc-discussion] [net 5/5] tipc: remove unnecessary increasement of 
subscription refcount<https://sourceforge.net/p/tipc/mailman/message/35676917/>

In this commit message, you need to edit the stacktraces as the scenario you 
describe will not happen due to your previous patches. As we never perform a 
tipc_subscrp_get() in tipc_subscrp_report_overlap().

/Partha
________________________________
From: John Thompson <thompa....@gmail.com>
Sent: Tuesday, February 21, 2017 10:15 PM
To: Ying Xue
Cc: Jon Maloy; Parthasarathy Bhuvaragan; tipc-discussion@lists.sourceforge.net
Subject: Re: [net 5/5] tipc: remove unnecessary increasement of subscription 
refcount

Sorry, I was mistaken.  You are right that you have removed the sub->lock from
tipc_subscrp_kref_release() and so there is no chance of a deadlock.

JT

On Tue, Feb 21, 2017 at 11:22 PM, Ying Xue 
<ying....@windriver.com<mailto:ying....@windriver.com>> wrote:
On 02/21/2017 06:13 AM, John Thompson wrote:
>
>
> On Tue, Feb 21, 2017 at 12:39 AM, Ying Xue 
> <ying....@windriver.com<mailto:ying....@windriver.com>
> <mailto:ying....@windriver.com<mailto:ying....@windriver.com>>> wrote:
>
>     As the policy of holding subscription in subscriber time has been
>     adjusted, it's unnecessary to hold subscription refcount before
>     tipc_subscrp_delete() is called. As a consequence, subscriber->lock
>     can be safely removed to avoid the following two deadlock scenarios:
>
>        CPU1:                             CPU2:
>     ----------                     ----------------
>     tipc_nametbl_publish
>     spin_lock_bh(&tn->nametbl_lock)
>     tipc_nametbl_insert_publ
>     tipc_nameseq_insert_publ
>     tipc_subscrp_report_overlap
>     tipc_subscrp_get
>     tipc_subscrp_send_event
>                                      tipc_close_conn
>                                      tipc_subscrb_release_cb
>                                      tipc_subscrb_delete
>                                      tipc_subscrp_put
>     tipc_subscrp_put
>     tipc_subscrp_kref_release
>     tipc_nametbl_unsubscribe
>     spin_lock_bh(&tn->nametbl_lock)
>     <<grab nametbl_lock again>>
>
>        CPU1:                              CPU2:
>     ----------                     ----------------
>     tipc_nametbl_stop
>     spin_lock_bh(&tn->nametbl_lock)
>     tipc_purge_publications
>     tipc_nameseq_remove_publ
>     tipc_subscrp_report_overlap
>     tipc_subscrp_get
>     tipc_subscrp_send_event
>                                      tipc_close_conn
>                                      tipc_subscrb_release_cb
>                                      tipc_subscrb_delete
>                                      tipc_subscrp_put
>     tipc_subscrp_put
>     tipc_subscrp_kref_release
>     tipc_nametbl_unsubscribe
>     spin_lock_bh(&tn->nametbl_lock)
>     <<grab nametbl_lock again>>
>
>     Reported-by: John Thompson 
> <thompa....@gmail.com<mailto:thompa....@gmail.com>
>     <mailto:thompa....@gmail.com<mailto:thompa....@gmail.com>>>
>     Reported-by: Jon Maloy 
> <jon.ma...@ericsson.com<mailto:jon.ma...@ericsson.com>
>     <mailto:jon.ma...@ericsson.com<mailto:jon.ma...@ericsson.com>>>
>     Fixes: d094c4d5f5 ("tipc: add subscription refcount to avoid invalid
>     delete")
>     Signed-off-by: Ying Xue 
> <ying....@windriver.com<mailto:ying....@windriver.com>
>     <mailto:ying....@windriver.com<mailto:ying....@windriver.com>>>
>     ---
>      net/tipc/subscr.c | 6 ------
>      1 file changed, 6 deletions(-)
>
>     diff --git a/net/tipc/subscr.c b/net/tipc/subscr.c
>     index 6624232..ea67cce 100644
>     --- a/net/tipc/subscr.c
>     +++ b/net/tipc/subscr.c
>     @@ -168,9 +168,7 @@ static void tipc_subscrp_kref_release(struct
>     kref *kref)
>             struct tipc_net *tn = net_generic(sub->net, tipc_net_id);
>             struct tipc_subscriber *subscriber = sub->subscriber;
>
>     -       spin_lock_bh(&subscriber->lock);
>             atomic_dec(&tn->subscription_count);
>     -       spin_unlock_bh(&subscriber->lock);
>             kfree(sub);
>             tipc_subscrb_put(subscriber);
>      }
>     @@ -202,11 +200,7 @@ static void tipc_subscrb_subscrp_delete(struct
>     tipc_subscriber *subscriber,
>                     list_del(&sub->subscrp_list);
>
>                     tipc_nametbl_unsubscribe(sub);
>     -               tipc_subscrp_get(sub);
>     -               spin_unlock_bh(&subscriber->lock);
>
>
> By removing the unlocking at this point couldn't you end up with a
> deadlock on subscriber->lock
> due to tipc_subscrp_delete() putting the subscrp twice (which could end
> up with kref == 0) and
> as a result calling tipc_subscrp_kref_release() which gets subscriber->lock?
>

I think there is no deadlock risk even if tipc_subscrp_put() will be
called twice in tipc_subscrp_delete() because
tipc_subscrp_kref_release() doesn't hold subscriber->lock after the change.

Please take a look at tipc_subscrp_kref_release() code:
static void tipc_subscrp_kref_release(struct kref *kref)
{
        struct tipc_subscription *sub = container_of(kref,
                                                     struct
tipc_subscription,
                                                     kref);
        struct tipc_net *tn = net_generic(sub->net, tipc_net_id);
        struct tipc_subscriber *subscriber = sub->subscriber;

        atomic_dec(&tn->subscription_count);
        kfree(sub);
        tipc_subscrb_put(subscriber);
}

>
>
>                     tipc_subscrp_delete(sub);
>     -               tipc_subscrp_put(sub);
>     -               spin_lock_bh(&subscriber->lock);
>
>                     if (s)
>                             break;
>     --
>     2.7.4
>
>

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
_______________________________________________
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion

Reply via email to