Hi Lennart, I have published v2 of the patch with following changes: 1)Added a cautionary Note in the header of saNtfFinalize(). Commit log has enough details, I think, about the problem and solution and it will get committed with the fix, so a separate README is not needed. 2)Added two test cases for saNtfFinalize().
Thanks, Praveen On 13-Oct-15 2:13 PM, Lennart Lund wrote: > Comments: > 1. > The explanation indicates that the problem is not very easy to understand. In > order to save future maintainers some work the explanation below can be added > to for example the README file. > In the saNtfFinalize() you can add a comment saying that the mutex must not > be locked and a reference to the README file can be given. > > 2. > Add test-case(s) in the UML test suite. Maybe the test code attached to the > ticket can be used. > > Thanks > Lennart > >> -----Original Message----- >> From: [email protected] [mailto:[email protected]] >> Sent: den 12 oktober 2015 13:23 >> To: Lennart Lund; Minh Chau H >> Cc: [email protected] >> Subject: [PATCH 1 of 1] ntfa: fix deadlock situation due to saNtfFinalize >> [#1521] >> >> osaf/libs/agents/saf/ntfa/ntfa_api.c | 4 ++-- >> 1 files changed, 2 insertions(+), 2 deletions(-) >> >> >> Generally any API called after finalize returns with BAD_HANDLE. >> But in a multi-threaded application, deadlock situation may arise between >> saNtfFinalize() >> with saNtfNotificationUnsubscribe(), >> or with saNtfStateChangeNotificationAllocate(), >> or with and saNtfPtrValAllocate(). >> >> After getting successful response from the server, saNtfFinalize() takes lock >> on ntfa_cb.cb_lock >> and tries to free all the handles and resources. Here it gets block in >> ntfa_hdl_rec_del() >> ---->ncshm_destroy_hdl() (actually in in ncshm_destroy_hdl()--- >>> hm_block_me())) as client_handle >> is in use by other threads and waits for it to get released. At the same >> time, >> one of the above three APIS may be waiting for taking the lock on >> ntfa_cb.cb_lock which is >> not avialable as it is taken by saNtfFinalize(). >> >> From the solution perspective, the LEAP API where santfFinalize() gets >> blocked wants other threads to >> complete thier functionality and release the handle. In this situation, new >> API >> calls >> will return with BAD_HANDLE (LEAP functionality take_handle() fails) and >> existing API calls which has >> already taken the client_handle will be able to give back client_handle (LEAP >> functionality >> give_handle()). So if the lock is removed from Finalize(), other APIS will be >> able to complete their >> call and will return handle one by one which eventually will unblock >> saNtfFinalize(). >> >> diff --git a/osaf/libs/agents/saf/ntfa/ntfa_api.c >> b/osaf/libs/agents/saf/ntfa/ntfa_api.c >> --- a/osaf/libs/agents/saf/ntfa/ntfa_api.c >> +++ b/osaf/libs/agents/saf/ntfa/ntfa_api.c >> @@ -1187,7 +1187,6 @@ SaAisErrorT saNtfFinalize(SaNtfHandleT n >> rc = SA_AIS_ERR_NO_RESOURCES; >> >> if (rc == SA_AIS_OK) { >> - >> osafassert(pthread_mutex_lock(&ntfa_cb.cb_lock) == 0); >> >> /** delete the hdl rec >> ** including all resources allocated by client if MDS send is >> ** succesful. >> @@ -1197,7 +1196,6 @@ SaAisErrorT saNtfFinalize(SaNtfHandleT n >> TRACE_1("ntfa_hdl_rec_del >> failed"); >> rc = SA_AIS_ERR_BAD_HANDLE; >> } >> - >> osafassert(pthread_mutex_unlock(&ntfa_cb.cb_lock) == 0); >> } >> >> done_give_hdl: >> @@ -2739,6 +2737,7 @@ SaAisErrorT saNtfNotificationUnsubscribe >> >> ntfHandle = ntfHandleGet(subscriptionId); >> if (ntfHandle == 0) { >> + TRACE_1("ntfHandleGet failed, subscription not >> exist"); >> rc = SA_AIS_ERR_NOT_EXIST; >> goto done; >> } >> @@ -2772,6 +2771,7 @@ SaAisErrorT saNtfNotificationUnsubscribe >> /* Send a sync MDS message to obtain a log >> stream id */ >> rc = ntfa_mds_msg_sync_send(&ntfa_cb, >> &msg, &o_msg, timeout); >> if (rc != NCSCC_RC_SUCCESS) { >> + >> TRACE_1("ntfa_mds_msg_sync_send failed with: %u",rc); >> rc = SA_AIS_ERR_TRY_AGAIN; >> goto done_give_hdl; >> } ------------------------------------------------------------------------------ _______________________________________________ Opensaf-devel mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/opensaf-devel
