Hi,

I think there is no problem from CLM perspective. I have checked in both of the 
cases above, initialViewNumber are passed correctly at all stages and an 
application always distingiushes based on the passed initialveiwnumber.
So the fix is needed in AMF.
I will sent out a patch.

Thanks,
Praveen


---

** [tickets:#2372] amf/clm: CLM lock of two more nodes returns REPAIR_PENDING 
for first node.**

**Status:** accepted
**Milestone:** 5.0.2
**Created:** Tue Mar 14, 2017 09:29 AM UTC by Praveen
**Last Updated:** Thu Mar 16, 2017 07:08 AM UTC
**Owner:** Praveen
**Attachments:**

- 
[osafamfd](https://sourceforge.net/p/opensaf/tickets/2372/attachment/osafamfd) 
(3.4 MB; application/octet-stream)
- 
[osafclmd](https://sourceforge.net/p/opensaf/tickets/2372/attachment/osafclmd) 
(860.9 kB; application/octet-stream)


Steps to reproduce:
1) Bring 4 nodes cluster up.
2) Deploy AMf demo on PL-3 and PL-4.
3) LOCK amfd nodes PL-3 and PL-4.
4) Make arranegements so that termination of amf_demo on PL-3 takes  more time 
compare to PL-4.
5)From one terminal issue CLM lock of PL-3 first and in not time issue CLM lock 
of PL-4.

CLM and AMF traces are attached.  
Analysis:
When AMFD gets CLM track callback for PL-3 it starts terminating amf demo on 
PL-3. When termination of amf_demo still going on AMF gets another track 
callback with rootcausetentity as PL-4. However callback contains information 
of PL-3 also. AMFD starts terminating  amf_demo on PL-4 but at the same time it 
responds of PL-3 with invocation id of PL-4 callback. CLM assumes that PL-4 
change_started completed and sends completion callback for PL-4. In this 
callback, AMF clears internal flags which monitors the graceful removal of 
nodes. Since AMF never responded for PL-3 callback, callback timer expires in 
CLMD and it sends complete callback to AMF. AMF thinks this is the case of 
nodefailover and tries to failover PL-3.

Note: In all these stages, CLM sends track callback with information of all the 
nodes. AMF registers params are:
 
SA_TRACK_CURRENT|SA_TRACK_CHANGES_ONLY|SA_TRACK_VALIDATE_STEP|SA_TRACK_START_STEP.
  I am still evaluating whther issue is in CLM or AMF. Since AMF registers for 
**|SA_TRACK_CHANGES_ONLY|** should CLM give information of all the nodes in all 
subsequent callbacks?
 Also AMF should respond to callback when it has completed termination of comps.


---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets

Reply via email to