Hi,
I think there is no problem from the CLM perspective. I have checked that in both of the
cases above, initialViewNumber is passed correctly at all stages, and an
application can always distinguish based on the passed initialViewNumber.
So the fix is needed in AMF.
I will send out a patch.
Thanks,
Praveen
---
** [tickets:#2372] amf/clm: CLM lock of two more nodes returns REPAIR_PENDING
for first node.**
**Status:** accepted
**Milestone:** 5.0.2
**Created:** Tue Mar 14, 2017 09:29 AM UTC by Praveen
**Last Updated:** Thu Mar 16, 2017 07:08 AM UTC
**Owner:** Praveen
**Attachments:**
- [osafamfd](https://sourceforge.net/p/opensaf/tickets/2372/attachment/osafamfd) (3.4 MB; application/octet-stream)
- [osafclmd](https://sourceforge.net/p/opensaf/tickets/2372/attachment/osafclmd) (860.9 kB; application/octet-stream)
Steps to reproduce:
1) Bring up a 4-node cluster.
2) Deploy the AMF demo on PL-3 and PL-4.
3) LOCK amfd nodes PL-3 and PL-4.
4) Arrange for termination of amf_demo on PL-3 to take more time
than on PL-4.
5) From one terminal, issue a CLM lock of PL-3 first and, immediately afterwards, a CLM lock
of PL-4.
CLM and AMF traces are attached.
Analysis:
When AMFD gets the CLM track callback for PL-3, it starts terminating amf_demo on
PL-3. While termination of amf_demo is still going on, AMF gets another track
callback with rootCauseEntity set to PL-4. However, this callback also contains
information about PL-3. AMFD starts terminating amf_demo on PL-4, but at the same time it
responds for PL-3 with the invocation id of the PL-4 callback. CLM assumes that the PL-4
change_started step has completed and sends the completion callback for PL-4. In this
callback, AMF clears the internal flags that monitor the graceful removal of
nodes. Since AMF never responded to the PL-3 callback, the callback timer expires in
CLMD and it sends the completion callback to AMF. AMF treats this as a
node failover and tries to fail over PL-3.
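To illustrate the mix-up above, here is a minimal sketch of one possible guard: when a callback carries entries for several nodes, act only on the entry named as the root cause. This is not the actual AMFD code or the eventual patch. `struct node_entry` and `find_root_cause` are hypothetical names, and the struct is a simplified stand-in for the real CLM notification types.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one entry of a CLM notification
 * buffer; the real saClm.h types carry more fields. */
struct node_entry {
    const char *name;   /* node name, e.g. "PL-3" */
};

/* Return the entry whose name matches the root cause entity of the
 * callback, or NULL if the buffer does not contain it. */
static const struct node_entry *
find_root_cause(const struct node_entry *buf, size_t n, const char *root_cause)
{
    assert(root_cause != NULL);
    assert(buf != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(buf[i].name, root_cause) == 0)
            return &buf[i];
    }
    return NULL;
}
```

With a buffer holding both PL-3 and PL-4 and a root cause of PL-4, only the PL-4 entry would be processed, so the PL-3 removal already in progress would not be answered again.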
Note: In all these stages, CLM sends the track callback with information about all the
nodes. The track flags AMF registers with are:
SA_TRACK_CURRENT|SA_TRACK_CHANGES_ONLY|SA_TRACK_VALIDATE_STEP|SA_TRACK_START_STEP.
I am still evaluating whether the issue is in CLM or AMF. Since AMF registers for
**|SA_TRACK_CHANGES_ONLY|**, should CLM give information about all the nodes in all
subsequent callbacks?
Also, AMF should respond to a callback once it has completed termination of the components.
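A rough sketch of what "respond once termination has completed" could look like: remember, per node, the invocation id of the callback that started its removal, and answer with that stored id only when termination on that node finishes. This is an illustration under assumptions, not the AMFD implementation. `struct pending_node`, `pending_add`, `pending_complete` and the fixed-size table are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_PENDING 8

/* Hypothetical record: one entry per node being gracefully removed. */
struct pending_node {
    char name[32];        /* node name, e.g. "PL-3" */
    uint64_t invocation;  /* id of the callback that started the removal */
    bool in_use;
};

static struct pending_node pending[MAX_PENDING];

/* Remember which callback started the removal of `name`. A node that is
 * already pending keeps its original invocation id (returns false). */
static bool pending_add(const char *name, uint64_t invocation)
{
    int free_slot = -1;
    assert(name != NULL);
    for (int i = 0; i < MAX_PENDING; i++) {
        if (pending[i].in_use && strcmp(pending[i].name, name) == 0)
            return false;               /* already tracked */
        if (!pending[i].in_use && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return false;                   /* table full */
    snprintf(pending[free_slot].name, sizeof(pending[free_slot].name),
             "%s", name);
    pending[free_slot].invocation = invocation;
    pending[free_slot].in_use = true;
    return true;
}

/* Called when termination on `name` has finished. On success, stores the
 * invocation id to respond with and drops the record. */
static bool pending_complete(const char *name, uint64_t *invocation)
{
    assert(name != NULL && invocation != NULL);
    for (int i = 0; i < MAX_PENDING; i++) {
        if (pending[i].in_use && strcmp(pending[i].name, name) == 0) {
            *invocation = pending[i].invocation;
            pending[i].in_use = false;
            return true;
        }
    }
    return false;
}
```

In the scenario from this ticket, PL-3 would be added with the invocation id of its own callback. A later callback for PL-4 that also lists PL-3 would not overwrite it, so the response for PL-3 would go out with the PL-3 invocation id once amf_demo has terminated there.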