commit e08db1131c686779e418fe1514deaecf666bf776
Author: thuan.tran <[email protected]>
Date: Fri Sep 28 09:29:29 2018 +0000

smf: campaign is executing forever until cluster reset [#1353]

The function getNodeDestination() resets elapsedTime to zero, so the
node reboot timeout in waitForNodeDestination() is never reached.
If a rebooted node never comes back, the campaign stays stuck in the
executing state until the cluster is reset.
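
The failure mode described above can be illustrated with a minimal,
simulated sketch. It is not the OpenSAF code: lookupOnce(), waitForNode(),
WaitResult and all parameters are hypothetical names, time is simulated with
integer counters instead of real sleeps, and maxAttempts exists only so the
buggy variant terminates in a demonstration.

```cpp
#include <functional>

// Hypothetical result type, for illustration only.
struct WaitResult {
  bool found;     // the node destination was obtained
  bool timedOut;  // the overall timeout was reached
  int attempts;   // number of lookup attempts made
};

// Hypothetical single lookup attempt. It adds the time it consumed to
// *elapsedMs. With resetElapsed == true it first clears the caller's
// counter, which models the kind of reset the commit message describes.
bool lookupOnce(const std::function<bool()>& nodeIsUp, int* elapsedMs,
                int attemptCostMs, bool resetElapsed) {
  if (resetElapsed) {
    *elapsedMs = 0;  // time accumulated by the caller is forgotten here
  }
  *elapsedMs += attemptCostMs;
  return nodeIsUp();
}

// Hypothetical outer wait loop with an overall timeout.
WaitResult waitForNode(const std::function<bool()>& nodeIsUp, int timeoutMs,
                       int attemptCostMs, int sleepMs, bool resetElapsed,
                       int maxAttempts) {
  int elapsedMs = 0;
  int attempts = 0;
  while (attempts < maxAttempts) {
    ++attempts;
    if (lookupOnce(nodeIsUp, &elapsedMs, attemptCostMs, resetElapsed)) {
      return {true, false, attempts};
    }
    if (elapsedMs >= timeoutMs) {
      return {false, true, attempts};  // timeout honoured
    }
    elapsedMs += sleepMs;  // simulated sleep between attempts
  }
  // Safety cap hit: without it, the loop would keep running.
  return {false, false, attempts};
}
```

With a node that never comes back, a 100 ms timeout, and 10 ms each for the
attempt and the sleep, the variant that keeps the counter times out on the
sixth attempt, while the variant that resets it never sees elapsedMs exceed
10 ms and only stops at the artificial maxAttempts cap.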


---

**[tickets:#1353] smf: step undoing is in progress forever until cluster reset**

**Status:** review
**Milestone:** 5.18.09
**Created:** Tue Apr 28, 2015 01:33 PM UTC by Neelakanta Reddy
**Last Updated:** Tue Sep 25, 2018 07:06 AM UTC
**Owner:** Thuan
**Attachments:**

- [1353.tgz](https://sourceforge.net/p/opensaf/tickets/1353/attachment/1353.tgz) (475.2 kB; application/octet-stream)
- [messages_step_undo](https://sourceforge.net/p/opensaf/tickets/1353/attachment/messages_step_undo) (111.1 kB; application/octet-stream)


Test description:
1. A rolling middleware upgrade (4.5 -> 4.6) campaign is run.
2. On one of the upgraded nodes (PL-4), the new RPMs (4.6) are kept empty, so
the node comes up without an OpenSAF installation.
3. The step rollback takes approximately two hours before the campaign is
reported as EXECUTION_FAILED.
4. The syslog of SC-1 is attached below.

Apr 24 18:36:55 SLES1 osafamfd[2289]: NO Node 'PL-4' left the cluster
Apr 24 18:36:55 SLES1 osafimmnd[2237]: NO Implementer connected: 47 (MsgQueueService132111) <2280, 2010f>
Apr 24 18:36:55 SLES1 osafimmnd[2237]: NO Implementer locally disconnected. Marking it as doomed 47 <2280, 2010f> (MsgQueueService132111)
Apr 24 18:36:55 SLES1 osafimmnd[2237]: NO Implementer disconnected 47 <2280, 2010f> (MsgQueueService132111)
Apr 24 18:36:58 SLES1 kernel: [  172.812065] TIPC: Resetting link <1.1.1:eth0-1.1.4:eth0>, peer not responding
Apr 24 18:36:58 SLES1 kernel: [  172.812071] TIPC: Lost link <1.1.1:eth0-1.1.4:eth0> on network plane A
Apr 24 18:36:58 SLES1 kernel: [  172.812075] TIPC: Lost contact with <1.1.4>
Apr 24 18:37:15 SLES1 osafsmfd[2318]: NO Failed to get node dest for clm node safNode=PL-4,safCluster=myClmCluster
Apr 24 18:37:36 SLES1 osafsmfd[2318]: NO Failed to get node dest for clm node safNode=PL-4,safCluster=myClmCluster

-------------------
--------------
----------------------

Apr 24 20:36:00 SLES1 osafsmfd[2318]: NO Failed to get node dest for clm node safNode=PL-4,safCluster=myClmCluster
Apr 24 20:36:22 SLES1 osafsmfd[2318]: NO Failed to get node dest for clm node safNode=PL-4,safCluster=myClmCluster
Apr 24 20:36:44 SLES1 osafsmfd[2318]: NO Failed to get node dest for clm node safNode=PL-4,safCluster=myClmCluster
Apr 24 20:37:06 SLES1 osafsmfd[2318]: NO Failed to get node dest for clm node safNode=PL-4,safCluster=myClmCluster
Apr 24 20:37:28 SLES1 osafsmfd[2318]: NO Failed to get node dest for clm node safNode=PL-4,safCluster=myClmCluster
Apr 24 20:37:28 SLES1 osafsmfd[2318]: NO no node destination found whitin time limit for node safAmfNode=PL-4,safAmfCluster=myAmfCluster
Apr 24 20:37:28 SLES1 osafsmfd[2318]: NO no node destination found for node safAmfNode=PL-4,safAmfCluster=myAmfCluster
Apr 24 20:37:28 SLES1 osafsmfd[2318]: ER Failed to online install old bundles
Apr 24 20:37:28 SLES1 osafsmfd[2318]: ER Step undoing failed
Apr 24 20:37:28 SLES1 osafsmfd[2318]: NO Step safSmfStep=0004 in procedure safSmfProc=OpenSAF-upgrade failed, step result 5
Apr 24 20:37:28 SLES1 osafsmfd[2318]: NO CAMP: Procedure safSmfProc=OpenSAF-upgrade returned FAILED




