I think moving restorePbe from executeWrapup to commit is the right way to 
solve this issue. That way, when the cluster is rebooted and smfd starts up 
again, the campaign will be in the init state and nothing will be 
executed.

Conceptually, it also makes sense not to write anything to PBE until 
commit time. A cluster reboot can then be used as a fallback.


---

** [tickets:#2648] smf: smfd crashes after cluster reboot when campaign is in 
ExecutionCompleted**

**Status:** accepted
**Milestone:** 5.17.10
**Created:** Thu Oct 19, 2017 06:45 PM UTC by Alex Jones
**Last Updated:** Thu Oct 19, 2017 06:45 PM UTC
**Owner:** Alex Jones


smfd crashes in updateImmAttr because it returns NO_RESOURCES. Here is how to 
reproduce:

1. enable PBE, and make sure the "disable" flag is set in OpenSafSmfConfig
2. execute an upgrade campaign, and let it go to "execution completed", but 
don't commit it
3. reboot the entire cluster
4. only allow 1 system controller to come up
5. smfd will attempt to re-execute the campaign
6. any writes to IMM (like setting an error because the campaign file can't be 
found) will fail with NO_RESOURCES and smfd will assert and crash

The assert and crash occur because smfd has not turned off PBE before the 
campaign is initialized.

