- **assigned_to**: Hans Feldt -->  nobody 


---

**[tickets:#368] standby went for reboot when dynamic addition of components during cold sync**

**Status:** unassigned
**Milestone:** future
**Created:** Fri May 31, 2013 03:40 AM UTC by Nagendra Kumar
**Last Updated:** Wed Sep 03, 2014 09:29 AM UTC
**Owner:** nobody

Migrated from http://devel.opensaf.org/ticket/2135

AMF components were added dynamically using immcfg -f Appconfig.xml during cold
sync.
The standby rebooted with the following error in syslog:


Sep 28 15:26:54 SLES11-SLOT-2 osafimmnd[7464]: Implementer (applier) connected: 12 (@safAmfService2020f) <7, 2020f>
Sep 28 15:26:55 SLES11-SLOT-2 osafamfnd[7531]: Started
Sep 28 15:26:55 SLES11-SLOT-2 opensafd: OpenSAF services successfully started
Sep 28 15:26:55 SLES11-SLOT-2 osafamfd[7521]: avd_app_get FAILED for 'safApp=pinv-demo'
Sep 28 15:26:55 SLES11-SLOT-2 osafamfnd[7531]: Rebooting OpenSAF NodeId = 131599 EE Name = , Reason: AMF director unexpectedly crasched


Corresponding error in the active controller's syslog:


Sep 28 15:24:31 SLES11-SLOT-1 osafamfd[13771]: mbcsv cold sync rsp term


Changed 20 months ago by srikanth
- attachment traces.tgz added
Traces of amfnd,amfd on active controller and amfd on standby


Changed 20 months ago by hafe
- owner changed from ravisekhar to hafe
- status changed from new to accepted

Changed 20 months ago by hafe
- patch_waiting changed from no to yes

Changed 20 months ago by hafe
- status changed from accepted to closed
- version changed from 4.2.0 to 4.0.2
- resolution set to fixed
- milestone changed from 4.2.0.GA to 4.0.3.GA

changeset: 2933:2356eb5c2680
branch: opensaf-4.0.x
parent: 2930:821e4def94c9
user: Hans Feldt <hans.feldt@…>
date: Tue Oct 11 11:48:29 2011 +0200
summary: avsv/avd: reject config changes when syncing (#2135)


changeset: 2934:29f4bc5bb7b8
branch: opensaf-4.1.x
parent: 2931:c18a63ea5bc7
user: Hans Feldt <hans.feldt@…>
date: Tue Oct 11 11:48:29 2011 +0200
summary: avsv/avd: reject config changes when syncing (#2135)


changeset: 2935:402094237a99
tag: tip
parent: 2932:5402847900e9
user: Hans Feldt <hans.feldt@…>
date: Tue Oct 11 11:48:29 2011 +0200
summary: avsv/avd: reject config changes when syncing (#2135)


remote: rev 2356eb5c26809171df77651d330aaae7117dc4a4 sent
remote: rev 29f4bc5bb7b8d8cbd616f18b5c423a30df8e8dcb sent
remote: rev 402094237a9965c3b18532ec77073b34e610e993 sent


Changed 18 months ago by hafe
- status changed from closed to reopened
- resolution fixed deleted
- milestone changed from 4.0.3.GA to 4.2.1

changeset: 3078:a35a81d7d2e4
branch: opensaf-4.0.x
parent: 3075:601b34676798
user: Hans Feldt <hans.feldt@…>
date: Fri Nov 25 11:08:05 2011 +0100
summary: avsv/avd: reverted changeset 2933:2356eb5c2680


changeset: 3079:dc20e7d57852
branch: opensaf-4.1.x
parent: 3076:da08d2acdfc6
user: Hans Feldt <hans.feldt@…>
date: Fri Nov 25 11:08:28 2011 +0100
summary: avsv/avd: reverted changeset 2934:29f4bc5bb7b8


changeset: 3080:b4593b6c4b8a
tag: tip
parent: 3077:89b9c4f1fda9
user: Hans Feldt <hans.feldt@…>
date: Fri Nov 25 11:09:27 2011 +0100
summary: avsv/avd: reverted changeset 2935:402094237a99


Changed 15 months ago by hafe
- patch_waiting changed from yes to no

Changed 14 months ago by ehsjoar
- milestone changed from 4.2.1 to future_releases

TLC meeting 2012
Moved all major/minor non-accepted to future release.
The new process is for developers to pull from this pile.


Changed 4 months ago by hafe
- status changed from reopened to accepted

The problem never got a solution and has now happened again.


The active amfd checkpoints a new app in app_add_to_model() because of the
runtime attributes saAmfApplicationAdminState & saAmfApplicationCurrNumSGs.
This is largely unnecessary because the standby will calculate the value of
saAmfApplicationCurrNumSGs itself (but not saAmfApplicationAdminState, which is
currently not used).


Probably the same kind of solution introduced in #2337 can solve the problem.


The same problem hypothetically exists for SG, SU, SI and comp, since they use
checkpointing in their respective "add_to_model" functions, so the same race
can occur.
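
To illustrate why checkpointing saAmfApplicationCurrNumSGs is redundant, here is a minimal Python sketch (not amfd code) of a standby deriving the per-application SG count from the SG objects it already has in its model. The function names and the SG DNs are hypothetical, and the DN handling is simplified (it assumes no escaped commas in the first RDN).

```python
from collections import Counter


def parent_dn(dn: str) -> str:
    """Return the parent DN, e.g. 'safSg=a,safApp=x' -> 'safApp=x'.

    Simplification: assumes the first RDN contains no escaped commas.
    """
    _, _, parent = dn.partition(",")
    return parent


def curr_num_sgs(sg_dns):
    """Count SGs per parent application from the SG DNs in the local model."""
    return Counter(parent_dn(dn) for dn in sg_dns)


# Hypothetical SG objects as the standby would have read them from the model.
sgs = [
    "safSg=sg1,safApp=pinv-demo",
    "safSg=sg2,safApp=pinv-demo",
    "safSg=sg1,safApp=other",
]
counts = curr_num_sgs(sgs)
```

In this sketch counts["safApp=pinv-demo"] is derived purely from local data, so nothing about the count needs to be sent from the active side.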


Changed 4 months ago by hafe
  Replying to hafe:



Correction, I believe the problem is rather a race on the standby amfd side. It 
reads the model, then it sets itself as applier. Then it receives a cold sync 
for an instance (app) that does not exist. That instance was created with a CCB 
that must have sneaked in between the initial read of the model and the 
registration as applier.


Having amfd set itself as applier before reading the model should reduce the
window, but probably not down to zero. Needs more investigation.
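
The ordering problem described above can be sketched with a toy, single-threaded Python model. This is not OpenSAF code: ModelStore, Standby and the method names are hypothetical stand-ins for IMM and the standby amfd, and the "between" hook simply injects a CCB at the point where the race would occur. Because the toy is deterministic, it does not model the remaining window mentioned above.

```python
class ModelStore:
    """Toy stand-in for the configuration store (IMM in this ticket)."""

    def __init__(self):
        self.objects = set()
        self.appliers = []

    def apply_ccb(self, dn):
        # Commit a new object and notify every registered applier.
        self.objects.add(dn)
        for callback in list(self.appliers):
            callback(dn)

    def read_all(self):
        return set(self.objects)

    def register_applier(self, callback):
        self.appliers.append(callback)


class Standby:
    """Toy stand-in for the standby amfd."""

    def __init__(self, store):
        self.store = store
        self.model = set()

    def _on_ccb(self, dn):
        self.model.add(dn)

    def start_read_then_register(self, between=lambda: None):
        self.model |= self.store.read_all()
        between()  # a CCB committed here is never seen by the standby
        self.store.register_applier(self._on_ccb)

    def start_register_then_read(self, between=lambda: None):
        self.store.register_applier(self._on_ccb)
        between()  # a CCB committed here arrives through the applier callback
        self.model |= self.store.read_all()

    def knows(self, dn):
        # A cold sync record for an unknown instance is what failed in the log.
        return dn in self.model


APP = "safApp=pinv-demo"

store_a = ModelStore()
racy = Standby(store_a)
racy.start_read_then_register(between=lambda: store_a.apply_ccb(APP))

store_b = ModelStore()
safer = Standby(store_b)
safer.start_register_then_read(between=lambda: store_b.apply_ccb(APP))
```

With the read-then-register order the standby ends up without the app, which corresponds to the avd_app_get failure in the syslog; with the register-then-read order the app is present in this toy.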


Changed 4 months ago by hafe
- milestone changed from future_releases to 4.2.3

Changed 4 months ago by hafe
It seems that responding with FAILURE in the cold sync callbacks results in a
new cold sync being started. With that change, plus moving the applier
registration, the problem can no longer be reproduced. Will float a patch.
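
The retry behaviour described here can be sketched as follows. This is a hedged Python illustration, not the MBCSV API: cold_sync_callback, run_cold_sync, the OK/FAILURE constants and the refresh_model hook (standing in for the applier delivering the missing object between attempts) are all hypothetical.

```python
OK, FAILURE = "OK", "FAILURE"


def cold_sync_callback(model, records):
    """Refuse the sync if it names an instance the standby does not know."""
    for dn in records:
        if dn not in model:
            return FAILURE
    return OK


def run_cold_sync(model, fetch_records, refresh_model, max_attempts=3):
    """Restart the cold sync after each FAILURE instead of giving up.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        if cold_sync_callback(model, fetch_records()) == OK:
            return attempt
        # Hypothetical: the standby's model catches up before the next try.
        refresh_model(model)
    raise RuntimeError("cold sync did not converge")


active_objects = {"safApp=pinv-demo"}
standby_model = set()  # the standby missed the app, as in the race above

attempts = run_cold_sync(
    standby_model,
    fetch_records=lambda: sorted(active_objects),
    refresh_model=lambda m: m.update(active_objects),
)
```

The design point is that an unknown instance becomes a recoverable condition (a repeated cold sync) rather than a fatal one that reboots the node.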


Changed 4 months ago by hafe
- patch_waiting changed from no to yes

Changed 3 months ago by hafe
http://list.opensaf.org/pipermail/devel/2013-February/029132.html
Please review.


Changed 3 months ago by hafe
- patch_waiting changed from yes to no
- milestone changed from 4.2.3 to future_releases



