[tickets] [opensaf:tickets] #1442 log: unable to create new cfg/log files if openning files are corrupted
- **status**: accepted --> duplicate
- **Comment**:

This was fixed in #2215

---

** [tickets:#1442] log: unable to create new cfg/log files if openning files are corrupted**

**Status:** duplicate
**Milestone:** 5.0.2
**Created:** Tue Aug 11, 2015 07:20 AM UTC by Vu Minh Nguyen
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** Canh Truong

When something goes wrong with the cfg/log files that are currently open (e.g. the files on disk are deleted or moved), any action that leads to creating new cfg/log files fails. logsv sees that it failed to rename the files (appending the close time to the file names) and then skips creating the new ones.

---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/opensaf/admin/tickets/options. Or, if this is a mailing list, you can unsubscribe from the mailing list.

Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot

Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets
[tickets] [opensaf:tickets] #2383 log: Both active and standby sites own the same log file name but different fd in share file system
- **status**: accepted --> review

---

** [tickets:#2383] log: Both active and standby sites own the same log file name but different fd in share file system**

**Status:** review
**Milestone:** 5.0.2
**Created:** Thu Mar 16, 2017 11:58 AM UTC by Canh Truong
**Last Updated:** Fri Mar 17, 2017 07:09 AM UTC
**Owner:** Canh Truong
**Attachments:**

- [osaflogd](https://sourceforge.net/p/opensaf/tickets/2383/attachment/osaflogd) (59.9 kB; application/octet-stream)

In a shared file system configuration, both the active and the standby side can own the same log file name, each with a different fd.

Steps to reproduce:
1/ Create a cfg app stream.
2/ Write 100 log records to this stream using saflogger.
3/ Switch over while the log records are being written.
4/ Delete the cfg app stream. (Both the active side and the standby side close the same log file, but through different fds.)

The issue does not always happen; steps 2 and 3 may have to be repeated in a loop a few times.

It happens because, during a switchover, the active side switches to quiesced and the standby does not immediately switch to active. In this short window, the log server in quiesced state may still receive requests from the log agent if those requests were already in the MDS queue. If the request is an open stream or an async write, the log file is opened, and it is no longer closed by the switchover processing when that side becomes standby. The other side, which switches from standby to active, may also open the same log file name. As a result, when the cfg app stream is deleted, closing and renaming the log file happens on both the active and the standby side. Only the rename on the active side succeeds; the one on the standby side fails.

Information from the MDS PR document:

"V_DEST_RL_QUIESCED
When a VDEST-instance’s HA state is set to V_DEST_RL_QUIESCED:
1. Messages sent to the VDEST start getting buffered within the sender’s MDS layer. (See Table 5)
2. For a short period (MDS’s quiesced acknowledgment time), messages sent to the VDEST under normal view are allowed to reach their destined MDS Address. After that, the VDEST is automatically moved to "standby" HA state; messages received under normal VDEST view are from now on discarded. "
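The end state described in this ticket can be reproduced outside OpenSAF with plain POSIX file calls. The following is a minimal sketch, not logsv code: it assumes a POSIX system, uses a temporary directory as a stand-in for the shared file system, and the file name, the timestamp suffix and the helper `close_and_rename` are hypothetical. Two independent opens of the same file name yield two distinct fds, and only the first close-and-rename succeeds.

```python
import os
import tempfile


def close_and_rename(fd, path, suffix):
    """Close fd, then rename path to path + suffix. Return True on success."""
    os.close(fd)
    try:
        os.rename(path, path + suffix)
        return True
    except FileNotFoundError:
        # The other side already renamed the file away.
        return False


def simulate():
    """Model two log servers that both hold the same log file open."""
    with tempfile.TemporaryDirectory() as shared_dir:
        path = os.path.join(shared_dir, "app_stream.log")  # hypothetical name
        flags = os.O_CREAT | os.O_WRONLY | os.O_APPEND

        # The side that became active opens the log file ...
        active_fd = os.open(path, flags)
        # ... and the side that handled a queued request while quiesced
        # (and is now standby) opened the same name too.
        standby_fd = os.open(path, flags)
        distinct_fds = active_fd != standby_fd

        # Deleting the stream makes both sides close and rename the file.
        suffix = "_20170316_115800"  # hypothetical close-time suffix
        active_ok = close_and_rename(active_fd, path, suffix)
        standby_ok = close_and_rename(standby_fd, path, suffix)
        return distinct_fds, active_ok, standby_ok
```

Calling `simulate()` returns a tuple of three booleans: whether the two fds differ, whether the first rename succeeded, and whether the second rename succeeded.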
[tickets] [opensaf:tickets] #2387 clm_locked spare controller got standby role after failover
---

** [tickets:#2387] clm_locked spare controller got standby role after failover**

**Status:** unassigned
**Milestone:** 5.2.RC2
**Created:** Fri Mar 17, 2017 12:13 PM UTC by Ritu Raj
**Last Updated:** Fri Mar 17, 2017 12:13 PM UTC
**Owner:** nobody
**Attachments:**

- [SC-1.tar.bz2](https://sourceforge.net/p/opensaf/tickets/2387/attachment/SC-1.tar.bz2) (873.4 kB; application/x-bzip)
- [SC-2.tar.bz2](https://sourceforge.net/p/opensaf/tickets/2387/attachment/SC-2.tar.bz2) (762.0 kB; application/x-bzip)
- [SC-3.tar.bz2](https://sourceforge.net/p/opensaf/tickets/2387/attachment/SC-3.tar.bz2) (724.5 kB; application/x-bzip)

###Environment details
OS : Suse 64bit
Changeset : 8701 ( 5.2.RC1)
6 nodes setup (3 controllers and 3 payloads, with SC_ABSENCE enabled)

###Summary
clm_locked spare controller got standby role after failover

###Steps followed & Observed behaviour
1. Initially the roles were SC-1 (ACTIVE), SC-2 (QUIESCED), SC-3 (STANDBY).
2. Performed a clm_lock operation on the SC-2 (QUIESCED) controller.
3. After that, performed a failover on the active controller (SC-1) by killing one director.
4. Observed that SC-3 got the Active role while SC-2 got the Standby role, which is not expected because node SC-2 is in clm_locked state.
5. Later, SC-1 joined as a QUIESCED controller (after recovery from the failover).

**Expected**: A clm_locked node should not get the standby role, as it is in locked state, and SC-1 should join as Standby after recovery from the failover.
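The expected outcome can be written down as a small eligibility check. This is only an illustration of the behaviour the reporter expects, not how OpenSAF selects roles; the `Controller` type and `pick_standby` function are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    name: str
    up: bool          # node is running and reachable
    clm_locked: bool  # node has been locked with clm_lock


def pick_standby(controllers, active_name):
    """Return the first controller eligible for STANDBY, or None."""
    for c in controllers:
        if c.name != active_name and c.up and not c.clm_locked:
            return c.name
    return None


# Right after the failover: SC-1 is down, SC-2 is locked, SC-3 is active.
after_failover = [
    Controller("SC-1", up=False, clm_locked=False),
    Controller("SC-2", up=True, clm_locked=True),
    Controller("SC-3", up=True, clm_locked=False),
]

# After SC-1 has recovered from the failover.
after_recovery = [
    Controller("SC-1", up=True, clm_locked=False),
    Controller("SC-2", up=True, clm_locked=True),
    Controller("SC-3", up=True, clm_locked=False),
]
```

Under this model, no controller is eligible for standby while SC-1 is down, and SC-1 becomes the standby once it rejoins; the locked SC-2 is never selected.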
Syslog:
Mar 17 17:56:59 suseR2-S2 osafimmnd[21809]: NO Implementer (applier) connected: 28 (@safSmf_applier1) <0, 2030f>
Mar 17 17:56:59 suseR2-S2 osafamfnd[21859]: NO Assigning 'safSi=SC-2N,safApp=OpenSAF' STANDBY to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Mar 17 17:56:59 suseR2-S2 osafrded[21779]: NO RDE role set to STANDBY
Mar 17 17:56:59 suseR2-S2 osafrded[21779]: NO Peer up on node 0x2030f
Mar 17 17:56:59 suseR2-S2 osafrded[21779]: NO Got peer info request from node 0x2030f with role ACTIVE
Mar 17 17:56:59 suseR2-S2 osafrded[21779]: NO Got peer info response from node 0x2030f with role ACTIVE
Mar 17 17:56:59 suseR2-S2 osafimmd[21798]: NO MDS event from svc_id 24 (change:3, dest:13)
Mar 17 17:56:59 suseR2-S2 osafimmd[21798]: NO MDS event from svc_id 24 (change:5, dest:13)
Mar 17 17:56:59 suseR2-S2 osafimmd[21798]: NO MDS event from svc_id 24 (change:5, dest:13)
Mar 17 17:56:59 suseR2-S2 osafimmd[21798]: NO MDS event from svc_id 25 (change:3, dest:566317113647120)
Mar 17 17:56:59 suseR2-S2 osafimmd[21798]: NO MDS event from svc_id 25 (change:3, dest:565213543063568)
Mar 17 17:56:59 suseR2-S2 osafimmd[21798]: IN AMF HA STANDBY request
Mar 17 17:56:59 suseR2-S2 osafimmd[21798]: IN Added IMMND node with dest 566317113647120
Mar 17 17:56:59 suseR2-S2 osafimmd[21798]: IN Added IMMND node with dest 565213543063568
Mar 17 17:56:59 suseR2-S2 osafsmfd[21878]: WA saClmClusterNodeGet failed, rc=SA_AIS_ERR_UNAVAILABLE (31)
Mar 17 17:56:59 suseR2-S2 osafsmfd[21878]: WA proc_mds_info: SMFND UP failed
Mar 17 17:56:59 suseR2-S2 osafsmfd[21878]: WA saClmClusterNodeGet failed, rc=SA_AIS_ERR_UNAVAILABLE (31)
Mar 17 17:56:59 suseR2-S2 osafsmfd[21878]: WA proc_mds_info: SMFND UP failed

From Traces:
SC-2 left the cluster as clm lock operation performed and later SC-1 left the cluster as one failover performed:
~~~
SC-2:::
Mar 17 17:54:24.123134 osafamfnd [6773:src/amf/amfnd/clm.cc:0196] >> clm_track_cb: '0' '4' '1'
Mar 17 17:54:24.123142 osafamfnd [6773:src/amf/amfnd/clm.cc:0217] TR Node has left the cluster 'safNode=SC-2,safCluster=myClmCluster', avnd_cb->first_time_up 0,notifItem->clusterNode.nodeId 131599, avnd_cb->node_info.nodeId 131343
- -
SC-1:::
Mar 17 17:57:03.514477 osafamfnd [9266:src/amf/amfnd/clm.cc:0196] >> clm_track_cb: '0' '4' '1'
Mar 17 17:57:03.514484 osafamfnd [9266:src/amf/amfnd/clm.cc:0217] TR Node has left the cluster 'safNode=SC-1,safCluster=myClmCluster', avnd_cb->first_time_up 0,notifItem->clusterNode.nodeId 131343, avnd_cb->node_info.nodeId 131855
~~~
after failover SC-2 got standby role and SC-3 Active :
~~~
SC::2
Mar 17 17:56:59.941081 osafamfnd [21859:src/amf/amfnd/susm.cc:1043] NO Assigned 'safSi=SC-2N,safApp=OpenSAF' STANDBY to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Mar 17 17:56:59.941089 osafamfnd [21859:src/amf/amfnd/err.cc:1639] >> is_no_assignment_due_to_escalations
Mar 17 17:56:59.941097 osafamfnd [21859:src/amf/amfnd/err.cc:1651] << is_no_assignment_due_to_escalations: false
Mar 17 17:56:59.941104 osafamfnd [21859:src/amf/amfnd/di.cc:0829] >> avnd_di_susi_resp_send: Sending Resp su=safSu=SC-2,safSg=2N,safApp=OpenSAF, si=safSi=SC-2N,safApp=OpenSAF, curr_state=2, prv_state=0
Mar 17 17:56:59.941112 osafamfnd [21859:src/amf/amfnd/di.cc:0839] TR curr_assign_state '3
SC:::3
Mar 17 17:57:03.656105 osafamfnd [9266:src/amf/amfnd/susm.cc:1043] NO Assigned 'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-3,safSg=2N,safApp=OpenSAF'
Mar 17 17:57:03.656113 osafamfnd [9266:src/amf/amfnd/err.cc:1639] >> is_no_a
[tickets] [opensaf:tickets] #2386 imm: decrement the pending reply when error is other than SA_AIS_OK when newCcbId is called
- **status**: accepted --> review

---

** [tickets:#2386] imm: decrement the pending reply when error is other than SA_AIS_OK when newCcbId is called**

**Status:** review
**Milestone:** 5.0.2
**Created:** Fri Mar 17, 2017 09:46 AM UTC by Neelakanta Reddy
**Last Updated:** Fri Mar 17, 2017 10:08 AM UTC
**Owner:** Neelakanta Reddy

After a Ccb operation returned SA_AIS_ERR_FAILED_OPERATION, CcbFinalize returned TRY_AGAIN due to "Too many pending incoming fevs messages (> 16)".

In imma_finalizeCcb, if imma_newCcbId returns an error other than SA_AIS_OK, imma_proc_decrement_pending_reply is not called.

The solution is to call imma_proc_decrement_pending_reply when the error is not SA_AIS_OK.

logs:

IMMA
Mar 10 13:00:10.428430 imma [26655:imma_om_api.c:2907] TR objectDelete send RETURNED:1
Mar 10 13:00:10.428448 imma [26655:imma_om_api.c:3001] TR objectDelete really RETURNING:21
Mar 10 13:00:10.428456 imma [26655:imma_om_api.c:3002] << ccb_object_delete_common
Mar 10 13:00:10.428485 imma [26655:imma_om_api.c:9384] >> saImmOmCcbGetErrorStrings
Mar 10 13:00:10.428496 imma [26655:imma_om_api.c:9391] >> saImmOmCcbGetErrorStrings
Mar 10 13:00:10.428503 imma [26655:imma_om_api.c:9452] << saImmOmCcbGetErrorStrings
Mar 10 13:00:10.428651 imma [26655:imma_om_api.c:8838] >> imma_finalizeCcb
Mar 10 13:00:10.428662 imma [26655:imma_om_api.c:8860] T1 CCb node found for ccbhandle 14aa837f7c79cc07 ccbid:4158
Mar 10 13:00:10.428670 imma [26655:imma_om_api.c:8929] TR Ccb is active when finalizing
Mar 10 13:00:10.428774 imma [26655:imma_om_api.c:5777] >> accessor_get_common
Mar 10 13:00:10.429358 imma [26655:imma_om_api.c:8956] TR CcbFinalize returned 6
Mar 10 13:00:10.429369 imma [26655:imma_om_api.c:1198] >> imma_newCcbId
Mar 10 13:00:10.429373 imma [26655:imma_om_api.c:1199] TR imma_newCcbId:create new ccb id with admoId:103603
Mar 10 13:00:10.429377 imma [26655:imma_om_api.c:1232] TR Sending request for new ccbid with admin OwnerId:103603
Mar 10 13:00:10.429684 imma [26655:imma_om_api.c:6120] << accessor_get_common
Mar 10 13:00:10.430233 imma [26655:imma_om_api.c:5777] >> accessor_get_common
Mar 10 13:00:10.430653 imma [26655:imma_om_api.c:1302] << imma_newCcbId
Mar 10 13:00:10.430663 imma [26655:imma_om_api.c:9070] << imma_finalizeCcb

IMMND:
Mar 10 13:00:10.430608 osafimmnd [8927:immsv_evt.c:5473] T8 Received: IMMND_EVT_A2ND_CCBINIT (15) from 2010f
Mar 10 13:00:10.430612 osafimmnd [8927:immnd_evt.c:2641] >> immnd_evt_proc_ccb_init
Mar 10 13:00:10.430615 osafimmnd [8927:immnd_evt.c:2666] T2 ERR_TRY_AGAIN: Too many pending incoming fevs messages (> 16) rejecting ccb_init request
Mar 10 13:00:10.430619 osafimmnd [8927:immnd_evt.c:2722] T2 SENDRSP FAIL 6
Mar 10 13:00:10.430627 osafimmnd [8927:immnd_evt.c:2725] << immnd_evt_proc_ccb_init
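The counter leak described in this ticket can be illustrated with a small model. This is a simplified sketch and not the imma library code: the `Client` class and the three functions are hypothetical stand-ins, and the model assumes that every request increments a per-client pending-reply counter that the caller is responsible for decrementing once the reply has been consumed. The two numeric constants are the values of the corresponding SA Forum AIS error codes.

```python
SA_AIS_OK = 1
SA_AIS_ERR_TRY_AGAIN = 6


class Client:
    """Hypothetical client node that tracks outstanding replies."""

    def __init__(self):
        self.pending_replies = 0


def new_ccb_id(client, server_rc):
    """Model one request/reply round trip.

    The request is counted as pending before it is sent; server_rc is
    the return code the (simulated) server answers with.
    """
    client.pending_replies += 1
    return server_rc


def finalize_without_fix(client, server_rc):
    """Decrement only on success, so every error leaks one pending reply."""
    rc = new_ccb_id(client, server_rc)
    if rc == SA_AIS_OK:
        client.pending_replies -= 1
    return rc


def finalize_with_fix(client, server_rc):
    """Decrement on every return code, as the ticket proposes."""
    rc = new_ccb_id(client, server_rc)
    client.pending_replies -= 1
    return rc


def run(finalize, attempts, server_rc):
    """Call finalize repeatedly and report the leftover pending count."""
    client = Client()
    for _ in range(attempts):
        finalize(client, server_rc)
    return client.pending_replies
```

With repeated TRY_AGAIN replies, the unfixed variant accumulates one pending reply per attempt, while the fixed variant stays at zero.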
[tickets] [opensaf:tickets] #2386 imm: decrement the pending reply when error is other than SA_AIS_OK when newCcbId is called
- **summary**: imm: decrement the pending reply when error is other than SA_AIS_OK in finalizeCcb --> imm: decrement the pending reply when error is other than SA_AIS_OK when newCcbId is called

---

** [tickets:#2386] imm: decrement the pending reply when error is other than SA_AIS_OK when newCcbId is called**

**Status:** accepted
**Milestone:** 5.0.2
**Created:** Fri Mar 17, 2017 09:46 AM UTC by Neelakanta Reddy
**Last Updated:** Fri Mar 17, 2017 09:54 AM UTC
**Owner:** Neelakanta Reddy
[tickets] [opensaf:tickets] #2386 imm: decrement the pending reply when error is other than SA_AIS_OK in finalizeCcb
- **summary**: imm: --> imm: decrement the pending reply when error is other than SA_AIS_OK in finalizeCcb
- **Part**: nd --> lib

---

** [tickets:#2386] imm: decrement the pending reply when error is other than SA_AIS_OK in finalizeCcb**

**Status:** accepted
**Milestone:** 5.0.2
**Created:** Fri Mar 17, 2017 09:46 AM UTC by Neelakanta Reddy
**Last Updated:** Fri Mar 17, 2017 09:46 AM UTC
**Owner:** Neelakanta Reddy
[tickets] [opensaf:tickets] #2386 imm:
---

** [tickets:#2386] imm: **

**Status:** accepted
**Milestone:** 5.0.2
**Created:** Fri Mar 17, 2017 09:46 AM UTC by Neelakanta Reddy
**Last Updated:** Fri Mar 17, 2017 09:46 AM UTC
**Owner:** Neelakanta Reddy
[tickets] [opensaf:tickets] #1190 AMF: saAmfSIPrefActiveAssignments has wrong default, stopping scaling nway active SGs
The pushed patch does not contain one change in si_ccb_apply_modify_hdlr(). The change was initially published with V1 of the patch, and I did not update and include it in V2 when the definition was modified. I have now published it with the patch for ticket #2268.

---

** [tickets:#1190] AMF: saAmfSIPrefActiveAssignments has wrong default, stopping scaling nway active SGs**

**Status:** fixed
**Milestone:** 5.2.FC
**Created:** Thu Oct 23, 2014 01:10 PM UTC by Hans Feldt
**Last Updated:** Fri Mar 10, 2017 06:07 AM UTC
**Owner:** Praveen

Problem: In nway-active, SUs are not instantiated unless saAmfSIPrefActiveAssignments is configured.

saAmfSIPrefActiveAssignments is a configuration attribute that is only valid for the nway-active redundancy model. According to the spec 3.6.5.3 it should have a default value of "the preferred number of assigned service units." and "saAmfSGNumPrefAssignedSUs" should have a default value of "the preferred number of in-service service units" and "saAmfSGNumPrefInserviceSUs" should have a default value of "the number of the service units configured for the service group."

The value of saAmfSIPrefActiveAssignments is currently set to one when not configured; instead it should be set to saAmfSGNumPrefAssignedSUs.

To avoid any backward compatibility issue, the choice of default value is left to the user. The default value of saAmfSIPrefActiveAssignments will be either saAmfSGNumPrefAssignedSUs or 1, based on the user's choice. The different default values are honoured under the following conditions:
-if a user configures saAmfSIPrefActiveAssignments=1, then the SI will be assigned to only one SU. This is to ensure backward compatibility.
-if a user does not configure the attribute saAmfSIPrefActiveAssignments in the application, or deletes this attribute via a CCB operation, then AMFD will still honor the default value of 1. This is again to ensure backward compatibility.
-if a user sets saAmfSIPrefActiveAssignments=0 via CCB or in the application configuration, then AMFD will use the section 3.6.5 definition for the default value, i.e. saAmfSGNumPrefAssignedSUs.
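The default-value rules above can be summarised in a short sketch. It is an illustration of the rules as stated in this ticket, not AMFD code; the function names are hypothetical, and `None` is used here to represent an attribute that is not configured (or was deleted via CCB).

```python
def sg_defaults(num_configured_sus, pref_inservice=None, pref_assigned=None):
    """Resolve the SG-level defaults quoted from the spec.

    saAmfSGNumPrefInserviceSUs defaults to the number of configured SUs,
    and saAmfSGNumPrefAssignedSUs defaults to saAmfSGNumPrefInserviceSUs.
    """
    if pref_inservice is None:
        pref_inservice = num_configured_sus
    if pref_assigned is None:
        pref_assigned = pref_inservice
    return pref_inservice, pref_assigned


def effective_pref_active_assignments(configured, sg_num_pref_assigned_sus):
    """Resolve saAmfSIPrefActiveAssignments for an nway-active SI."""
    if configured is None:
        # Not configured or deleted: keep the old default of 1.
        return 1
    if configured == 0:
        # Explicit 0 selects the spec default, saAmfSGNumPrefAssignedSUs.
        return sg_num_pref_assigned_sus
    # Any other value, including 1, is used as configured.
    return configured
```

For an SG with three configured SUs and no other attributes set, `sg_defaults(3)` resolves both SG attributes to 3, so an SI configured with 0 is assigned to three SUs while an unconfigured SI is still assigned to one.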
[tickets] [opensaf:tickets] #2268 amf: assignment from higher ranked SU is removed in N-Way Active model.
- **status**: assigned --> review
- **Milestone**: 5.0.2 --> 5.2.RC2

---

** [tickets:#2268] amf: assignment from higher ranked SU is removed in N-Way Active model.**

**Status:** review
**Milestone:** 5.2.RC2
**Created:** Wed Jan 18, 2017 05:41 AM UTC by Praveen
**Last Updated:** Wed Mar 08, 2017 06:24 AM UTC
**Owner:** Praveen
**Attachments:**

- [AppConfig-nwayactive_3SUs_1SIs.xml](https://sourceforge.net/p/opensaf/tickets/2268/attachment/AppConfig-nwayactive_3SUs_1SIs.xml) (13.7 kB; text/xml)

When saAmfSIPrefActiveAssignments is reduced, AMFD removes the assignment from a higher ranked SU when no SIRankedSU is configured.

Steps to reproduce:
1) Bring the attached application up on one controller.
2) The only SI is assigned to three SUs. The three SUs have different SURanks. Pref active assignments for the SI is 3.
3) Reduce pref active assignments for the SI by running the following command:
immcfg -a saAmfSIPrefActiveAssignments=2 safSi=NWay_Active,safApp=NWay_Active
4) Since pref active assignments is reduced by 1, AMFD sends quiesced and removal of assignment to SU2.
5) SU2 has rank 2. The assignment should be removed from SU3, which has rank 3.

Assignments before reducing pref active assignments:
safSISU=safSu=SU1\,safSg=NWay_Active\,safApp=NWay_Active,safSi=NWay_Active,safApp=NWay_Active
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SU2\,safSg=NWay_Active\,safApp=NWay_Active,safSi=NWay_Active,safApp=NWay_Active
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SU3\,safSg=NWay_Active\,safApp=NWay_Active,safSi=NWay_Active,safApp=NWay_Active
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)

Assignments after reducing pref active assignments:
safSISU=safSu=SU1\,safSg=NWay_Active\,safApp=NWay_Active,safSi=NWay_Active,safApp=NWay_Active
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SU3\,safSg=NWay_Active\,safApp=NWay_Active,safSi=NWay_Active,safApp=NWay_Active
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
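The selection the reporter expects can be sketched as follows. This is an illustration only, not the AMFD implementation; `sus_to_unassign` is a hypothetical helper, a lower rank number is taken to mean a higher rank, and rank 1 for SU1 is an assumption (the ticket only states the ranks of SU2 and SU3).

```python
def sus_to_unassign(assigned_ranks, new_pref_active):
    """Pick the SUs that should lose the SI assignment.

    assigned_ranks maps SU name to its SU rank for every SU that
    currently has the SI assigned. The lowest ranked SUs (largest rank
    number) are removed first.
    """
    excess = len(assigned_ranks) - new_pref_active
    if excess <= 0:
        return []
    lowest_ranked_first = sorted(
        assigned_ranks, key=lambda su: assigned_ranks[su], reverse=True
    )
    return lowest_ranked_first[:excess]


# Ranks from the ticket; rank 1 for SU1 is assumed.
ranks = {"SU1": 1, "SU2": 2, "SU3": 3}
```

Reducing the preferred active assignments from 3 to 2 selects SU3 in this model, which matches step 5 of the report rather than the observed removal from SU2.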
[tickets] [opensaf:tickets] #2371 AMF: NPM app went into unstable state while expanding cluster
SI dep is configured among the SIs assigned in the same SU.

This ticket must be a duplicate of #92 "AVSv: In NPM per SI level role failover needs to be implemented when SI-SI dependency within SU is configured".

Analysis:

1) The NG was locked and AMF sent the quiesced assignment to the SU:

Mar 17 4:37:51.067937 osafamfd [2562:src/amf/amfd/nodegroup.cc:1072] >> ng_admin_op_cb: 'safAmfNodeGroup=smfLockAdmNg13,safAmfCluster=myAmfCluster', inv:'936302870542', op:'2'
Mar 17 4:37:51.070545 osafamfd [2562:src/amf/amfd/sg_npm_fsm.cc:4454] >> ng_admin: 'safSu=SU1,safSg=SGONE,safApp=NPMAPP', sg_fsm_state:0
Mar 17 4:37:51.070553 osafamfd [2562:src/amf/amfd/sgproc.cc:2319] >> avd_sg_su_si_mod_snd: 'safSu=SU1,safSg=SGONE,safApp=NPMAPP', state 3
Mar 17 4:37:51.070560 osafamfd

2) When the response for the quiesced state arrives, AMFD tries to fail over the SU but cannot, because both the sponsor and the dependent SI are in the same SU, so it sends a deletion of the assignment to the SU:

Mar 17 4:37:51.229455 osafamfd [2562:src/amf/amfd/sgproc.cc:1104] >> avd_su_si_assign_evh: id:101, node:2010f, act:5, 'safSu=SU1,safSg=SGONE,safApp=NPMAPP', '', ha:3, err:1, single:0
Mar 17 4:37:51.230494 osafamfd [2562:src/amf/amfd/sg_npm_fsm.cc:0162] >> avd_sg_npm_su_chk_snd
Mar 17 4:37:51.230507 osafamfd [2562:src/amf/amfd/si_dep.cc:1730] >> avd_sidep_is_su_failover_possible: SU:'safSu=SU1,safSg=SGONE,safApp=NPMAPP' node_state:2
Mar 17 4:37:51.230515 osafamfd [2562:src/amf/amfd/si_dep.cc:1734] TR :susi:safSi=NPMSI2,safApp=NPMAPP si_dep_state:3 state:3 fsm:3
Mar 17 4:37:51.230522 osafamfd [2562:src/amf/amfd/si_dep.cc:1573] >> avd_sidep_is_si_failover_possible: SI: 'safSi=NPMSI2,safApp=NPMAPP', SU safSu=SU1,safSg=SGONE,safApp=NPMAPP
Mar 17 4:37:51.230530 osafamfd [2562:src/amf/amfd/si_dep.cc:1712] << avd_sidep_is_si_failover_possible: return value: 0
Mar 17 4:37:51.230536 osafamfd [2562:src/amf/amfd/si_dep.cc:1745] TR Role failover is deferred as sponsors role failover is under going
Mar 17 4:37:51.230543 osafamfd [2562:src/amf/amfd/si_dep.cc:0205] TR 'safSi=NPMSI2,safApp=NPMAPP' si_dep_state ASSIGNED => FAILOVER_UNDER_PROGRESS
Mar 17 4:37:51.230588 osafamfd [2562:src/amf/amfd/chkop.cc:0229] TR Async update
Mar 17 4:37:51.230757 osafamfd [2562:src/amf/amfd/si_dep.cc:1752] << avd_sidep_is_su_failover_possible: return value: 0
Mar 17 4:37:51.230764 osafamfd [2562:src/amf/amfd/sg_npm_fsm.cc:0169] TR role modification cannot be done now as Sponsor SI's are not yet assigned
Mar 17 4:37:51.230771 osafamfd [2562:src/amf/amfd/sg_npm_fsm.cc:0208] << avd_sg_npm_su_chk_snd: return value :2
Mar 17 4:37:51.230778 osafamfd [2562:src/amf/amfd/sgproc.cc:2434] >> avd_sg_su_si_del_snd: 'safSu=SU1,safSg=SGONE,safApp=NPMAPP'
Mar 17 4:37:51.230795 osafamfd [2562:src/amf/amfd/su.cc:2462] >> any_susi_fsm_in: SU:'safSu=SU1,safSg=SGONE,safApp=NPMAPP', check_fsm:1
Mar 17 4:37:51.230803 osafamfd [2562:src/amf/amfd/su.cc:2467] TR SUSI:'safSu=SU1,safSg=SGONE,safApp=NPMAPP,safSi=NPMSI1,safApp=NPMAPP', fsm:'3'
Mar 17 4:37:51.230809 osafamfd [2562:src/amf/amfd/su.cc:2467] TR SUSI:'safSu=SU1,safSg=SGONE,safApp=NPMAPP,safSi=NPMSI2,safApp=NPMAPP', fsm:'3'

3) After the deletion of the assignment, AMF again tries to fail over the assignments but fails for the same reason as above.

---

** [tickets:#2371] AMF: NPM app went into unstable state while expanding cluster**

**Status:** assigned
**Milestone:** 5.2.RC2
**Created:** Tue Mar 14, 2017 08:03 AM UTC by Chani Srivastava
**Last Updated:** Wed Mar 15, 2017 05:37 AM UTC
**Owner:** Praveen
**Attachments:**

- [messages](https://sourceforge.net/p/opensaf/tickets/2371/attachment/messages) (86.9 kB; application/octet-stream)
- [osafamfd](https://sourceforge.net/p/opensaf/tickets/2371/attachment/osafamfd) (13.0 MB; application/octet-stream)

Environment details
OS : Suse 64bit
Changeset : 8603( 5.2.MO-1)
4 node cluster without PBE

Summary - The application went into an unstable state and campaign execution could not complete while expanding the cluster using a campaign.

Steps:
1. Brought up an NPM application with 5 SUs
2. Using a campaign, add a 3rd payload PL-5 to the cluster

The app went into a bad state:

Mar 17 04:38:13 NewSC1 osafamfd[2562]: NO 'safSg=SGONE,safApp=NPMAPP' is in unstable/transition state
Mar 17 04:38:15 NewSC1 osafamfd[2562]: NO 'safSg=SGONE,safApp=NPMAPP' is in unstable/transition state
Mar 17 04:38:17 NewSC1 osafamfd[2562]: NO 'safSg=SGONE,safApp=NPMAPP' is in unstable/transition state
Mar 17 04:38:19 NewSC1 osafamfd[2562]: NO 'safSg=SGONE,safApp=NPMAPP' is in unstable/transition state
Mar 17 04:38:21 NewSC1 osafamfd[2562]: NO 'safSg=SGONE,safApp=NPMAPP' is in unstable/transition state
Mar 17 04:38:23 NewSC1 osafamfd[2562]: NO 'safSg=SGONE,safApp=NPMAPP' is in unstable/transition state
Mar 17 04:38:25 NewSC1 osafamfd[2562]: NO 'safSg=SGONE,safApp=NPMAPP' is in unstable/transition state
Mar 17 04:38:27 NewSC1 osafamfd[2562]: NO 'safSg=SGONE,safApp=NPMAPP' is in unstable/transition state
Mar 17
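
The failover check described in the analysis of #2371 can be illustrated with a toy model. This is only a sketch, not the amfd code: the class and function names are hypothetical, and it assumes that NPMSI1 sponsors NPMSI2, which the trace suggests ("Role failover is deferred as sponsors role failover is under going" for NPMSI2) but does not state outright. It shows why an all-or-nothing SU-level check keeps failing when sponsor and dependent are active on the same SU, and how a per-SI ordering (sponsors first), as requested in #92, would avoid that.

```python
from dataclasses import dataclass, field


@dataclass
class SI:
    name: str
    # Names of the SIs this SI depends on (its sponsors). Hypothetical model.
    sponsors: list = field(default_factory=list)


def si_failover_possible(si, failing_su, active_su_of):
    """A dependent SI can fail over only if none of its sponsors is
    itself active on the SU that is failing over."""
    return all(active_su_of[s] != failing_su for s in si.sponsors)


def su_failover_possible(failing_su, sis, active_su_of):
    """All-or-nothing SU-level check: every SI active on the SU must be
    able to fail over right now."""
    on_su = [si for si in sis if active_su_of[si.name] == failing_su]
    return all(si_failover_possible(si, failing_su, active_su_of) for si in on_su)


def per_si_failover_order(failing_su, sis, active_su_of):
    """Order the SIs on the failing SU so that sponsors come before
    their dependents (depth-first walk over the sponsor links)."""
    on_su = {si.name: si for si in sis if active_su_of[si.name] == failing_su}
    order, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for sponsor in on_su[name].sponsors:
            if sponsor in on_su:
                visit(sponsor)
        order.append(name)

    for name in on_su:
        visit(name)
    return order


# Assumed layout: NPMSI2 depends on NPMSI1, both active on SU1.
sis = [SI("NPMSI2", sponsors=["NPMSI1"]), SI("NPMSI1")]
same_su = {"NPMSI1": "SU1", "NPMSI2": "SU1"}
split_su = {"NPMSI1": "SU2", "NPMSI2": "SU1"}

print(su_failover_possible("SU1", sis, same_su))
print(su_failover_possible("SU1", sis, split_su))
print(per_si_failover_order("SU1", sis, same_su))
```

With both SIs on SU1 the SU-level check fails on every attempt, which matches the repeated "unstable/transition state" messages; the per-SI order gives a sequence in which each role failover could proceed.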
[tickets] [opensaf:tickets] #2383 log: Both active and standby sites own the same log file name but different fd in share file system
- Attachments has changed:

Diff:

--- old
+++ new
@@ -0,0 +1 @@
+osaflogd (59.9 kB; application/octet-stream)

---

** [tickets:#2383] log: Both active and standby sites own the same log file name but different fd in share file system**

**Status:** accepted
**Milestone:** 5.0.2
**Created:** Thu Mar 16, 2017 11:58 AM UTC by Canh Truong
**Last Updated:** Thu Mar 16, 2017 12:39 PM UTC
**Owner:** Canh Truong
**Attachments:**

- [osaflogd](https://sourceforge.net/p/opensaf/tickets/2383/attachment/osaflogd) (59.9 kB; application/octet-stream)

Both the active and the standby side own the same log file name, but with different fds, in a shared file system configuration.

Steps to reproduce:
1/ Create a cfg app stream
2/ Write 100 log records to this stream using saflogger
3/ Switch over while the log records are being written
4/ Delete the cfg app stream (both the active side and the standby side close the same log file, but with different fds)

The issue does not always happen (it may be necessary to repeat steps 2 and 3 several times).

It happens because, on switchover, the active side switches to quiesced and the standby does not immediately switch to active. During this short interval, the log server in the quiesced state may still get requests from the log agent if those requests were already put in the MDS queue. If the request is an open stream or an async write, the log file is opened and is no longer closed by the switchover processing when the server moves on to standby. The other side, which switches from standby to active, may also open the same log file name. As a result, when the cfg app stream is deleted, closing and renaming the log file happens on both the active and the standby. Only the rename on the active side succeeds; on the standby it fails.

Information from the MDS PR document:

"V_DEST_RL_QUIESCED
When a VDEST-instance’s HA state is set to V_DEST_RL_QUIESCED:
1. Messages sent to the VDEST start getting buffered within the sender’s MDS layer. (See Table 5)
2. For a short period (MDS’s quiesced acknowledgment time), messages sent to the VDEST under normal view are allowed to reach their destined MDS Address.
After that, the VDEST is automatically moved to "standby" HA state; messages received under normal VDEST view are from now on discarded.
"
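
The race described in #2383 can be reproduced in a small model. This is a sketch under stated assumptions, not the logsv implementation: `LogServer`, `handle_write`, `guard_quiesced` and the node and file names are all hypothetical, and the guard shown is only one conceivable mitigation, not the fix chosen for the ticket. The model follows the description above: a request already queued in MDS is still delivered while the old active is quiesced, so that node opens the log file and keeps it open after moving to standby.

```python
from enum import Enum


class HaState(Enum):
    ACTIVE = 1
    STANDBY = 2
    QUIESCED = 3


class LogServer:
    """Toy stand-in for a log server instance on one node (hypothetical)."""

    def __init__(self, name, state, guard_quiesced=False):
        self.name = name
        self.state = state
        self.guard_quiesced = guard_quiesced
        self.open_files = set()

    def handle_write(self, log_file):
        """Process a write request; opens the log file as a side effect."""
        if self.state is HaState.STANDBY:
            return False  # standby never serves agent requests
        if self.state is HaState.QUIESCED and self.guard_quiesced:
            return False  # hypothetical mitigation: refuse to open files
        self.open_files.add(log_file)
        return True


def switchover(guard_quiesced):
    """Run one switchover and return the nodes that end up holding the file."""
    log_file = "app_stream.log"
    old_active = LogServer("SC-1", HaState.ACTIVE, guard_quiesced)
    new_active = LogServer("SC-2", HaState.STANDBY, guard_quiesced)

    old_active.state = HaState.QUIESCED
    old_active.handle_write(log_file)   # request was already in the MDS queue
    old_active.state = HaState.STANDBY  # file is not closed on this transition

    new_active.state = HaState.ACTIVE
    new_active.handle_write(log_file)   # new active opens the same file name

    nodes = (old_active, new_active)
    return sorted(n.name for n in nodes if log_file in n.open_files)


print(switchover(guard_quiesced=False))
print(switchover(guard_quiesced=True))
```

Without the guard both nodes end up holding the same file name, which is the precondition for the double close and rename on stream deletion; with the guard only the new active holds it.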