[tickets] [opensaf:tickets] #709 saImmRepositoryInit attrib modification returns TRY_AGAIN in 2PBE scenario
- **status**: assigned --> duplicate
- **assigned_to**: Anders Bjornerstedt --> nobody
- **Milestone**: 4.4.0 --> never

---

**[tickets:#709] saImmRepositoryInit attrib modification returns TRY_AGAIN in 2PBE scenario**

**Status:** duplicate
**Created:** Tue Jan 07, 2014 10:23 AM UTC by surender khetavath
**Last Updated:** Tue Jan 07, 2014 04:38 PM UTC
**Owner:** nobody

changeset : 4733
setup: 5nodes

Test:
2PBE is configured as per README.2PBE. SC-1 was active and SC-2 standby. OpenSAF is started on all 3 payloads. Then OpenSAF is stopped in this order: PL-5, PL-4, PL-3, SC-2.

Now modification of the attribute saImmRepositoryInit returns ERR_TRY_AGAIN:

    immcfg -a saImmRepositoryInit=2 safRdn=immManagement,safApp=safImmService
    error - saImmOmCcbObjectModify_2 FAILED: SA_AIS_ERR_TRY_AGAIN (6)
    Jan 7 15:42:51 SC-1 osafimmnd[2283]: NO Precheck of fevs message of type 33 failed with ERROR:18
    error - immcfg command timed out (alarm)
    SC-1:/etc/opensaf # immcfg -a saImmRepositoryInit=2 safRdn=immManagement,safApp=safImmService
    error - saImmOmCcbObjectModify_2 FAILED: SA_AIS_ERR_TRY_AGAIN (6)
    Jan 7 15:43:57 SC-1 osafimmnd[2283]: NO Precheck of fevs message of type 33 failed with ERROR:18
    error - immcfg command timed out (alarm)
    SC-1:/etc/opensaf # ls -l /var/crash/opensaf/
    total 0
    SC-1:/etc/opensaf # immlist safRdn=immManagement,safApp=safImmService
    Name                       Type         Value(s)
    safRdn                     SA_STRING_T  safRdn=immManagement
    saImmRepositoryInit        SA_UINT32_T  1 (0x1)
    saImmOiTimeout             SA_TIME_T    Empty
    saImmNumOis                SA_UINT32_T  Empty
    saImmNumInitializedCcbs    SA_UINT32_T  Empty
    saImmNumAdminOwnedObjects  SA_UINT32_T  Empty
    saImmLastUpdate            SA_TIME_T    Empty
    saImmExportFileUri         SA_STRING_T  Empty
    SaImmAttrImplementerName   SA_STRING_T  Empty
    SaImmAttrClassName         SA_STRING_T  SaImmMngt
    SaImmAttrAdminOwnerName    SA_STRING_T  Empty
    SC-1:/etc/opensaf # immcfg -a saImmRepositoryInit=2 safRdn=immManagement,safApp=safImmService
    error - saImmOmCcbObjectModify_2 FAILED: SA_AIS_ERR_TRY_AGAIN (6)
    Jan 7 15:46:26 SC-1 osafimmnd[2283]: NO Precheck of fevs message of type 33 failed with ERROR:18
    error - immcfg command timed out (alarm)

/var/log/messages:

    Jan 7 15:40:38 SC-1 osafamfd[2338]: NO Node 'SC-2' left the cluster
    Jan 7 15:40:38 SC-1 osafimmnd[2283]: NO Implementer connected: 69 (MsgQueueService131599) 691, 2010f
    Jan 7 15:40:38 SC-1 osafimmnd[2283]: NO Implementer locally disconnected. Marking it as doomed 69 691, 2010f (MsgQueueService131599)
    Jan 7 15:40:38 SC-1 opensaf_reboot: Rebooting remote node in the absence of PLM is outside the scope of OpenSAF
    Jan 7 15:40:38 SC-1 osafimmnd[2283]: NO Implementer disconnected 69 691, 2010f (MsgQueueService131599)
    Jan 7 15:40:38 SC-1 osafclmd[2319]: ER Node 131855 doesn't exist
    Jan 7 15:40:38 SC-1 kernel: [ 457.764164] TIPC: Resetting link 1.1.1:eth3-1.1.3:eth2, peer not responding
    Jan 7 15:40:38 SC-1 kernel: [ 457.764177] TIPC: Lost link 1.1.1:eth3-1.1.3:eth2 on network plane A
    Jan 7 15:40:38 SC-1 kernel: [ 457.764186] TIPC: Lost contact with 1.1.3

logs attached.

---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/opensaf/admin/tickets/options. Or, if this is a mailing list, you can unsubscribe from the mailing list.

Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets
[tickets] [opensaf:tickets] #709 saImmRepositoryInit attrib modification returns TRY_AGAIN in 2PBE scenario
---

**[tickets:#709] saImmRepositoryInit attrib modification returns TRY_AGAIN in 2PBE scenario**

**Status:** unassigned
**Created:** Tue Jan 07, 2014 10:23 AM UTC by surender khetavath
**Last Updated:** Tue Jan 07, 2014 10:23 AM UTC
**Owner:** nobody
[tickets] [opensaf:tickets] #709 saImmRepositoryInit attrib modification returns TRY_AGAIN in 2PBE scenario
- **status**: unassigned --> assigned
- **assigned_to**: Anders Bjornerstedt
- **Milestone**: future --> 4.4.0

---

**[tickets:#709] saImmRepositoryInit attrib modification returns TRY_AGAIN in 2PBE scenario**

**Status:** assigned
**Created:** Tue Jan 07, 2014 10:23 AM UTC by surender khetavath
**Last Updated:** Tue Jan 07, 2014 10:23 AM UTC
**Owner:** Anders Bjornerstedt
[tickets] [opensaf:tickets] #709 saImmRepositoryInit attrib modification returns TRY_AGAIN in 2PBE scenario
If I understand correctly, you have a cluster configured for 2PBE where only one SC is available. That means the cluster is by default not persistent-writable (it is in the twoSafe2PBE state, which requires both PBEs to be available). CCB operations are rejected with TRY_AGAIN in this state.

In a 2PBE system with only one SC currently available, it is possible to enter the oneSafe2PBE state. This allows persistent writes using only one SC. Such writes will then of course reach only one of the two PBE files. This should be avoided unless the CCB is urgent, or necessary as part of the repair of the other SC. See the section titled 'oneSafe2PBE' in the README.2PBE file.

In this particular case, the CCB is quite special: you are updating the repositoryInitMode, which controls whether the PBE service itself is enabled. If the current repositoryInitMode is SA_IMM_INIT_FROM_FILE, there should be no problem in applying the CCB. Even if 2PBE is configured and only one SC is available, when PBE is disabled the cluster should be persistent-writable even though the PBE is not available (it is not expected to be running). But if the current repositoryInitMode is SA_IMM_KEEP_REPOSITORY, then oneSafe2PBE needs to be toggled on to allow persistent writes in 2PBE with one SC.

There is also an escape admin-op that allows PBE in general to be disabled, i.e. to change repositoryInitMode to FROM_FILE, when PBE is currently not persistent-writable. This is relevant for both 1PBE and 2PBE in cases where the PBE(s) is/are permanently hung or congested. That admin-op is documented in the original section for PBE (1PBE).
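The rules in the comment above can be summarised as a small decision function. This is only an illustrative sketch in Python, not OpenSAF code: the names `is_persistent_writable`, `ccb_outcome` and all parameters are hypothetical, and the 1PBE branch (one PBE must be available) is an assumption inferred from the remark about hung PBEs. The enum values follow the SAF IMM definition of `SaImmRepositoryInitModeT`, which matches the value 1 shown by `immlist` in the ticket.

```python
from enum import IntEnum


class RepositoryInitMode(IntEnum):
    """Values of saImmRepositoryInit (SaImmRepositoryInitModeT)."""
    SA_IMM_KEEP_REPOSITORY = 1  # PBE enabled
    SA_IMM_INIT_FROM_FILE = 2   # PBE disabled


def is_persistent_writable(mode, two_pbe_configured, pbes_available,
                           one_safe_2pbe=False):
    """Hypothetical model of when the cluster accepts persistent writes."""
    # PBE disabled: no PBE is expected to run, so writes are allowed.
    if mode == RepositoryInitMode.SA_IMM_INIT_FROM_FILE:
        return True
    # 1PBE (assumption): the single PBE must be available.
    if not two_pbe_configured:
        return pbes_available >= 1
    # twoSafe2PBE: both PBEs must be available.
    if pbes_available >= 2:
        return True
    # oneSafe2PBE: one PBE suffices once the state has been toggled on.
    return one_safe_2pbe and pbes_available == 1


def ccb_outcome(mode, two_pbe_configured, pbes_available, one_safe_2pbe=False):
    """Return the result code a CCB operation would see in this model."""
    if is_persistent_writable(mode, two_pbe_configured, pbes_available,
                              one_safe_2pbe):
        return "SA_AIS_OK"
    return "SA_AIS_ERR_TRY_AGAIN"


# Scenario from the ticket: KEEP_REPOSITORY, 2PBE configured, only SC-1 up.
print(ccb_outcome(RepositoryInitMode.SA_IMM_KEEP_REPOSITORY, True, 1))
```

In this model the reported scenario yields `SA_AIS_ERR_TRY_AGAIN`, and the same call with `one_safe_2pbe=True` yields `SA_AIS_OK`, which mirrors the explanation given in the comment.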
---

**[tickets:#709] saImmRepositoryInit attrib modification returns TRY_AGAIN in 2PBE scenario**

**Status:** assigned
**Created:** Tue Jan 07, 2014 10:23 AM UTC by surender khetavath
**Last Updated:** Tue Jan 07, 2014 11:31 AM UTC
**Owner:** Anders Bjornerstedt