[tickets] [opensaf:tickets] #2111 imm: CcbObjectCreate() informs clients about the object being in another CCB
- **status**: review --> fixed
- **Comment**:

default (5.2) [staging:182d6b]
changeset: 8208:182d6b40b476
user: Hung Nguyen
date: Tue Oct 11 17:22:08 2016 +0700
summary: imm: Return error string when object is being created in another CCB [#2111]

---

** [tickets:#2111] imm: CcbObjectCreate() informs clients about the object being in another CCB**

**Status:** fixed
**Milestone:** 5.2.FC
**Created:** Tue Oct 11, 2016 09:53 AM UTC by Hung Nguyen
**Last Updated:** Wed Oct 12, 2016 03:17 AM UTC
**Owner:** Hung Nguyen

Unlike CcbObjectDelete() or CcbObjectModify(), which return ERR_BUSY to inform the client that the targeted object is in another CCB, CcbObjectCreate() currently returns ERR_EXIST for that case. This is due to the SAF Application Interface Specification.
The client can't distinguish whether the object has already been created successfully or is merely being created in another CCB.
We can use the CCB error string to inform the client that the object is in another CCB.

---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/opensaf/admin/tickets/options. Or, if this is a mailing list, you can unsubscribe from the mailing list.

--
Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot

___
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets
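A rough client-side sketch of the idea in #2111, assuming the error string can be inspected after a failed create. Everything here is a toy model: `ToyCcb`, `classify_create_failure` and the marker text "another CCB" are hypothetical, and the ticket does not give the real wording of the error string.

```python
from enum import IntEnum


class SaAisError(IntEnum):
    # Numeric values follow the SaAisErrorT enum in the AIS headers.
    OK = 1
    BUSY = 10
    EXIST = 14


class ToyCcb:
    """Hypothetical stand-in for a CCB; this is not the IMM OM API."""

    def __init__(self, committed, pending_elsewhere):
        self.committed = set(committed)                  # objects that already exist
        self.pending_elsewhere = set(pending_elsewhere)  # being created in another CCB
        self.error_strings = []

    def object_create(self, dn):
        self.error_strings = []
        if dn in self.pending_elsewhere:
            # The return code stays ERR_EXIST; only the error string differs.
            self.error_strings.append(
                f"object '{dn}' is being created in another CCB")
            return SaAisError.EXIST
        if dn in self.committed:
            return SaAisError.EXIST
        self.committed.add(dn)
        return SaAisError.OK


def classify_create_failure(ccb, dn):
    """Return 'created', 'exists' or 'busy' for one create attempt."""
    rc = ccb.object_create(dn)
    if rc == SaAisError.OK:
        return "created"
    # Assumed convention: the error string mentions the other CCB.
    if any("another CCB" in s for s in ccb.error_strings):
        return "busy"
    return "exists"
```

With this split, a client could retry in the "busy" case and treat "exists" as final, without any change to the return code itself.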
[tickets] [opensaf:tickets] #2107 immtool: immcfg finalizes admo even when admo is not initialized
- **status**: review --> fixed
- **Comment**:

default (5.2) [staging:1eceec]
changeset: 8209:1eceec883b6c
user: Hung Nguyen
date: Tue Oct 11 10:47:56 2016 +0700
summary: immtool: Don't finalize admo if it hasn't been initialized [#2107]

opensaf-5.1.x [staging:f6e7f4]
changeset: 8210:f6e7f4b93028
user: Hung Nguyen
date: Tue Oct 11 10:47:56 2016 +0700
summary: immtool: Don't finalize admo if it hasn't been initialized [#2107]

---

** [tickets:#2107] immtool: immcfg finalizes admo even when admo is not initialized**

**Status:** fixed
**Milestone:** 5.1.1
**Created:** Mon Oct 10, 2016 03:46 AM UTC by Hung Nguyen
**Last Updated:** Tue Oct 11, 2016 03:57 AM UTC
**Owner:** Hung Nguyen

The transaction mode of the immcfg command fails with error SA_AIS_ERR_BAD_HANDLE when exiting.

The bug can be reproduced with the following test case:

~~~
SC-1:~ # echo "" | immcfg
error - saImmOmAdminOwnerFinalize FAILED: SA_AIS_ERR_BAD_HANDLE (9)
SC-1:~ # echo $?
1
~~~

See also the ltrace below. It shows that saImmOmAdminOwnerFinalize() is called even though saImmOmAdminOwnerInitialize() has *never* been called, so the handle must be invalid.

~~~
SC-1:~ # echo "" | ltrace immcfg
__libc_start_main(0x4046f0, 1, 0x7ffd7991aca8, 0x415a90
_ZNSt8ios_base4InitC1Ev(0x61ea68, 0x7ffd7991aca8, 0x7ffd7991acb8, 5)= 0
__cxa_atexit(0x7febc7b07250, 0x61ea68, 0x61e588, 0x7ffd7991aaa0)= 0
__cxa_atexit(0x40b5b0, 0x61ea00, 0x61e588, 6) = 0
setenv("SA_ENABLE_EXTENDED_NAMES", "1", 1) = 0
osaf_extended_name_init(0x250bdd0, 0x7ffd7991ae68, 0, 0x7ffd7991cfca) = 0x7febc84d3148
getopt_long(1, 0x7ffd7991aca8, "a:c:f:t:dhmvuL:o:X:", 0x7ffd7991a7a0, 0)= -1
saImmOmInitialize(0x61ea88, 0, 0x7ffd7991a520, 0) = 1
fileno(0x7febc6f1c4e0) = 0
__fxstat(1, 0, 0x7ffd7991a690) = 0
__getdelim(0x7ffd7991a558, 0x7ffd7991a550, 10, 0x7febc6f1c4e0) = 1
strlen("\n")= 1
strlen("\n")= 1
free(0x252eac0) =
__getdelim(0x7ffd7991a558, 0x7ffd7991a550, 10, 0x7febc6f1c4e0) = -1
free(0x252eac0) =
saImmOmAdminOwnerFinalize(0, 0, 0x7febc6f1b658, 0) = 9
saf_error(9, 0, 0x7febc8256398, 0) = 0x7febc82b2a5f
__fprintf_chk(0x7febc6f1c060, 1, 0x417030, 0x7febc82b2a5ferror - saImmOmAdminOwnerFinalize FAILED: SA_AIS_ERR_BAD_HANDLE (9))=68
saImmOmFinalize(0x38530002010f, 68, 0, -1) = 1
+++ exited (status 1) +++
~~~

---
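The guard described by the changeset summary ("Don't finalize admo if it hasn't been initialized") can be illustrated with a small model. This is only a sketch under assumptions: `ToyImm` and `run_transaction` are hypothetical names, not immcfg code, and the handle value 0 for "never initialized" is taken from the first argument of saImmOmAdminOwnerFinalize() in the ltrace above.

```python
SA_AIS_OK = 1
SA_AIS_ERR_BAD_HANDLE = 9


class ToyImm:
    """Hypothetical stand-in for the IMM OM library; tracks valid handles."""

    def __init__(self):
        self._next = 1
        self._owners = set()

    def admin_owner_initialize(self):
        handle = self._next
        self._next += 1
        self._owners.add(handle)
        return handle

    def admin_owner_finalize(self, handle):
        if handle not in self._owners:
            return SA_AIS_ERR_BAD_HANDLE
        self._owners.remove(handle)
        return SA_AIS_OK


def run_transaction(imm, lines, guard=True):
    """Process input lines and return the exit status (0 means success)."""
    owner = 0  # 0 means "never initialized"
    for line in lines:
        # Only a non-empty command needs an admin owner.
        if line.strip() and owner == 0:
            owner = imm.admin_owner_initialize()
    if guard and owner == 0:
        return 0  # nothing was initialized, so there is nothing to finalize
    rc = imm.admin_owner_finalize(owner)
    return 0 if rc == SA_AIS_OK else 1
```

Calling `run_transaction(ToyImm(), [""], guard=False)` mirrors the `echo "" | immcfg` case and yields exit status 1; with the guard enabled the same input exits with 0.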
[tickets] [opensaf:tickets] #2054 log: fail to create directory when changing logRootDirectory
- **status**: review --> fixed
- **assigned_to**: Vu Minh Nguyen --> nobody
- **Comment**:

changeset: 8213:32824c74a736
tag: tip
parent: 8209:1eceec883b6c
user: Vu Minh Nguyen
date: Wed Oct 12 15:47:24 2016 +0700
summary: log: fix failure to create directory when changing logRootDirectory [#2054]

changeset: 8212:38f928e934cc
branch: opensaf-5.1.x
parent: 8210:f6e7f4b93028
user: Vu Minh Nguyen
date: Wed Oct 12 15:47:24 2016 +0700
summary: log: fix failure to create directory when changing logRootDirectory [#2054]

changeset: 8211:9a42c66ed888
branch: opensaf-5.0.x
parent: 8205:948f0b033f1f
user: Vu Minh Nguyen
date: Wed Oct 12 15:47:24 2016 +0700
summary: log: fix failure to create directory when changing logRootDirectory [#2054]

---

** [tickets:#2054] log: fail to create directory when changing logRootDirectory**

**Status:** fixed
**Milestone:** 5.0.2
**Created:** Wed Sep 21, 2016 04:18 AM UTC by Canh Truong
**Last Updated:** Wed Sep 21, 2016 08:13 AM UTC
**Owner:** nobody

Steps to reproduce the issue:

1/ Create a new app stream with a pathname:

> immcfg -c SaLogStreamConfig safLgStrCfg=TestApp7 -a saLogStreamPathName=./test -a saLogStreamFileName=TestApp7

2/ Change the root directory:

> immcfg -a logRootDirectory=/srv/shared logConfig=1,safApp=safLogService

osaflogd trace log:

> Sep 21 11:02:28.045538 osaflogd [463:lgs_util.cc:0486] TR logsv_root_dir "/srv/shared/saflog"
> Sep 21 11:02:28.045554 osaflogd [463:lgs_util.cc:0487] TR path "./test"
> Sep 21 11:02:28.045586 osaflogd [463:lgs_filehdl.cc:0341] >> make_log_dir_hdl
> Sep 21 11:02:28.045604 osaflogd [463:lgs_filehdl.cc:0343] TR rootpath "/srv/shared/saflog"
> Sep 21 11:02:28.045621 osaflogd [463:lgs_filehdl.cc:0344] TR relpath "./test"
> Sep 21 11:02:28.045643 osaflogd [463:lgs_filehdl.cc:0368] TR make_log_dir_hdl - Path to create "/srv/shared/saflog/./test/"
> Sep 21 11:02:28.045682 osaflogd [463:lgs_filehdl.cc:0389] TR make_log_dir_hdl - Dir "/srv/shared/saflog/./test/" created
> Sep 21 11:02:28.045701 osaflogd [463:lgs_filehdl.cc:0393] << make_log_dir_hdl: mldh_rc = 0
> Sep 21 11:02:28.046676 osaflogd [463:lgs_util.cc:0511] << lgs_make_reldir_h: rc = 0
> Sep 21 11:02:28.047021 osaflogd [463:lgs_util.cc:0104] TR lgs_create_config_file_h - Config file path "/srv/shared/./test/TestApp7.cfg"
> Sep 21 11:02:28.047075 osaflogd [463:lgs_filehdl.cc:0156] >> create_config_file_hdl
> Sep 21 11:02:28.047096 osaflogd [463:lgs_filehdl.cc:0158] TR create_config_file_hdl - file_path "/srv/shared/./test/TestApp7.cfg"
> Sep 21 11:02:28.048062 osaflogd [463:lgs_filehdl.cc:0168] NO Could not open '/srv/shared/./test/TestApp7.cfg' - No such file or directory
> Sep 21 11:02:28.048192 osaflogd [463:lgs_filehdl.cc:0218] << create_config_file_hdl: rc = -1
> Sep 21 11:02:28.048255 osaflogd [463:lgs_util.cc:0165] << lgs_create_config_file_h: rc = -1
> Sep 21 11:02:28.049655 osaflogd [463:lgs_imm.cc:1868] ER New config file could not be created for stream: safLgStrCfg=TestApp7
> Sep 21 11:02:28.049820 osaflogd [463:lgs_stream.cc:0667] >> log_file_open
> Sep 21 11:02:28.049847 osaflogd [463:lgs_stream.cc:0671] TR log_file_open - Opening file "/srv/shared/./test/TestApp7_20160921_110227.log"
> Sep 21 11:02:28.049865 osaflogd [463:lgs_stream.cc:0066] >> fileopen_h
> Sep 21 11:02:28.049883 osaflogd [463:lgs_stream.cc:0082] TR fileopen_h - filepath "/srv/shared/./test/TestApp7_20160921_110227.log"
> Sep 21 11:02:28.050923 osaflogd [463:lgs_filehdl.cc:0417] >> fileopen_hdl
> Sep 21 11:02:28.050993 osaflogd [463:lgs_filehdl.cc:0419] TR fileopen_hdl - filepath "/srv/shared/./test/TestApp7_20160921_110227.log"
> Sep 21 11:02:28.051185 osaflogd [463:lgs_filehdl.cc:0436] IN Could not open: /srv/shared/./test/TestApp7_20160921_110227.log - No such file or directory
> Sep 21 11:02:28.051232 osaflogd [463:lgs_filehdl.cc:0457] << fileopen_hdl
> Sep 21 11:02:28.051285 osaflogd [463:lgs_stream.cc:0100] << fileopen_h
> Sep 21 11:02:28.051314 osaflogd [463:lgs_stream.cc:0678] << log_file_open
> Sep 21 11:02:28.051686 osaflogd [463:lgs_imm.cc:1878] ER New log file could not be created for stream: safLgStrCfg=TestApp7

The new directory is not created. As a consequence, creating the cfg/log files of the stream fails.

---
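The trace above appears to show the stream directory being created under `/srv/shared/saflog` while the config and log files are then opened under `/srv/shared`. The following sketch reproduces that kind of mismatch in a temporary directory. It is only an illustration of the symptom: `create_stream_files` and `demo` are hypothetical helpers, not osaflogd code, and the actual fix in the changesets is not shown in the ticket.

```python
import os
import tempfile


def create_stream_files(dir_root, file_root, rel_path, name):
    """Create the stream directory under dir_root, then a cfg file under file_root."""
    os.makedirs(os.path.join(dir_root, rel_path), exist_ok=True)
    cfg_path = os.path.join(file_root, rel_path, name + ".cfg")
    try:
        with open(cfg_path, "w") as cfg:
            cfg.write("placeholder\n")
    except FileNotFoundError:
        # Same symptom as "Could not open ... No such file or directory".
        return False
    return True


def demo():
    with tempfile.TemporaryDirectory() as base:
        new_root = os.path.join(base, "shared")
        traced_dir_root = os.path.join(new_root, "saflog")
        os.makedirs(traced_dir_root)
        # Directory and file use different roots: the open fails.
        mismatched = create_stream_files(traced_dir_root, new_root, "./test", "TestApp7")
        # Both use the same root: the open succeeds.
        consistent = create_stream_files(new_root, new_root, "./test", "TestApp7")
        return mismatched, consistent
```

The point of the sketch is only that the directory creation and the file open have to agree on the root path at the moment logRootDirectory changes.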
[tickets] [opensaf:tickets] #2091 dtm: Use inotify to improve response time for transport monitor process
- **status**: accepted --> review

---

** [tickets:#2091] dtm: Use inotify to improve response time for transport monitor process**

**Status:** review
**Milestone:** 5.2.FC
**Created:** Mon Oct 03, 2016 02:48 PM UTC by Anders Widell
**Last Updated:** Tue Oct 04, 2016 11:31 AM UTC
**Owner:** Hans Nordebäck

Instead of checking the /proc file system once per second to monitor the osafdtm process, we should investigate the possibility of using inotify to monitor the /proc file system.

---
[tickets] [opensaf:tickets] #2097 Both controllers went for reboot while recovering from split brain
I suggest closing this ticket with status "Invalid". The configuration above is not correct.

---

** [tickets:#2097] Both controllers went for reboot while recovering from split brain**

**Status:** unassigned
**Milestone:** 5.2.FC
**Created:** Thu Oct 06, 2016 04:58 AM UTC by Chani Srivastava
**Last Updated:** Thu Oct 06, 2016 12:17 PM UTC
**Owner:** nobody
**Attachments:**

- [Fencing_logs.zip](https://sourceforge.net/p/opensaf/tickets/2097/attachment/Fencing_logs.zip) (43.3 kB; application/zip)

OS : Ubuntu 64bit
Changeset : 7997 ( 5.1.FC)
Setup : 3-node cluster (2 controllers, 1 payload)
Remote fencing enabled

Steps:
1. Bring up OpenSAF on all nodes
2. Enable STONITH
3. Disconnect the network from both controllers at the same time
-- This will simulate split brain and both controllers become ACTIVE
4. Connect the network to both controllers together
--- Both controllers reboot

Expected: The controllers should rejoin the cluster with only one of them rebooting.

Syslog attached for both controllers

---
[tickets] [opensaf:tickets] #2113 imm: IMMND sending ADMOP_RSP message to itself
- **status**: assigned --> accepted

---

** [tickets:#2113] imm: IMMND sending ADMOP_RSP message to itself**

**Status:** accepted
**Milestone:** 5.0.2
**Created:** Wed Oct 12, 2016 04:19 AM UTC by Hung Nguyen
**Last Updated:** Wed Oct 12, 2016 04:21 AM UTC
**Owner:** Hung Nguyen
**Attachments:**

- [logs.7z](https://sourceforge.net/p/opensaf/tickets/2113/attachment/logs.7z) (146.3 kB; application/octet-stream)

The OM client and the OI are on the same node.
The OM client invokes an admin operation and crashes before the OI sends its response to IMM.

~~~
10:56:48.422146 osafimmnd [1075:immsv_evt.c:5422] T8 Received: IMMND_EVT_A2ND_IMM_ADMOP (12) from 0
10:56:48.422196 osafimmnd [1075:ImmModel.cc:12721] T5 IMPLEMENTER FOR ADMIN OPERATION INVOKE 5 conn:378 node:2010f name:xhunngu
10:56:48.422202 osafimmnd [1075:ImmModel.cc:12729] T5 Updating req invocation inv:12884901889 conn:379 timeout:61
10:56:48.422208 osafimmnd [1075:ImmModel.cc:12736] TR Located pre request continuation 12884901889 adjusting timeout to 61
10:56:48.422214 osafimmnd [1075:ImmModel.cc:12764] T5 Storing impl invocation 378 for inv: 12884901889
...
10:56:49.426901 osafimmnd [1075:immnd_evt.c:10350] T2 IMMA DOWN EVENT
10:56:49.426974 osafimmnd [1075:ImmModel.cc:13718] >> discardContinuations
10:56:49.426997 osafimmnd [1075:ImmModel.cc:13726] T5 Discarding Adm Req continuation 12884901889
10:56:49.427039 osafimmnd [1075:ImmModel.cc:13767] << discardContinuations
~~~

When it receives the response from the OI, IMMND finds the continuation in the AdmImpl map but not in the AdmReq map.
That makes IMMND think it is on a different node from the OM client, so it tries to send an ND2ND_ADMOP_RSP message to itself.

~~~
10:56:51.423769 osafimmnd [1075:immsv_evt.c:5422] T8 Received: IMMND_EVT_A2ND_ADMOP_RSP (21) from 2010f
10:56:51.423813 osafimmnd [1075:ImmModel.cc:12897] T5 Fetch implCon for invocation:12884901889
10:56:51.423822 osafimmnd [1075:ImmModel.cc:12907] T5 IMPL ADM CONTINUATION 378 FOUND FOR 12884901889
10:56:51.423832 osafimmnd [1075:immnd_evt.c:4912] T2 invocation:12884901889, result:1 impl:378 req:0 dest:564118102108831 me:564118102108831
10:56:51.423839 osafimmnd [1075:immnd_evt.c:5016] T2 FORWARDING TO OTHER ND!
10:56:51.423845 osafimmnd [1075:immsv_evt.c:5408] T8 Sending: IMMND_EVT_ND2ND_ADMOP_RSP to 2010f
10:56:51.424702 osafimmnd [1075:immnd_mds.c:0740] WA MDS Send Failed to service:IMMND rc:2
10:56:51.425065 osafimmnd [1075:immnd_evt.c:5021] ER Problem in sending to peer IMMND over MDS. Discarding admin op reply.
10:56:51.425550 osafimmnd [1075:immnd_evt.c:0699] WA Error code 2 returned for message type 21 - ignoring
~~~

Traces and syslog are attached.

---
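One way to picture the decision described in #2113 is a small routing function over the two continuation maps. This is a sketch of one possible check, not the IMMND code and not necessarily the fix that will be applied: `route_admop_response` is a hypothetical name, and the `dest_node` / `my_node` comparison is an assumption suggested by the `dest:` and `me:` fields in the trace line.

```python
def route_admop_response(invocation, impl_map, req_map, dest_node, my_node):
    """Decide what to do with an admin-op response arriving from the OI.

    Returns 'reply_local', 'forward_to_peer' or 'discard'.
    """
    if invocation not in impl_map:
        return "discard"  # unknown invocation
    if invocation in req_map:
        return "reply_local"  # the OM client is attached to this node
    if dest_node == my_node:
        # The request continuation is gone but the destination is this node:
        # the local OM client went away, so there is no peer to forward to.
        return "discard"
    return "forward_to_peer"
```

In the trace, `dest` and `me` are both 564118102108831 and the request continuation has already been discarded, which is the case the third branch is meant to catch.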
[tickets] [opensaf:tickets] #2114 smf: balanced upgrade, missing removal of exectrl copy
---

** [tickets:#2114] smf: balanced upgrade, missing removal of exectrl copy**

**Status:** assigned
**Milestone:** 5.1.1
**Created:** Wed Oct 12, 2016 11:01 AM UTC by Rafael
**Last Updated:** Wed Oct 12, 2016 11:01 AM UTC
**Owner:** Rafael

When doing several bisu upgrades, it was noticed that an IMM copy of the execControl object was not removed after an upgrade. Looking into the code, there is a bug which would cause SMF to never remove this execControl copy. The copy would then be reused in the next campaign if it did an SI swap.

---
[tickets] [opensaf:tickets] #2098 amfnd: amfnd doesn't exit at opensafd stop
- **status**: assigned --> accepted

---

** [tickets:#2098] amfnd: amfnd doesn't exit at opensafd stop**

**Status:** accepted
**Milestone:** 5.0.2
**Created:** Thu Oct 06, 2016 09:02 AM UTC by Hans Nordebäck
**Last Updated:** Wed Oct 12, 2016 04:29 AM UTC
**Owner:** Praveen

amfnd doesn't exit at opensafd stop. This problem can be reproduced in UML using the AmfDemo application, nwayactive:

1) Change the AppConfig-nwayactive.xml file to:

    saAmfCtCompCategory 8

2) Make the following change to the amf_demo_script:

    start() {
        return 0
    }

    build_uml install_testprog
    opensaf start

3) unlock-in, unlock all 5 amf demo SUs

4) At PL-3 modify /opt/amf_demo_script :

    stop() {
        while true
        do
            sleep 2
        done
    }

Run /etc/init.d/opensafd stop and the following problem occurs:

Oct 6 10:45:37 PL-3 local0.notice osafamfnd[423]: NO Reason:'Script did not exit within time'
Oct 6 10:45:37 PL-3 local0.warn osafamfnd[423]: WA 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo2' Presence State TERMINATING => TERMINATION_FAILED
Oct 6 10:45:37 PL-3 local0.notice osafamfnd[423]: NO 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo2' Presence State TERMINATING => TERMINATION_FAILED
Oct 6 10:46:18 PL-3 user.notice opensafd: amfnd has not yet exited, killing it forcibly.
Oct 6 10:46:18 PL-3 local0.alert osafclmna[414]: AL AMF Node Director is down, terminate this process
Oct 6 10:46:18 PL-3 local0.crit osafamfwd[489]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: AMF unexpectedly crashed, OwnNodeId = 131855, SupervisionTime = 60
Oct 6 10:46:18 PL-3 user.notice opensaf_reboot: Rebooting local node; timeout=60
O

---
[tickets] [opensaf:tickets] #1797 Spare controller failed to become active when both ACTIVE and STANDBY SC rebooted
This issue is still reproducible in opensaf 5.1.GA

Changeset- 8190
Headless feature enabled

Steps:
1. Brought up cluster with 3 controllers (Active, Standby, Spare)
2. Kill any director of active and standby followed by 2 second
>> Quiesced controller failed to take active role and rebooted

---

** [tickets:#1797] Spare controller failed to become active when both ACTIVE and STANDBY SC rebooted**

**Status:** unassigned
**Milestone:** 5.0.2
**Created:** Fri Apr 29, 2016 09:36 AM UTC by Ritu Raj
**Last Updated:** Tue Sep 20, 2016 05:45 PM UTC
**Owner:** nobody

setup:
Changeset- 7436
Version - opensaf 5.0 FC

* Issue Observed:
Spare controller failed to become active when both the ACTIVE and STANDBY SC rebooted, which resulted in a cluster reset.

* Steps To Reproduce:
1. Brought up cluster, where SC-1 took the active role, SC-2 standby and SC-3 the quiesced state; PL-6 and PL-7 are payloads.
2. Kill any director of active and standby followed by 2 second
3. Observed that the quiesced controller failed to take the active role and a cluster reset happened

>>
SCALE_SLOT-93:~ #
May 2 18:25:33 SCALE_SLOT-93 osafimmnd[1767]: NO Implementer disconnected 5 <0, 2010f> (safAmfService)
May 2 18:25:34 SCALE_SLOT-93 osafamfnd[1817]: WA AMF director unexpectedly crashed
May 2 18:25:34 SCALE_SLOT-93 osafamfnd[1817]: Rebooting OpenSAF NodeId = 131855 EE Name = , Reason: local AVD down(Adest) or both AVD down(Vdest) received, OwnNodeId = 131855, SupervisionTime = 60
May 2 18:25:34 SCALE_SLOT-93 osafimmnd[1767]: NO Implementer disconnected 17 <0, 2020f> (@safAmfService2020f)
May 2 18:25:34 SCALE_SLOT-93 opensaf_reboot: Rebooting local node; timeout=60
May 2 18:25:38 SCALE_SLOT-93 kernel: [273050.885507] md: stopping all md devices.
May 2 18:25:38 SCALE_SLOT-93 kernel: [273051.878473] sd 0:0:0:0: [sda] Synchronizing SCSI cache
<<

---
[tickets] [opensaf:tickets] #2115 amfnd: loses sync with director if PG track action msg is sent during SC recovery
---

** [tickets:#2115] amfnd: loses sync with director if PG track action msg is sent during SC recovery**

**Status:** assigned
**Milestone:** 5.1.1
**Created:** Wed Oct 12, 2016 10:30 PM UTC by Gary Lee
**Last Updated:** Wed Oct 12, 2016 10:30 PM UTC
**Owner:** Gary Lee

After SC absence, active amfd will reject messages from 'veteran' amfnds until its local amfnd has started. During this period, if a PG track action msg is sent and rejected by amfd, it will cause the sending amfnd to lose sync with amfd. So we should also queue this message to be re-sent.

Oct 11 18:06:01 SC-1 osafamfd[12545]: WA avd_msg_sanity_chk: invalid msg id 3, msg type 8, from 2030f should be 2
Oct 11 18:06:01 SC-1 osafamfd[12545]: WA avd_msg_sanity_chk: invalid msg id 4, msg type 8, from 2030f should be 2
Oct 11 18:06:10 SC-1 osafamfd[12545]: WA avd_msg_sanity_chk: invalid msg id 5, msg type 6, from 2030f should be 2
Oct 11 18:06:20 SC-1 osafamfd[12545]: WA avd_msg_sanity_chk: invalid msg id 6, msg type 8, from 2030f should be 5
Oct 11 18:06:20 SC-1 osafamfd[12545]: WA avd_msg_sanity_chk: invalid msg id 7, msg type 8, from 2030f should be 5
Oct 11 18:06:20 SC-1 osafamfd[12545]: WA avd_msg_sanity_chk: invalid msg id 8, msg type 8, from 2030f should be 5
Oct 11 18:06:20 SC-1 osafamfd[12545]: WA avd_msg_sanity_chk: invalid msg id 9, msg type 8, from 2030f should be 5

After the set_leds event is received by amfnd, it can be seen that msgs with id 3 and 4 are retransmitted, but 5 is not received by amfd.

---
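The loss of sync in #2115 can be illustrated with a toy model of a strictly sequential message-id check. This is a sketch under assumptions, not amfd/amfnd code: `ToyDirector`, `ToyNodeDirector`, `simulate` and the type labels "other" and "pg_track" are hypothetical, and the rule "accept only last id + 1" is inferred from the "should be N" warnings in the log.

```python
class ToyDirector:
    """Accepts a message only if its id is exactly the last accepted id + 1."""

    def __init__(self, last_id=0):
        self.last_id = last_id
        self.accepting = False  # False while the local node director is not up
        self.warnings = []

    def receive(self, msg_id, msg_type):
        if not self.accepting:
            return False  # rejected during the recovery window
        if msg_id != self.last_id + 1:
            self.warnings.append((msg_id, msg_type, self.last_id + 1))
            return False
        self.last_id = msg_id
        return True


class ToyNodeDirector:
    def __init__(self, director, next_id, resend_types):
        self.director = director
        self.next_id = next_id
        self.resend_types = set(resend_types)
        self.queue = []

    def send(self, msg_type):
        msg = (self.next_id, msg_type)
        self.next_id += 1
        # Only message types listed in resend_types are kept for re-sending.
        if not self.director.receive(*msg) and msg_type in self.resend_types:
            self.queue.append(msg)

    def resend_queued(self):
        pending, self.queue = self.queue, []
        for msg in pending:
            self.director.receive(*msg)


def simulate(resend_types):
    director = ToyDirector(last_id=1)
    node = ToyNodeDirector(director, next_id=2, resend_types=resend_types)
    for msg_type in ("other", "other", "pg_track"):  # sent while rejected
        node.send(msg_type)
    director.accepting = True
    node.resend_queued()
    node.send("other")  # first message after recovery
    return director.last_id, len(director.warnings)
```

When "pg_track" is not in `resend_types`, its id is consumed on the sender side but never reaches the director, so every later message fails the sequence check. Queuing it as well closes the gap.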
[tickets] [opensaf:tickets] #2115 amfnd: loses sync with director if PG track action msg is sent during SC recovery
- **status**: assigned --> review

---

** [tickets:#2115] amfnd: loses sync with director if PG track action msg is sent during SC recovery**

**Status:** review
**Milestone:** 5.1.1
**Created:** Wed Oct 12, 2016 10:30 PM UTC by Gary Lee
**Last Updated:** Wed Oct 12, 2016 10:30 PM UTC
**Owner:** Gary Lee

---
[tickets] [opensaf:tickets] #1765 ckpt : saCkptCheckpointOpen api call failed and returing SA_AIS_ERR_LIBRARY after couple of failover
Apart from the ERR_LIBRARY return value, CKPT open also fails randomly with ERR_NO_RESOURCES after failover.

---

** [tickets:#1765] ckpt : saCkptCheckpointOpen api call failed and returing SA_AIS_ERR_LIBRARY after couple of failover**

**Status:** accepted
**Milestone:** 5.0.2
**Created:** Fri Apr 15, 2016 06:26 AM UTC by Ritu Raj
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** Pham Hoang Nhat
**Attachments:**

- [ckpt_trace.tar.bz2](https://sourceforge.net/p/opensaf/tickets/1765/attachment/ckpt_trace.tar.bz2) (3.2 MB; application/x-bzip)

setup:
Changeset- 7436
Version - opensaf 5.0 FC
4 nodes configured with single PBE and a load of 30K objects

* Issue observed :
The saCkptCheckpointOpen API call fails and returns SA_AIS_ERR_LIBRARY after a couple of failovers

* Steps to reproduce:
> Ran a couple of failovers and observed that saCkptCheckpointOpen failed.
> Below is the snippet of the agent trace:

Apr 15 8:08:50.275115 cpa [28883:cpa_mds.c:0776] << cpa_mds_msg_sync_send: retval = 1
Apr 15 8:08:50.275128 cpa [28883:cpa_api.c:1043] T4 Cpa CkptOpen failed with return value:2,ckptHandle:63
Apr 15 8:08:50.275141 cpa [28883:cpa_api.c:1146] << **saCkptCheckpointOpen: API return code = 2**

> Traces of both controllers and the agent trace of the payload are attached.

---
[tickets] [opensaf:tickets] #2112 amfd: multiple SUs incorrectly assigned to single node
The attached file is an amfd trace that shows how a duplicated node gets mapped to SUs.

Before headless, the node <-> SU map was as below:
SU1: PL3, SU2: PL4, SU3: PL5, SU4: SC1, SU5: SC2

After headless, the nodes were read from IMM and the order of the initialization phase was reversed, so the node <-> SU map changed:
SU5: PL3, SU4: PL4, SU3: PL5, SU2: SC1, SU1: SC2

In the headless sync phase, the node of SU2 was updated to PL4, because SU2 was actually mapped to PL4 before headless. Eventually the map contains both SU2: PL4 and SU4: PL4.

The problem happens because the order of the SUs read from IMM was reversed and saAmfSUHostedByNode was empty, which caused amfd to pick the node to assign to each SU arbitrarily.

A solution could be:
(1) Make both the active and standby amfd become early implementer/applier so that saAmfSUHostedByNode is read properly after headless
Or,
(2) Just before reading the headless sync information, amfd reads saAmfSUHostedByNode (again) and updates the SU. At this point, saAmfSUHostedByNode should be read as a non-empty value, and the saAmfSUHostedByNode mapping made in the initialization phase is not reliable.

Attachments:

- [osafamfd](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/76ef399e/9d50/attachment/osafamfd) (5.9 MB; application/octet-stream)

---

** [tickets:#2112] amfd: multiple SUs incorrectly assigned to single node**

**Status:** assigned
**Milestone:** 5.1.1
**Created:** Tue Oct 11, 2016 11:56 PM UTC by Gary Lee
**Last Updated:** Wed Oct 12, 2016 02:16 AM UTC
**Owner:** Minh Hon Chau

Multiple SUs are assigned to a single node after SC absence.

To reproduce:

0) load nwayactive demo
1) stop SCs
2) restart SCs

The following is observed:

root@SC-1:~# immlist safSu=SU4,safSg=AmfDemo,safApp=AmfDemo2
...
saAmfSUHostedByNode SA_NAME_T safAmfNode=PL-4,safAmfCluster=myAmfCluster (42)

root@SC-1:~# immlist safSu=SU2,safSg=AmfDemo,safApp=AmfDemo2
...
saAmfSUHostedByNode SA_NAME_T safAmfNode=PL-4,safAmfCluster=myAmfCluster (42)

SU2 is indeed assigned to PL-4, but SU4 was assigned to one of the SCs and is not assigned to PL-4.

Operations on SU4 will lead to a crash of amfnd on PL-4.

---
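The mapping problem described in the comment on #2112 can be replayed with a few lines of toy code. This is only a model of the description, not amfd logic: `assign_nodes`, `apply_headless_sync` and `duplicated_nodes` are hypothetical helpers, and "re-read before sync" below stands in for option (2) of the comment.

```python
def assign_nodes(su_order, free_nodes):
    """Initialization phase: give each SU without a recorded host the next free node."""
    return dict(zip(su_order, free_nodes))


def apply_headless_sync(mapping, synced):
    """Sync phase: overwrite the host of every SU reported by a node director."""
    updated = dict(mapping)
    updated.update(synced)
    return updated


def duplicated_nodes(mapping):
    """Return the set of nodes that host more than one SU."""
    seen, dups = set(), set()
    for node in mapping.values():
        (dups if node in seen else seen).add(node)
    return dups
```

Using the values from the comment: assigning `["SU5", "SU4", "SU3", "SU2", "SU1"]` to `["PL3", "PL4", "PL5", "SC1", "SC2"]` and then syncing `{"SU2": "PL4"}` leaves PL4 hosting both SU2 and SU4. Starting instead from the pre-headless map (SU1: PL3 through SU5: SC2), the same sync produces no duplicate.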
[tickets] [opensaf:tickets] #2093 saLogStreamOpen_2 api returns SA_AIS_ERR_BAD_OPERATION during si-swap operation
- **status**: review --> fixed
- **assigned_to**: Vu Minh Nguyen --> nobody
- **Comment**:

changeset: 8216:90192f4b8e98
tag: tip
parent: 8213:32824c74a736
user: Vu Minh Nguyen
date: Wed Oct 12 16:03:09 2016 +0700
summary: log: fix saLogStreamOpen_2 returns SA_AIS_ERR_BAD_OPERATION during si-swap [#2093]

changeset: 8215:f86e8509a000
branch: opensaf-5.1.x
parent: 8212:38f928e934cc
user: Vu Minh Nguyen
date: Wed Oct 12 16:03:09 2016 +0700
summary: log: fix saLogStreamOpen_2 returns SA_AIS_ERR_BAD_OPERATION during si-swap [#2093]

changeset: 8214:7614e1f897a8
branch: opensaf-5.0.x
parent: 8211:9a42c66ed888
user: Vu Minh Nguyen
date: Thu Oct 13 11:28:42 2016 +0700
summary: log: fix saLogStreamOpen_2 returns SA_AIS_ERR_BAD_OPERATION during si-swap [#2093]

---

** [tickets:#2093] saLogStreamOpen_2 api returns SA_AIS_ERR_BAD_OPERATION during si-swap operation**

**Status:** fixed
**Milestone:** 5.0.2
**Created:** Wed Oct 05, 2016 06:48 AM UTC by Ritu Raj
**Last Updated:** Wed Oct 05, 2016 11:04 AM UTC
**Owner:** nobody
**Attachments:**

- [log_agent.trace](https://sourceforge.net/p/opensaf/tickets/2093/attachment/log_agent.trace) (52.7 kB; application/octet-stream)

# Environment details
OS : Suse 64bit
Changeset : 7997 ( 5.1.FC)
Setup : 4 nodes ( 2 controllers and 2 payloads with headless feature disabled & PBE disabled )

# Summary
saLogStreamOpen_2 api returns SA_AIS_ERR_BAD_OPERATION during si-swap operation while opening a previously closed app stream with the same properties

# Steps followed & Observed behaviour
1. Open an application stream
2. Close the stream
3. Perform si-swap operation
4. Open the previously closed stream again with the same properties during the switchover operation
>> saLogStreamOpen_2 api returns Bad return status!!! rc = 20

Below is the agent trace:

Oct 5 6:36:19.674354 lga [9272:ntfa_mds.c:0583] << ntfa_mds_enc
Oct 5 6:36:19.676361 lga [9272:ntfa_mds.c:0827] >> ntfa_mds_dec
Oct 5 6:36:19.676399 lga [9272:ntfa_mds.c:0857] T2 NTFSV_NTFA_API_RESP_MSG rc = 1
Oct 5 6:36:19.676414 lga [9272:ntfa_mds.c:0936] << ntfa_mds_dec
Oct 5 6:36:19.676815 lga [9272:ntfa_mds.c:1202] << ntfa_mds_msg_sync_send
Oct 5 6:36:19.676863 lga [9272:ntfa_api.c:2128] T1 subscriptionId from server 18681
Oct 5 6:36:19.676893 lga [9272:ntfa_api.c:2170] << saNtfNotificationSubscribe
Oct 5 6:36:31.472555 lga [9272:lga_api.c:0774] >> saLogStreamOpen_2
Oct 5 6:36:31.472598 lga [9272:lga_api.c:0613] >> validate_open_params
Oct 5 6:36:31.472608 lga [9272:lga_api.c:0740] << validate_open_params
Oct 5 6:36:31.472618 lga [9272:lga_api.c:0082] >> populate_open_params
Oct 5 6:36:31.472624 lga [9272:lga_api.c:0107] << populate_open_params
Oct 5 6:36:31.472643 lga [9272:lga_mds.c:1285] >> lga_mds_msg_sync_send
Oct 5 6:36:31.472672 lga [9272:lga_mds.c:0706] >> lga_mds_enc
Oct 5 6:36:31.472682 lga [9272:lga_mds.c:0737] T2 msgtype: 0
Oct 5 6:36:31.472688 lga [9272:lga_mds.c:0750] T2 api_info.type: 2
Oct 5 6:36:31.472694 lga [9272:lga_mds.c:0122] >> lga_enc_lstr_open_sync_msg
Oct 5 6:36:31.472702 lga [9272:lga_mds.c:0247] << lga_enc_lstr_open_sync_msg
Oct 5 6:36:31.472707 lga [9272:lga_mds.c:0778] << lga_mds_enc
Oct 5 6:36:31.579120 lga [9272:lga_mds.c:0591] >> lga_mds_svc_evt
Oct 5 6:36:31.579154 lga [9272:lga_mds.c:0595] TR lga_mds_svc_evt NCSMDS_NO_ACTIVE
Oct 5 6:36:31.579165 lga [9272:lga_mds.c:0599] TR NCSMDS_NO_ACTIVE
Oct 5 6:36:31.579174 lga [9272:lga_mds.c:0650] << lga_mds_svc_evt
Oct 5 6:36:31.583167 lga [9272:lga_mds.c:0977] >> lga_mds_dec
Oct 5 6:36:31.583211 lga [9272:lga_mds.c:1009] T2 LGSV_LGA_API_RESP_MSG
Oct 5 6:36:31.583227 lga [9272:lga_mds.c:1060] << lga_mds_dec
Oct 5 6:36:31.583306 lga [9272:lga_mds.c:1312] << lga_mds_msg_sync_send
Oct 5 6:36:31.583323 lga [9272:lga_api.c:0921] TR Bad return status!!! rc = 20
Oct 5 6:36:31.583336 lga [9272:lga_api.c:0970] << saLogStreamOpen_2
Oct 5 6:36:32.492387 lga [9272:ntfa_mds.c:0388] T2 NTFA Rcvd MDS subscribe evt from svc 28
Oct 5 6:36:32.492427 lga [9272:ntfa_mds.c:0398] TR NTFS down

**Notes:** Agent trace attached
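The agent trace in this ticket can be read mechanically. What follows is a minimal Python sketch, not part of OpenSAF, that pairs the `>>` (enter) and `<<` (leave) markers and reports which traced function was still open when a `Bad return status` line was logged. The names `find_failed_calls`, `SA_AIS_NAMES` and `SAMPLE` are hypothetical and introduced only for illustration; the name table covers only the return codes that appear in these tickets, and the sample is a shortened excerpt of the trace.

```python
import re

# Subset of SaAisErrorT values seen in these tickets (assumption: the full
# enum lives in the SAF AIS headers and is intentionally not reproduced here).
SA_AIS_NAMES = {
    1: "SA_AIS_OK",
    9: "SA_AIS_ERR_BAD_HANDLE",
    20: "SA_AIS_ERR_BAD_OPERATION",
}

# Matches the "[pid:file:line] message" part of an agent trace line.
TRACE_LINE = re.compile(r"\[(\d+):([\w.]+):(\d+)\]\s+(.*)$")
RC = re.compile(r"rc\s*=\s*(\d+)")

SAMPLE = """\
Oct 5 6:36:31.472555 lga [9272:lga_api.c:0774] >> saLogStreamOpen_2
Oct 5 6:36:31.472643 lga [9272:lga_mds.c:1285] >> lga_mds_msg_sync_send
Oct 5 6:36:31.579120 lga [9272:lga_mds.c:0591] >> lga_mds_svc_evt
Oct 5 6:36:31.579154 lga [9272:lga_mds.c:0595] TR lga_mds_svc_evt NCSMDS_NO_ACTIVE
Oct 5 6:36:31.579174 lga [9272:lga_mds.c:0650] << lga_mds_svc_evt
Oct 5 6:36:31.583306 lga [9272:lga_mds.c:1312] << lga_mds_msg_sync_send
Oct 5 6:36:31.583323 lga [9272:lga_api.c:0921] TR Bad return status!!! rc = 20
Oct 5 6:36:31.583336 lga [9272:lga_api.c:0970] << saLogStreamOpen_2
"""


def find_failed_calls(trace_text):
    """Return (function, rc, rc_name) for each 'Bad return status' line."""
    stack = []      # functions entered (">>") and not yet left ("<<")
    failures = []
    for raw in trace_text.splitlines():
        match = TRACE_LINE.search(raw)
        if not match:
            continue
        body = match.group(4).strip()
        if body.startswith(">>"):
            stack.append(body[2:].strip())
        elif body.startswith("<<"):
            name = body[2:].strip()
            if name in stack:
                # Unwind to and including the matching entry.
                while stack and stack.pop() != name:
                    pass
        elif "Bad return status" in body:
            rc_match = RC.search(body)
            if rc_match and stack:
                rc = int(rc_match.group(1))
                failures.append((stack[-1], rc, SA_AIS_NAMES.get(rc, "unknown")))
    return failures


if __name__ == "__main__":
    for func, rc, name in find_failed_calls(SAMPLE):
        print(f"{func} failed with rc={rc} ({name})")
```

On the excerpt, the only open call when the bad status is logged is `saLogStreamOpen_2`, and the `NCSMDS_NO_ACTIVE` event sits between the request and the response. That matches the report that the failure happens while the si-swap is in progress.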
[tickets] [opensaf:tickets] #2112 amfd: multiple SUs incorrectly assigned to single node
Hi Minh,

Since this is a defect, I think as of now we can take approach (2). For (1), an enhancement can be raised for 5.2 FC so that it will go through proper testing post 5.2 FC tag.

Thanks,
Praveen

---

** [tickets:#2112] amfd: multiple SUs incorrectly assigned to single node**

**Status:** assigned
**Milestone:** 5.1.1
**Created:** Tue Oct 11, 2016 11:56 PM UTC by Gary Lee
**Last Updated:** Thu Oct 13, 2016 02:06 AM UTC
**Owner:** Minh Hon Chau

Multiple SUs are assigned to a single node after SC absence.

To reproduce:

0) load nwayactive demo
1) stop SCs
2) restart SCs

The following is observed:

root@SC-1:~# immlist safSu=SU4,safSg=AmfDemo,safApp=AmfDemo2
...
saAmfSUHostedByNode SA_NAME_T safAmfNode=PL-4,safAmfCluster=myAmfCluster (42)

root@SC-1:~# immlist safSu=SU2,safSg=AmfDemo,safApp=AmfDemo2
...
saAmfSUHostedByNode SA_NAME_T safAmfNode=PL-4,safAmfCluster=myAmfCluster (42)

SU2 is indeed assigned to PL-4, but SU4 was assigned to one of the SCs and is not assigned to PL-4. Operations on SU4 will lead to a crash of amfnd on PL-4.
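The check done by hand in this ticket (comparing `saAmfSUHostedByNode` across SUs) could be scripted. The following is a rough Python sketch that works on captured `immlist` output only; it does not call IMM. The helper names `hosted_node` and `sus_sharing_a_node` are hypothetical, and the SU1/PL-3 entry in the sample is invented to show a non-conflicting case. Whether two SUs reporting the same node is actually wrong depends on the configuration; in this ticket it was, because SU4 was really hosted on an SC.

```python
import re
from collections import defaultdict

# Attribute name and type are taken from the immlist output quoted in the
# ticket; the whitespace between them is optional because some archives
# fuse the two columns together.
HOSTED_BY = re.compile(r"saAmfSUHostedByNode\s*SA_NAME_T\s+(\S+)")


def hosted_node(immlist_output):
    """Return the node DN from one SU's immlist output, or None if absent."""
    match = HOSTED_BY.search(immlist_output)
    return match.group(1) if match else None


def sus_sharing_a_node(outputs_by_su):
    """Map node DN -> sorted SU DNs, for nodes reported by more than one SU."""
    by_node = defaultdict(list)
    for su_dn, output in outputs_by_su.items():
        node = hosted_node(output)
        if node is not None:
            by_node[node].append(su_dn)
    return {node: sorted(sus) for node, sus in by_node.items() if len(sus) > 1}


# Captured outputs keyed by SU DN. SU2 and SU4 are from the ticket;
# SU1 on PL-3 is a made-up entry for contrast.
CAPTURED = {
    "safSu=SU4,safSg=AmfDemo,safApp=AmfDemo2":
        "saAmfSUHostedByNode SA_NAME_T safAmfNode=PL-4,safAmfCluster=myAmfCluster (42)",
    "safSu=SU2,safSg=AmfDemo,safApp=AmfDemo2":
        "saAmfSUHostedByNode SA_NAME_T safAmfNode=PL-4,safAmfCluster=myAmfCluster (42)",
    "safSu=SU1,safSg=AmfDemo,safApp=AmfDemo2":
        "saAmfSUHostedByNode SA_NAME_T safAmfNode=PL-3,safAmfCluster=myAmfCluster (42)",
}

if __name__ == "__main__":
    for node, sus in sus_sharing_a_node(CAPTURED).items():
        print(node, "<-", ", ".join(sus))
```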
[tickets] [opensaf:tickets] #2087 SMF: smfnd asserted on active controller with long dn when executing the campaign.
- **status**: fixed --> assigned

---

** [tickets:#2087] SMF: smfnd asserted on active controller with long dn when executing the campaign.**

**Status:** assigned
**Milestone:** 5.0.1
**Created:** Fri Sep 30, 2016 11:47 AM UTC by Madhurika Koppula
**Last Updated:** Wed Oct 12, 2016 06:03 AM UTC
**Owner:** Neelakanta Reddy
**Attachments:**

- [smfnd_assert.tgz](https://sourceforge.net/p/opensaf/tickets/2087/attachment/smfnd_assert.tgz) (936.2 kB; application/octet-stream)

**Environment Details:**
OS : Suse 64bit
Setup : 4 nodes ( 2 controllers and 2 payloads with headless feature disabled & PBE disabled ).

**Summary**: smfnd asserted on active controller with long dn when executing the campaign.

**Steps followed & Observed behaviour:**

1) Initially brought up four nodes and all the nodes joined the cluster (PBE is disabled)
2) Enabled long dn as follows.
immcfg -a longDnsAllowed=1 opensafImm=opensafImm,safApp=safImmService
3) Successfully created the campaign object with long dn
immcfg -c SaSmfCampaign safSmfCampaign=campaign_long_dn_d,safApp=safSmfService -a SmfCmpgFileUri=/hostfs/campaign85.xml
4) Executed the campaign.

Observations: smfnd asserted on active controller. Campaign execution failed.

Below is the syslog excerpt from the active controller (SC-1):

Oct 9 22:07:06 SCALE_SLOT-21 osafsmfd[2816]: NO CAMP: Calling configured smfBundleCheckCmd for each bundle existing in IMM, to be installed or removed by the campaign
**Oct 9 22:07:06 SCALE_SLOT-21 osafsmfnd[2810]: osaf_extended_name.c:144: osaf_extended_name_length: Assertion 'osaf_extended_names_enabled && length >= SA_MAX_UNEXTENDED_NAME_LENGTH' failed.**
Oct 9 22:07:06 SCALE_SLOT-21 osafamfnd[2794]: NO 'safSu=SC-1,safSg=NoRed,safApp=OpenSAF' component restart probation timer started (timeout: 600 ns)
Oct 9 22:07:06 SCALE_SLOT-21 osafamfnd[2794]: NO Restarting a component of 'safSu=SC-1,safSg=NoRed,safApp=OpenSAF' (comp restart count: 1)
Oct 9 22:07:06 SCALE_SLOT-21 osafamfnd[2794]: NO 'safComp=SMFND,safSu=SC-1,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'componentRestart'
Oct 9 22:07:06 SCALE_SLOT-21 osafsmfnd[3072]: Started

Attachments:
1) Syslog, smf, imm traces of active controller.
[tickets] [opensaf:tickets] #2087 SMF: smfnd asserted on active controller with long dn when executing the campaign.
- **status**: assigned --> fixed
- **Comment**:

changeset: 8217:810f6dde01cb
branch: opensaf-5.0.x
tag: tip
parent: 8214:7614e1f897a8
user: Neelakanta Reddy
date: Thu Oct 13 11:16:49 2016 +0530
summary: smf:fixed build error [#2087]

---

** [tickets:#2087] SMF: smfnd asserted on active controller with long dn when executing the campaign.**

**Status:** fixed
**Milestone:** 5.0.1
**Created:** Fri Sep 30, 2016 11:47 AM UTC by Madhurika Koppula
**Last Updated:** Thu Oct 13, 2016 05:43 AM UTC
**Owner:** Neelakanta Reddy
**Attachments:**

- [smfnd_assert.tgz](https://sourceforge.net/p/opensaf/tickets/2087/attachment/smfnd_assert.tgz) (936.2 kB; application/octet-stream)
[tickets] [opensaf:tickets] #2098 amfnd: amfnd doesn't exit at opensafd stop
- **labels**: --> TERM_FAILED, SHUTDOWN
- **status**: accepted --> review

---

** [tickets:#2098] amfnd: amfnd doesn't exit at opensafd stop**

**Status:** review
**Milestone:** 5.0.2
**Labels:** TERM_FAILED SHUTDOWN
**Created:** Thu Oct 06, 2016 09:02 AM UTC by Hans Nordebäck
**Last Updated:** Wed Oct 12, 2016 11:06 AM UTC
**Owner:** Praveen

amfnd doesn't exit at opensafd stop. This problem can be reproduced in UML using the AmfDemo application, nwayactive:

1) Change the AppConfig-nwayactive.xml file to:
saAmfCtCompCategory 8
2) Make the following change to the amf_demo_script:
start() {
return 0
}
build_uml install_testprog
opensaf start
3) unlock-in, unlock all 5 amf demo SUs
4) At PL-3 modify /opt/amf_demo_script :
stop() {
while true
do
sleep 2
done
}
Run /etc/init.d/opensafd stop and the following problem occurs:

Oct 6 10:45:37 PL-3 local0.notice osafamfnd[423]: NO Reason:'Script did not exit within time'
Oct 6 10:45:37 PL-3 local0.warn osafamfnd[423]: WA 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo2' Presence State TERMINATING => TERMINATION_FAILED
Oct 6 10:45:37 PL-3 local0.notice osafamfnd[423]: NO 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo2' Presence State TERMINATING => TERMINATION_FAILED
Oct 6 10:46:18 PL-3 user.notice opensafd: amfnd has not yet exited, killing it forcibly.
Oct 6 10:46:18 PL-3 local0.alert osafclmna[414]: AL AMF Node Director is down, terminate this process
Oct 6 10:46:18 PL-3 local0.crit osafamfwd[489]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: AMF unexpectedly crashed, OwnNodeId = 131855, SupervisionTime = 60
Oct 6 10:46:18 PL-3 user.notice opensaf_reboot: Rebooting local node; timeout=60
O
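The reproduction for #2098 relies on a stop script that never returns, which the log reports as 'Script did not exit within time'. As a rough illustration of that pattern (a bounded wait on a lifecycle script), here is a small Python sketch. It is not how amfnd implements it; `run_with_deadline` is a hypothetical helper, and the child commands are stand-ins for a real stop script.

```python
import subprocess
import sys


def run_with_deadline(argv, timeout_s):
    """Run a command, giving up after timeout_s seconds.

    Returns ("exited", returncode) if the command finished in time, or
    ("timeout", None) if it did not. subprocess.run kills the child
    before raising TimeoutExpired, so no stray process is left behind.
    """
    try:
        completed = subprocess.run(argv, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return ("timeout", None)
    return ("exited", completed.returncode)


if __name__ == "__main__":
    # Stand-in for a stop script that loops forever, as in step 4.
    hanging = [sys.executable, "-c", "import time; time.sleep(30)"]
    # Stand-in for a script that returns promptly with a non-zero status.
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]

    print(run_with_deadline(hanging, 0.5))
    print(run_with_deadline(failing, 30))
```

The point of the ticket is what happens after the timeout: the component ends up in TERMINATION_FAILED, and amfnd itself then fails to exit when opensafd is stopped.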
[tickets] [opensaf:tickets] #538 AMF: fail-over assignments despite comps in TERM-FAILED state
- **status**: unassigned --> assigned
- **assigned_to**: Praveen
- **Part**: - --> nd

---

** [tickets:#538] AMF: fail-over assignments despite comps in TERM-FAILED state**

**Status:** assigned
**Milestone:** 5.0.2
**Created:** Fri Aug 09, 2013 06:43 AM UTC by Hans Feldt
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** Praveen

AMF currently performs fail-over recovery action although a component is in termination-failed presence state. This can lead to severe inconsistencies for the application.

The specification also clearly states how this should work in 4.8:

"If the component and any of its contained components (for a container component) were assigned the active HA state for some component service instances when the CLEANUP command was executed, and semantics of the redundancy model of its enclosing service group guarantee that at a point in time only one component can be in the active HA state for a given component service instance, the failure to terminate that component prevents the Availability Management Framework from assigning to another component the active HA state for these component service instances (and by the same token prevents the assignment of the active HA state to other service units for the service instances that contain the involved CSIs). In this case, the service instances will stay unassigned until an administrative action is performed to terminate the failed component."

Can be tested by running the AMF 2N sa-aware sample app and modifying the cleanup script to do "exit 1" which gives this effect when the active component is killed:

Aug 9 08:40:01 Vostro osafamfnd[11307]: NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' faulted due to 'avaDown' : Recovery is 'componentRestart'
Aug 9 08:40:01 Vostro osafamfnd[11307]: NO Cleanup of 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' failed
Aug 9 08:40:01 Vostro osafamfnd[11307]: NO Reason:'Exec of script success, but script exits with non-zero status'
Aug 9 08:40:01 Vostro osafamfnd[11307]: NO Exit code: 1
Aug 9 08:40:01 Vostro osafamfnd[11307]: NO Component Failover trigerred for 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1': Failed component: 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
Aug 9 08:40:01 Vostro osafamfnd[11307]: NO 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State INSTANTIATED => TERMINATION_FAILED
Aug 9 08:40:01 Vostro osafamfnd[11307]: NO Assigning 'safSi=AmfDemo,safApp=AmfDemo1' QUIESCED to 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
Aug 9 08:40:01 Vostro osafamfnd[11307]: NO Assigned 'safSi=AmfDemo,safApp=AmfDemo1' QUIESCED to 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
Aug 9 08:40:01 Vostro osafamfnd[11307]: NO Assigning 'safSi=AmfDemo,safApp=AmfDemo1' ACTIVE to 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1'
Aug 9 08:40:01 Vostro amf_demo[11620]: CSI Set - HAState Active for all assigned CSIs
Aug 9 08:40:01 Vostro osafamfnd[11307]: NO Assigned 'safSi=AmfDemo,safApp=AmfDemo1' ACTIVE to 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1'
Aug 9 08:40:01 Vostro osafamfnd[11307]: NO Removing 'safSi=AmfDemo,safApp=AmfDemo1' from 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
Aug 9 08:40:01 Vostro osafamfnd[11307]: NO Removed 'safSi=AmfDemo,safApp=AmfDemo1' from 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
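The rule quoted from the specification in #538 can be written as a small predicate. This is a simplified Python sketch of the quoted paragraph only, not AMF's actual decision logic; the function name and parameters are hypothetical, and the presence state names are the ones that appear in the logs.

```python
from enum import Enum


class Presence(Enum):
    INSTANTIATED = "INSTANTIATED"
    TERMINATING = "TERMINATING"
    TERMINATION_FAILED = "TERMINATION_FAILED"


def may_assign_active_elsewhere(presence, had_active, single_active_model):
    """Decide whether the active HA state may move to another component.

    presence            -- presence state of the faulted component
    had_active          -- it held the active HA state for some CSIs when
                           the cleanup command was executed
    single_active_model -- the redundancy model guarantees at most one
                           active component per CSI (for example 2N)

    Per the quoted text, all three conditions together block the
    reassignment until an administrator terminates the failed component.
    """
    blocked = (
        presence is Presence.TERMINATION_FAILED
        and had_active
        and single_active_model
    )
    return not blocked


if __name__ == "__main__":
    # The situation in the log: active component of a 2N SG, cleanup failed.
    print(may_assign_active_elsewhere(Presence.TERMINATION_FAILED, True, True))
```

Under this reading, the log in the ticket shows the behaviour the rule forbids: SU1 goes to TERMINATION_FAILED and the SI is still assigned ACTIVE to SU2.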