[tickets] [opensaf:tickets] #2011 ckptd seg faulted on active controller when trying to create checkpoint
- **status**: accepted --> not-reproducible
- **Milestone**: 5.18.09 --> never

---

**[tickets:#2011] ckptd seg faulted on active controller when trying to create checkpoint**

**Status:** not-reproducible
**Milestone:** never
**Created:** Thu Sep 08, 2016 07:28 AM UTC by Ritu Raj
**Last Updated:** Mon Sep 03, 2018 07:18 AM UTC
**Owner:** Mohan Kanakam
**Attachments:**

- [ckptd_bt](https://sourceforge.net/p/opensaf/tickets/2011/attachment/ckptd_bt) (2.6 kB; application/octet-stream)
- [messages-20160907.bz2](https://sourceforge.net/p/opensaf/tickets/2011/attachment/messages-20160907.bz2) (380.1 kB; application/x-bzip)
- [syslog2](https://sourceforge.net/p/opensaf/tickets/2011/attachment/syslog2) (1.4 MB; application/octet-stream)

Environment details

OS : Suse 64bit
Changeset : 7997 ( 5.1.FC)
Setup : 4 nodes ( 2 controllers and 2 payloads with headless feature disabled & 1PBE enabled with 30K objects )

Summary : ckptd crashed on the active controller when trying to create a checkpoint during failover

Steps followed & Observed behaviour

1. Initially ran some CKPT test scenarios, along with failovers. After the test scenarios ended, the following IMM objects and replicas had not been deleted:

sofo-s3:/dev/shm # immfind | grep 101
safCkpt=all_replicas_ckpt_name_101
safCkpt=collocated_ckpt_name_101
safReplica=safNode=PL-3\,safCluster=myClmCluster,safCkpt=all_replicas_ckpt_name_101
safReplica=safNode=PL-3\,safCluster=myClmCluster,safCkpt=collocated_ckpt_name_101
safReplica=safNode=SC-1\,safCluster=myClmCluster,safCkpt=all_replicas_ckpt_name_101
safReplica=safNode=SC-2\,safCluster=myClmCluster,safCkpt=all_replicas_ckpt_name_101

2. When a checkpoint was created with the earlier name (all_replicas_ckpt_name_101), the following error was observed in syslog. CkptOpen also failed with ERR_LIBRARY.

>> saImmOiRtObjectCreate_2 failed with error = 14

Sep 7 17:21:11 sofo-s2 osafimmnd[2137]: NO PBE-OI established on this SC. Dumping incrementally to file imm.db
Sep 7 17:21:12 sofo-s2 osafckptd[2284]: ER create_runtime_ckpt_object - saImmOiRtObjectCreate_2 failed with error = 14
Sep 7 17:21:12 sofo-s2 osafckptd[2284]: ER create runtime ckpt object failed with error: 14
Sep 7 17:21:12 sofo-s2 osafckptd[2284]: ER cpd db add ckpt_node failed for ckpt_id:2

4. After some time ckptd seg faulted on the active controller:

Sep 7 17:21:43 sofo-s2 osafamfnd[2187]: NO 'safComp=CPD,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'nodeFailfast'
Sep 7 17:21:43 sofo-s2 osafamfnd[2187]: ER safComp=CPD,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery is:nodeFailfast
Sep 7 17:21:43 sofo-s2 osafamfnd[2187]: Rebooting OpenSAF NodeId = 131599 EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 131599, SupervisionTime = 60
Sep 7 17:21:43 sofo-s2 opensaf_reboot: Rebooting local node; timeout=60

5. Below is the bt:

0- 0x7fbbd5ffcb20 in memcmp () from /lib64/libc.so.6
1- 0x7fbbd7a10929 in ncs_patricia_tree_get (pTree=0x67b4c8, pKey=0x7d22531c "\017\001\002") at patricia.c:435
2- 0x0040800d in cpd_cpnd_info_node_get (cpnd_tree=0x67b4c8, dest=0x67ec60, cpnd_info_node=0x7d225350) at cpd_db.c:706
3- 0x0040cd56 in cpd_evt_proc_mds_evt (cb=0x67b340, evt=0x67ec50) at cpd_evt.c:1378
4- 0x004091cb in cpd_process_evt (evt=0x67ec40) at cpd_evt.c:107
5- 0x0041185f in cpd_main_process (cb=0x67b340) at cpd_init.c:661
6- 0x00411b89 in main (argc=1, argv=0x7d225578) at cpd_main.c:74

Notes:

1. Syslog attached
2. bt attached
3. ckptd traces not enabled

---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/opensaf/admin/tickets/options.
Or, if this is a mailing list, you can unsubscribe from the mailing list.-- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot___ Opensaf-tickets mailing list Opensaf-tickets@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-tickets
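The numeric codes in the syslog excerpts above are `SaAisErrorT` values from the SA Forum AIS headers. As a triage aid, the following is a minimal sketch that pulls the codes out of "failed with error" lines and names them. The helper names (`SA_AIS_ERRORS`, `ERROR_RE`, `decode_errors`) are hypothetical and not part of OpenSAF, the table is only a subset of the enum, and the regular expression is fitted to the two message wordings quoted in this ticket. Under that mapping, 14 is `SA_AIS_ERR_EXIST`, which would be consistent with the leftover IMM objects listed in step 1, though the ticket itself does not state a root cause.

```python
import re

# Subset of SaAisErrorT values as defined in the SA Forum AIS header saAis.h.
SA_AIS_ERRORS = {
    1: "SA_AIS_OK",
    2: "SA_AIS_ERR_LIBRARY",
    5: "SA_AIS_ERR_TIMEOUT",
    6: "SA_AIS_ERR_TRY_AGAIN",
    12: "SA_AIS_ERR_NOT_EXIST",
    14: "SA_AIS_ERR_EXIST",
}

# Assumption: matches only the wordings seen in this ticket,
# "failed with error = 14" and "failed with error: 14".
ERROR_RE = re.compile(r"failed with error\s*[:=]\s*(\d+)")


def decode_errors(lines):
    """Return a (code, name) pair for every 'failed with error' line."""
    found = []
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            code = int(match.group(1))
            found.append((code, SA_AIS_ERRORS.get(code, "UNKNOWN")))
    return found


# Two lines copied from the syslog excerpt in this ticket.
sample = [
    "Sep 7 17:21:12 sofo-s2 osafckptd[2284]: ER create_runtime_ckpt_object - "
    "saImmOiRtObjectCreate_2 failed with error = 14",
    "Sep 7 17:21:12 sofo-s2 osafckptd[2284]: ER create runtime ckpt object "
    "failed with error: 14",
]

for code, name in decode_errors(sample):
    print(code, name)
```

The same table also covers the `ERR_LIBRARY` (2) result reported for CkptOpen; codes missing from the subset are reported as `UNKNOWN` rather than guessed.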
[tickets] [opensaf:tickets] #2011 ckptd seg faulted on active controller when trying to create checkpoint
I tried to reproduce this ticket on OpenSAF 5.18.04.

Setup : 4 nodes ( 2 controllers and 2 payloads with headless feature disabled & 1PBE enabled with 30K objects )

1) In this setup, SC-1 is active and SC-2 is standby.

root@mohan-VirtualBox:~# amf-state siass
safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=STANDBY(2)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)

2) I ran the demo application, which creates checkpoints continuously, on PL-3.

root@mohan-VirtualBox:/home/mohan/ticket2011# ./tkt
*** Demonstrating Checkpoint Service ***
Initialising With Checkpoint Service
saCkptInitialize passed
Opening Checkpoint = safCkpt=DemoCkpt000,safApp=safCkptService with create flags
saCkptCheckpointOpen passed
Opening Checkpoint = safCkpt=DemoCkpt001,safApp=safCkptService with create flags
saCkptCheckpointOpen passed
Opening Checkpoint = safCkpt=DemoCkpt002,safApp=safCkptService with create flags
saCkptCheckpointOpen passed
Opening Checkpoint = safCkpt=DemoCkpt003,safApp=safCkptService with create flags
saCkptCheckpointOpen passed
Opening Checkpoint = safCkpt=DemoCkpt004,safApp=safCkptService with create flags
saCkptCheckpointOpen passed
Opening Checkpoint = safCkpt=DemoCkpt005,safApp=safCkptService with create flags
saCkptCheckpointOpen failed 6
Opening Checkpoint = safCkpt=DemoCkpt006,safApp=safCkptService with create flags
saCkptCheckpointOpen failed 6
Opening Checkpoint = safCkpt=DemoCkpt007,safApp=safCkptService with create flags

3) While the demo application was running on PL-3, I performed a failover.

root@mohan-VirtualBox:~# /etc/init.d/opensafd stop
[ ok ] Stopping opensafd (via systemctl): opensafd.service.
root@mohan-VirtualBox:~# /etc/init.d/opensafd start
[ ok ] Starting opensafd (via systemctl): opensafd.service.
root@mohan-VirtualBox:~# amf-state siass
safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=STANDBY(2)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)

4) I observed that ckptd did not crash on the active controller.

Sep 3 12:11:23 mohan-VirtualBox opensafd: OpenSAF services successfully stopped
Sep 3 12:11:23 mohan-VirtualBox opensafd[3193]: Stopping OpenSAF Services: *
Sep 3 12:11:23 mohan-VirtualBox systemd[1]: Stopped OpenSAF daemon.
Sep 3 12:11:28 mohan-VirtualBox dhclient[3176]: DHCPDISCOVER on enp0s8 to 255.255.255.255 port 67 interval 14 (xid=0xdde9787c)
Sep 3 12:11:29 mohan-VirtualBox systemd[1]: Starting OpenSAF daemon...
Sep 3 12:11:29 mohan-VirtualBox opensafd: Starting OpenSAF Services(5.18.04 - c5117a898d331edb395434df56d630449a9ad7d2) (Using TIPC)
Sep 3 12:11:29 mohan-VirtualBox opensafd: Reboot file /var/log/opensaf/clm_cluster_reboot_in_progress not found, startup continue...
Sep 3 12:11:29 mohan-VirtualBox opensafd[3820]: logtrace: trace enabled to file 'opensafd.log', mask=0x0
Sep 3 12:11:30 mohan-VirtualBox kernel: [ 2141.109226] tipc: Activated (version 2.0.0)
Sep 3 12:11:30 mohan-VirtualBox kernel: [ 2141.109295] NET: Registered protocol family 30
Sep 3 12:11:30 mohan-VirtualBox kernel: [ 2141.109408] tipc: Started in
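
When comparing a reproduction run against the original report, one way to check a syslog for the crash signature is to look for the `osafamfnd` fault notice that accompanied the segfault in the original ticket. The sketch below assumes that message wording (taken from the 2016 syslog excerpt in the ticket description) and may not match other OpenSAF versions; `FAULT_RE` and `find_component_faults` are hypothetical names introduced for illustration only.

```python
import re

# Assumption: the wording follows the osafamfnd "NO '<comp>' faulted due to
# '<reason>'" line quoted in the ticket; other releases may log differently.
FAULT_RE = re.compile(
    r"osafamfnd\[\d+\]: NO '(?P<comp>[^']+)' faulted due to '(?P<reason>[^']+)'"
)


def find_component_faults(lines, component="safComp=CPD"):
    """Return (component DN, reason) for each fault notice of `component`."""
    hits = []
    for line in lines:
        match = FAULT_RE.search(line)
        # Compare only the first RDN so that e.g. safComp=CPND is not matched.
        if match and match.group("comp").split(",")[0] == component:
            hits.append((match.group("comp"), match.group("reason")))
    return hits


# Line copied from the original report (crash case).
crash_log = [
    "Sep 7 17:21:43 sofo-s2 osafamfnd[2187]: NO "
    "'safComp=CPD,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to "
    "'avaDown' : Recovery is 'nodeFailfast'",
]

# Lines copied from the reproduction attempt (no crash observed).
clean_log = [
    "Sep 3 12:11:23 mohan-VirtualBox opensafd: OpenSAF services successfully stopped",
    "Sep 3 12:11:29 mohan-VirtualBox systemd[1]: Starting OpenSAF daemon...",
]

print(find_component_faults(crash_log))
print(find_component_faults(clean_log))
```

An empty result only shows that this particular notice is absent from the lines scanned; it does not prove that the daemon stayed up, so a core dump or process check is still worth doing.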
[tickets] [opensaf:tickets] #2011 ckptd seg faulted on active controller when trying to create checkpoint
- **assigned_to**: Mohan Kanakam
- **Milestone**: future --> 5.18.09

**Status:** accepted
**Milestone:** 5.18.09
**Last Updated:** Mon Aug 28, 2017 08:45 AM UTC
**Owner:** Mohan Kanakam
[tickets] [opensaf:tickets] #2011 ckptd seg faulted on active controller when trying to create checkpoint
- **assigned_to**: A V Mahesh (AVM) --> nobody
- **Blocker**: --> False
- **Milestone**: 5.17.08 --> future

**Status:** accepted
**Milestone:** future
**Last Updated:** Mon Apr 10, 2017 01:40 PM UTC
**Owner:** nobody
[tickets] [opensaf:tickets] #2011 ckptd seg faulted on active controller when trying to create checkpoint
- **Milestone**: 4.7.2 --> 5.0.2

**Status:** accepted
**Milestone:** 5.0.2
**Last Updated:** Mon Sep 12, 2016 10:06 AM UTC
**Owner:** A V Mahesh (AVM)
[tickets] [opensaf:tickets] #2011 ckptd seg faulted on active controller when trying to create checkpoint
- **status**: unassigned --> accepted
- **assigned_to**: A V Mahesh (AVM)

**Status:** accepted
**Milestone:** 4.7.2
**Last Updated:** Thu Sep 08, 2016 07:28 AM UTC
**Owner:** A V Mahesh (AVM)
[tickets] [opensaf:tickets] #2011 ckptd seg faulted on active controller when trying to create checkpoint
**Status:** unassigned
**Milestone:** 4.7.2
**Created:** Thu Sep 08, 2016 07:28 AM UTC by Ritu Raj
**Last Updated:** Thu Sep 08, 2016 07:28 AM UTC
**Owner:** nobody