I tried to reproduce this ticket on OpenSAF 5.18.04.
Setup: 4 nodes (2 controllers and 2 payloads, with the headless feature
disabled and 1PBE enabled with 30K objects)
1) In this setup, SC-1 is active and SC-2 is standby:
root@mohan-VirtualBox:~# amf-state siass
safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=STANDBY(2)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
2) On PL-3, I ran the demo application, which creates checkpoints continuously:
root@mohan-VirtualBox:/home/mohan/ticket2011# ./tkt
*******************************************************************
Demonstrating Checkpoint Service
*******************************************************************
Initialising With Checkpoint Service....
saCkptInitialize passed
Opening Checkpoint = safCkpt=DemoCkpt000,safApp=safCkptService with create
flags....
saCkptCheckpointOpen passed
Opening Checkpoint = safCkpt=DemoCkpt001,safApp=safCkptService with create
flags....
saCkptCheckpointOpen passed
Opening Checkpoint = safCkpt=DemoCkpt002,safApp=safCkptService with create
flags....
saCkptCheckpointOpen passed
Opening Checkpoint = safCkpt=DemoCkpt003,safApp=safCkptService with create
flags....
saCkptCheckpointOpen passed
Opening Checkpoint = safCkpt=DemoCkpt004,safApp=safCkptService with create
flags....
saCkptCheckpointOpen passed
Opening Checkpoint = safCkpt=DemoCkpt005,safApp=safCkptService with create
flags....
saCkptCheckpointOpen failed 6
Opening Checkpoint = safCkpt=DemoCkpt006,safApp=safCkptService with create
flags....
saCkptCheckpointOpen failed 6
Opening Checkpoint = safCkpt=DemoCkpt007,safApp=safCkptService with create
flags....
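The numeric results in the output above (`failed 6`) are `SaAisErrorT` values. As a reading aid, the Python sketch below decodes them. The table is a subset of the values defined in the SA Forum `saAis.h` header; `SA_AIS_ERRORS`, `extract_code` and `decode` are hypothetical names used only for illustration, and the regular expression is an assumption tuned to the message formats quoted in this thread.

```python
import re

# Subset of SaAisErrorT values as defined in the SA Forum header saAis.h.
SA_AIS_ERRORS = {
    1: "SA_AIS_OK",
    2: "SA_AIS_ERR_LIBRARY",
    5: "SA_AIS_ERR_TIMEOUT",
    6: "SA_AIS_ERR_TRY_AGAIN",
    9: "SA_AIS_ERR_BAD_HANDLE",
    12: "SA_AIS_ERR_NOT_EXIST",
    14: "SA_AIS_ERR_EXIST",
}

# Assumed message shapes: "failed 6", "failed with error = 14",
# "failed with error: 14".
CODE_RE = re.compile(r"failed(?: with error\s*[=:])?\s*(\d+)")


def extract_code(line):
    """Return the numeric error code in a log line, or None if absent."""
    match = CODE_RE.search(line)
    return int(match.group(1)) if match else None


def decode(code):
    """Translate a numeric code to its SaAisErrorT name if known."""
    return SA_AIS_ERRORS.get(code, f"unknown ({code})")


if __name__ == "__main__":
    for line in (
        "saCkptCheckpointOpen failed 6",
        "saImmOiRtObjectCreate_2 failed with error = 14",
    ):
        print(line, "->", decode(extract_code(line)))
```

Code 6 is `SA_AIS_ERR_TRY_AGAIN`, which a client would normally retry; code 14, seen in the original report, is `SA_AIS_ERR_EXIST`.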
3) While the demo application was running on PL-3, I triggered a failover by
stopping and restarting opensafd on the active controller (SC-1):
root@mohan-VirtualBox:~# /etc/init.d/opensafd stop
[ ok ] Stopping opensafd (via systemctl): opensafd.service.
root@mohan-VirtualBox:~# /etc/init.d/opensafd start
[ ok ] Starting opensafd (via systemctl): opensafd.service.
root@mohan-VirtualBox:~# amf-state siass
safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=STANDBY(2)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
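The role swap can be confirmed mechanically from the two `amf-state siass` outputs. The Python sketch below is one way to do that, assuming the output format shown above (a `safSISU=` line followed by its `saAmfSISUHAState=` line); `parse_siass` and `su_with_state` are hypothetical helpers, not OpenSAF tools, and the sample strings are trimmed to the SC-2N assignments only.

```python
import re


def parse_siass(text):
    """Map each safSISU DN to its HA state name (e.g. 'ACTIVE')."""
    states = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("safSISU="):
            current = line
        elif line.startswith("saAmfSISUHAState=") and current:
            # "saAmfSISUHAState=ACTIVE(1)" -> "ACTIVE"
            states[current] = line.split("=", 1)[1].split("(", 1)[0]
    return states


def su_with_state(states, si, ha_state):
    """Return the SU names whose assignment to `si` is in `ha_state`."""
    found = []
    for dn, state in states.items():
        match = re.match(r"safSISU=safSu=([^\\,]+)\\,.*,safSi=([^,]+),", dn)
        if match and match.group(2) == si and state == ha_state:
            found.append(match.group(1))
    return found


# Trimmed samples taken from the outputs in steps 1 and 3.
before = r"""
safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
"""
after = r"""
safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=STANDBY(2)
"""

if __name__ == "__main__":
    print("active before:", su_with_state(parse_siass(before), "SC-2N", "ACTIVE"))
    print("active after: ", su_with_state(parse_siass(after), "SC-2N", "ACTIVE"))
```

If the ACTIVE holder of `SC-2N` differs between the two snapshots, the failover took effect.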
4) I observed that ckptd did not crash on the active controller.
Sep 3 12:11:23 mohan-VirtualBox opensafd: OpenSAF services successfully stopped
Sep 3 12:11:23 mohan-VirtualBox opensafd[3193]: Stopping OpenSAF Services: *
Sep 3 12:11:23 mohan-VirtualBox systemd[1]: Stopped OpenSAF daemon.
Sep 3 12:11:28 mohan-VirtualBox dhclient[3176]: DHCPDISCOVER on enp0s8 to
255.255.255.255 port 67 interval 14 (xid=0xdde9787c)
Sep 3 12:11:29 mohan-VirtualBox systemd[1]: Starting OpenSAF daemon...
Sep 3 12:11:29 mohan-VirtualBox opensafd: Starting OpenSAF Services(5.18.04 -
c5117a898d331edb395434df56d630449a9ad7d2) (Using TIPC)
Sep 3 12:11:29 mohan-VirtualBox opensafd: Reboot file
/var/log/opensaf/clm_cluster_reboot_in_progress not found, startup continue...
Sep 3 12:11:29 mohan-VirtualBox opensafd[3820]: logtrace: trace enabled to
file 'opensafd.log', mask=0x0
Sep 3 12:11:30 mohan-VirtualBox kernel: [ 2141.109226] tipc: Activated
(version 2.0.0)
Sep 3 12:11:30 mohan-VirtualBox kernel: [ 2141.109295] NET: Registered
protocol family 30
Sep 3 12:11:30 mohan-VirtualBox kernel: [ 2141.109408] tipc: Started in single
node mode
Sep 3 12:11:30 mohan-VirtualBox kernel: [ 2141.181816] Started in network mode
Sep 3 12:11:30 mohan-VirtualBox kernel: [ 2141.181820] Own node address
<1.1.1>, network identity 4000
Sep 3 12:11:30 mohan-VirtualBox kernel: [ 2141.191090] Enabled bearer
<eth:enp0s3>, discovery domain <1.1.0>, priority 10
Sep 3 12:11:30 mohan-VirtualBox osaftransportd[3855]: Started
Sep 3 12:11:30 mohan-VirtualBox opensafd[3820]: NO Monitoring of TRANSPORT
started
Sep 3 12:11:30 mohan-VirtualBox osafclmna[3860]: Started
Sep 3 12:11:30 mohan-VirtualBox opensafd[3820]: NO Monitoring of CLMNA started
Sep 3 12:11:31 mohan-VirtualBox osafrded[3870]: Started
Sep 3 12:11:31 mohan-VirtualBox opensafd[3820]: NO Monitoring of RDE started
Sep 3 12:11:31 mohan-VirtualBox osafclmna[3860]: NO Starting to promote this
node to a system controller
Sep 3 12:11:31 mohan-VirtualBox osafrded[3870]: NO Requesting ACTIVE role
Sep 3 12:11:31 mohan-VirtualBox osafrded[3870]: NO RDE role set to Undefined
Sep 3 12:11:31 mohan-VirtualBox osaffmd[3880]: Started
Sep 3 12:11:31 mohan-VirtualBox osafrded[3870]: NO Peer up on node 0x2020f
Sep 3 12:11:31 mohan-VirtualBox osafclmna[3860]: NO
safNode=SC-1,safCluster=myClmCluster Joined cluster, nodeid=2010f
Sep 3 12:11:31 mohan-VirtualBox osafrded[3870]: NO Got peer info response from
node 0x2020f with role ACTIVE
Sep 3 12:11:31 mohan-VirtualBox osafrded[3870]: NO RDE role set to QUIESCED
Sep 3 12:11:31 mohan-VirtualBox osafrded[3870]: NO Giving up election against
0x2020f with role ACTIVE. My role is now QUIESCED
Sep 3 12:11:31 mohan-VirtualBox opensafd[3820]: NO Monitoring of HLFM started
Sep 3 12:11:31 mohan-VirtualBox osafimmd[3891]: Started
Sep 3 12:11:31 mohan-VirtualBox opensafd[3820]: NO Monitoring of IMMD started
Sep 3 12:11:32 mohan-VirtualBox osafimmnd[3903]: Started
Sep 3 12:11:32 mohan-VirtualBox osafimmnd[3903]: NO Use default reserved class
names.
Sep 3 12:11:32 mohan-VirtualBox osafimmnd[3903]: NO Persistent Back-End
capability configured, Pbe file:imm.db (suffix may get added)
Sep 3 12:11:32 mohan-VirtualBox osafimmnd[3903]: NO IMMD service is UP ...
ScAbsenseAllowed?:0 introduced?:0
Sep 3 12:11:32 mohan-VirtualBox osafimmnd[3903]: NO SERVER STATE:
IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
Sep 3 12:11:32 mohan-VirtualBox osafimmnd[3903]: NO Fevs count adjusted to
32402 preLoadPid: 0
Sep 3 12:11:32 mohan-VirtualBox osafimmnd[3903]: NO Sync client discarded
classimplementer set. Impl-id:27 Class:SaSmfCampaign
Sep 3 12:11:32 mohan-VirtualBox osafimmnd[3903]: NO Sync client discarded
classimplementer set. Impl-id:27 Class:OpenSafSmfConfig
Sep 3 12:11:32 mohan-VirtualBox osafimmnd[3903]: NO Sync client discarded
classimplementer set. Impl-id:27 Class:SaSmfSwBundle
Sep 3 12:11:32 mohan-VirtualBox osafimmnd[3903]: NO Sync client discarded
classimplementer set. Impl-id:27 Class:OpenSafSmfExecControl
Sep 3 12:11:32 mohan-VirtualBox osafimmnd[3903]: NO SERVER STATE:
IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
Sep 3 12:11:32 mohan-VirtualBox osafimmnd[3903]: NO SERVER STATE:
IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
Sep 3 12:11:32 mohan-VirtualBox osafimmnd[3903]: NO NODE STATE->
IMM_NODE_ISOLATED
Sep 3 12:11:32 mohan-VirtualBox NetworkManager[706]: <warn> [1535956892.3765]
dhcp4 (enp0s8): request timed out
Sep 3 12:11:32 mohan-VirtualBox NetworkManager[706]: <warn> [1535956892.3765]
dhcp4 (enp0s8): request timed out
Sep 3 12:11:32 mohan-VirtualBox NetworkManager[706]: <info> [1535956892.3766]
dhcp4 (enp0s8): state changed unknown -> timeout
Sep 3 12:11:32 mohan-VirtualBox NetworkManager[706]: <info> [1535956892.3780]
dhcp4 (enp0s8): canceled DHCP transaction, DHCP client pid 3176
Sep 3 12:11:32 mohan-VirtualBox NetworkManager[706]: <info> [1535956892.3780]
dhcp4 (enp0s8): state changed timeout -> done
Sep 3 12:11:32 mohan-VirtualBox NetworkManager[706]: <info> [1535956892.3799]
device (enp0s8): state change: ip-config -> failed (reason
'ip-config-unavailable') [70 120 5]
Sep 3 12:11:32 mohan-VirtualBox NetworkManager[706]: <info> [1535956892.3818]
policy: disabling autoconnect for connection 'Wired connection 2'.
Sep 3 12:11:32 mohan-VirtualBox NetworkManager[706]: <warn> [1535956892.3820]
device (enp0s8): Activation: failed for connection 'Wired connection 2'
Sep 3 12:11:32 mohan-VirtualBox NetworkManager[706]: <info> [1535956892.3961]
device (enp0s8): state change: failed -> disconnected (reason 'none') [120 30 0]
Sep 3 12:11:32 mohan-VirtualBox avahi-daemon[635]: Withdrawing address record
for fe80::1568:1729:bf0a:6028 on enp0s8.
Sep 3 12:11:32 mohan-VirtualBox avahi-daemon[635]: Leaving mDNS multicast
group on interface enp0s8.IPv6 with address fe80::1568:1729:bf0a:6028.
Sep 3 12:11:32 mohan-VirtualBox avahi-daemon[635]: Interface enp0s8.IPv6 no
longer relevant for mDNS.
Sep 3 12:11:32 mohan-VirtualBox osafimmnd[3903]: NO NODE STATE->
IMM_NODE_W_AVAILABLE
Sep 3 12:11:32 mohan-VirtualBox osafimmnd[3903]: NO SERVER STATE:
IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
Sep 3 12:11:43 mohan-VirtualBox osafimmnd[3903]: NO NODE STATE->
IMM_NODE_FULLY_AVAILABLE 2799
Sep 3 12:11:43 mohan-VirtualBox osafimmnd[3903]: NO RepositoryInitModeT is
SA_IMM_KEEP_REPOSITORY
Sep 3 12:11:43 mohan-VirtualBox osafimmnd[3903]: WA IMM Access Control mode is
DISABLED!
Sep 3 12:11:43 mohan-VirtualBox osafimmnd[3903]: NO Epoch set to 154 in
ImmModel
Sep 3 12:11:43 mohan-VirtualBox osafimmnd[3903]: NO SERVER STATE:
IMM_SERVER_SYNC_CLIENT --> IMM_SERVER_READY
Sep 3 12:11:43 mohan-VirtualBox osafimmnd[3903]: NO ImmModel received
scAbsenceAllowed 0
Sep 3 12:11:43 mohan-VirtualBox opensafd[3820]: NO Monitoring of IMMND started
Sep 3 12:11:43 mohan-VirtualBox osaflogd[3918]: Started
Sep 3 12:11:43 mohan-VirtualBox opensafd[3820]: NO Monitoring of LOGD started
Sep 3 12:11:43 mohan-VirtualBox osafntfd[3929]: Started
Sep 3 12:11:43 mohan-VirtualBox opensafd[3820]: NO Monitoring of NTFD started
Sep 3 12:11:43 mohan-VirtualBox osafclmd[3940]: Started
Sep 3 12:11:43 mohan-VirtualBox opensafd[3820]: NO Monitoring of CLMD started
Sep 3 12:11:43 mohan-VirtualBox osafamfd[3951]: Started
Sep 3 12:11:43 mohan-VirtualBox opensafd[3820]: NO Monitoring of AMFD started
Sep 3 12:11:43 mohan-VirtualBox osafamfnd[3962]: Started
Sep 3 12:11:43 mohan-VirtualBox osafamfnd[3962]: NO Start monitoring AMFD
using /var/lib/opensaf/osafamfd.fifo
Sep 3 12:11:43 mohan-VirtualBox osafamfnd[3962]: NO Sending node up due to
NCSMDS_UP
Sep 3 12:11:44 mohan-VirtualBox osafamfnd[3962]: NO
'safSu=SC-1,safSg=NoRed,safApp=OpenSAF' Presence State UNINSTANTIATED =>
INSTANTIATING
Sep 3 12:11:44 mohan-VirtualBox osafamfnd[3962]: NO
'safSu=SC-1,safSg=2N,safApp=OpenSAF' Presence State UNINSTANTIATED =>
INSTANTIATING
Sep 3 12:11:44 mohan-VirtualBox osafsmfd[3980]: Started
Sep 3 12:11:44 mohan-VirtualBox osafsmfnd[3982]: Started
Sep 3 12:11:44 mohan-VirtualBox osafsmfnd[3982]: NO MDS initialize_smfnd:
smfnd_mds_init()
Sep 3 12:11:44 mohan-VirtualBox osafsmfnd[3982]: NO MDS smfnd_mds_init:
mds_get_handle()
Sep 3 12:11:44 mohan-VirtualBox osafsmfnd[3982]: NO MDS mds_get_handle: Done
Sep 3 12:11:44 mohan-VirtualBox osafsmfnd[3982]: NO MDS smfnd_mds_init:
mds_register()
Sep 3 12:11:44 mohan-VirtualBox osafsmfnd[3982]: NO MDS mds_svc_event:
NCSMDS_SVC_ID_SMFD dest = 0xf
Sep 3 12:11:44 mohan-VirtualBox osafsmfnd[3982]: NO MDS smfnd_mds_init: Done
Sep 3 12:11:44 mohan-VirtualBox osaflcknd[4030]: Started
Sep 3 12:11:44 mohan-VirtualBox osafmsgd[4031]: Started
Sep 3 12:11:44 mohan-VirtualBox osafckptnd[4057]: Started
Sep 3 12:11:44 mohan-VirtualBox osaflckd[4083]: Started
Sep 3 12:11:45 mohan-VirtualBox osafevtd[4106]: Started
Sep 3 12:11:45 mohan-VirtualBox osafamfnd[3962]: NO
'safSu=SC-1,safSg=NoRed,safApp=OpenSAF' Presence State INSTANTIATING =>
INSTANTIATED
Sep 3 12:11:45 mohan-VirtualBox osafimmnd[3903]: NO Implementer connected: 29
(MsgQueueService131343) <134, 2010f>
Sep 3 12:11:45 mohan-VirtualBox osafamfnd[3962]: NO Assigning
'safSi=SC-2N,safApp=OpenSAF' STANDBY to 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
Sep 3 12:11:45 mohan-VirtualBox osafrded[3870]: NO RDE role set to STANDBY
Sep 3 12:11:45 mohan-VirtualBox osafimmd[3891]: NO MDS event from svc_id 24
(change:3, dest:13)
Sep 3 12:11:45 mohan-VirtualBox osafimmd[3891]: NO MDS event from svc_id 24
(change:5, dest:13)
Sep 3 12:11:45 mohan-VirtualBox osafimmd[3891]: NO MDS event from svc_id 24
(change:5, dest:13)
Sep 3 12:11:45 mohan-VirtualBox osafrded[3870]: NO Peer up on node 0x2020f
Sep 3 12:11:45 mohan-VirtualBox osafrded[3870]: NO Got peer info response from
node 0x2020f with role ACTIVE
Sep 3 12:11:45 mohan-VirtualBox osafimmd[3891]: NO MDS event from svc_id 25
(change:3, dest:564116203505534)
Sep 3 12:11:45 mohan-VirtualBox osafimmd[3891]: NO MDS event from svc_id 25
(change:3, dest:565215709162246)
Sep 3 12:11:45 mohan-VirtualBox osafimmd[3891]: NO MDS event from svc_id 25
(change:3, dest:567412432561139)
Sep 3 12:11:45 mohan-VirtualBox osafimmd[3891]: NO MDS event from svc_id 25
(change:3, dest:566315084623799)
Sep 3 12:11:45 mohan-VirtualBox osaflogd[3918]: NO LOGSV_DATA_GROUPNAME not
found
Sep 3 12:11:45 mohan-VirtualBox osaflogd[3918]: NO LOG root directory is:
"/var/log/opensaf/saflog"
Sep 3 12:11:45 mohan-VirtualBox osaflogd[3918]: NO LOG data group is: ""
Sep 3 12:11:45 mohan-VirtualBox osaflogd[3918]: NO LGS_MBCSV_VERSION = 7
Sep 3 12:11:45 mohan-VirtualBox osafimmnd[3903]: NO Implementer (applier)
connected: 30 (@safAmfService2010f) <142, 2010f>
Sep 3 12:11:45 mohan-VirtualBox osafamfnd[3962]: NO Assigning
'safSi=NoRed2,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=NoRed,safApp=OpenSAF'
Sep 3 12:11:45 mohan-VirtualBox osafamfnd[3962]: NO Assigned
'safSi=NoRed2,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=NoRed,safApp=OpenSAF'
Sep 3 12:11:46 mohan-VirtualBox osafsmfd[3980]: NO MDS
initialize_for_assignment: smfd_mds_init()
Sep 3 12:11:46 mohan-VirtualBox osafsmfd[3980]: NO MDS smfd_mds_init:
mds_vdest_create()
Sep 3 12:11:46 mohan-VirtualBox osafsmfd[3980]: NO MDS mds_vdest_create: VDEST
created
Sep 3 12:11:46 mohan-VirtualBox osafsmfd[3980]: NO MDS smfd_mds_init:
mds_register()
Sep 3 12:11:46 mohan-VirtualBox osafsmfd[3980]: NO MDS mds_register: mds
registration is done
Sep 3 12:11:46 mohan-VirtualBox osafsmfd[3980]: NO MDS smfd_mds_init:
smfd_mds_change_role()
Sep 3 12:11:46 mohan-VirtualBox opensafd[3798]: Starting OpenSAF Services
(Using TIPC): *
Sep 3 12:11:46 mohan-VirtualBox osafsmfd[3980]: NO MDS smfd_mds_change_role:
Setting; arg.info.vdest_chg_role.i_vdest = 0xf, ncsvda_api() rc = 1
Sep 3 12:11:46 mohan-VirtualBox osafsmfd[3980]: NO initialize_for_assignment:
smfd_mds_init() Done
Sep 3 12:11:46 mohan-VirtualBox osafsmfd[3980]: NO MDS amf_csi_set_callback:
smfd_mds_change_role()
Sep 3 12:11:46 mohan-VirtualBox systemd[1]: Started OpenSAF daemon.
Sep 3 12:11:46 mohan-VirtualBox osafsmfd[3980]: NO MDS smfd_mds_change_role:
Setting; arg.info.vdest_chg_role.i_vdest = 0xf, ncsvda_api() rc = 1
Sep 3 12:11:46 mohan-VirtualBox osafamfnd[3962]: NO Assigned
'safSi=SC-2N,safApp=OpenSAF' STANDBY to 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
Sep 3 12:11:46 mohan-VirtualBox opensafd: OpenSAF(5.18.04 -
c5117a898d331edb395434df56d630449a9ad7d2) services successfully started
Sep 3 12:11:48 mohan-VirtualBox osafamfd[3951]: NO Cold sync complete!
Sep 3 12:14:31 mohan-VirtualBox dhc
5) I performed multiple failovers but did not observe the crash.
The reproduction steps in the ticket are not clear and ckptd traces were not
enabled, so I could not reproduce the crash or debug it further.
I followed the reported steps as closely as possible but failed to reproduce
it.
I am closing the ticket for now; please reopen it with reproducible steps and
ckptd traces if possible.
---
**[tickets:#2011] ckptd seg faulted on active controller when trying to create checkpoint**
**Status:** accepted
**Milestone:** 5.18.09
**Created:** Thu Sep 08, 2016 07:28 AM UTC by Ritu Raj
**Last Updated:** Fri Aug 31, 2018 02:46 PM UTC
**Owner:** Mohan Kanakam
**Attachments:**
- [ckptd_bt](https://sourceforge.net/p/opensaf/tickets/2011/attachment/ckptd_bt) (2.6 kB; application/octet-stream)
- [messages-20160907.bz2](https://sourceforge.net/p/opensaf/tickets/2011/attachment/messages-20160907.bz2) (380.1 kB; application/x-bzip)
- [syslog2](https://sourceforge.net/p/opensaf/tickets/2011/attachment/syslog2) (1.4 MB; application/octet-stream)
Environment details
OS : Suse 64bit
Changeset : 7997 ( 5.1.FC)
Setup : 4 nodes ( 2 controllers and 2 payloads with headless feature disabled &
1PBE enabled with 30K objects )
Summary :
ckptd crashed on active controller when trying to create checkpoint during
failover
Steps followed & Observed behaviour
1. Initially ran some CKPT test scenarios, along with failovers. After the test
scenarios ended, the following IMM objects and replicas were not deleted:
sofo-s3:/dev/shm # immfind | grep 101
safCkpt=all_replicas_ckpt_name_101
safCkpt=collocated_ckpt_name_101
safReplica=safNode=PL-3\,safCluster=myClmCluster,safCkpt=all_replicas_ckpt_name_101
safReplica=safNode=PL-3\,safCluster=myClmCluster,safCkpt=collocated_ckpt_name_101
safReplica=safNode=SC-1\,safCluster=myClmCluster,safCkpt=all_replicas_ckpt_name_101
safReplica=safNode=SC-2\,safCluster=myClmCluster,safCkpt=all_replicas_ckpt_name_101
2. When a ckpt is created with the earlier name (all_replicas_ckpt_name_101),
the following error is observed in syslog. CkptOpen also failed with
ERR_LIBRARY.
>> saImmOiRtObjectCreate_2 failed with error = 14
>>
Sep 7 17:21:11 sofo-s2 osafimmnd[2137]: NO PBE-OI established on this SC.
Dumping incrementally to file imm.db
Sep 7 17:21:12 sofo-s2 osafckptd[2284]: ER create_runtime_ckpt_object -
saImmOiRtObjectCreate_2 failed with error = 14
Sep 7 17:21:12 sofo-s2 osafckptd[2284]: ER create runtime ckpt object failed
with error: 14
Sep 7 17:21:12 sofo-s2 osafckptd[2284]: ER cpd db add ckpt_node failed for
ckpt_id:2
4. After some time, ckptd seg faulted on the active controller
>>
Sep 7 17:21:43 sofo-s2 osafamfnd[2187]: NO
'safComp=CPD,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' :
Recovery is 'nodeFailfast'
Sep 7 17:21:43 sofo-s2 osafamfnd[2187]: ER
safComp=CPD,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery
is:nodeFailfast
Sep 7 17:21:43 sofo-s2 osafamfnd[2187]: Rebooting OpenSAF NodeId = 131599 EE
Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId =
131599, SupervisionTime = 60
Sep 7 17:21:43 sofo-s2 opensaf_reboot: Rebooting local node; timeout=60
5. Below is the backtrace (bt):
0- 0x00007fbbd5ffcb20 in memcmp () from /lib64/libc.so.6
1- 0x00007fbbd7a10929 in ncs_patricia_tree_get (pTree=0x67b4c8,
pKey=0x7ffffd22531c "\017\001\002") at patricia.c:435
2- 0x000000000040800d in cpd_cpnd_info_node_get (cpnd_tree=0x67b4c8,
dest=0x67ec60, cpnd_info_node=0x7ffffd225350) at cpd_db.c:706
3- 0x000000000040cd56 in cpd_evt_proc_mds_evt (cb=0x67b340, evt=0x67ec50) at
cpd_evt.c:1378
4- 0x00000000004091cb in cpd_process_evt (evt=0x67ec40) at cpd_evt.c:107
5- 0x000000000041185f in cpd_main_process (cb=0x67b340) at cpd_init.c:661
6- 0x0000000000411b89 in main (argc=1, argv=0x7ffffd225578) at cpd_main.c:74
Notes:
1. Syslog attached
2. bt attached
3. ckptd traces not enabled
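
Since ckptd traces were not enabled, the syslog is the main evidence for the crash. A minimal Python sketch for pulling component-fault records out of a syslog file follows. It assumes each record sits on one unwrapped line in the format quoted in step 4; `FAULT_RE` and `find_faults` are hypothetical names, not part of OpenSAF.

```python
import re

# Assumed single-line form of the osafamfnd notice quoted in step 4.
FAULT_RE = re.compile(
    r"osafamfnd\[\d+\]: NO '(?P<comp>[^']+)' faulted due to "
    r"'(?P<cause>[^']+)' : Recovery is '(?P<recovery>[^']+)'"
)


def find_faults(lines, component="safComp=CPD"):
    """Return (component DN, cause, recovery) for matching fault records."""
    faults = []
    for line in lines:
        match = FAULT_RE.search(line)
        if match and match.group("comp").startswith(component):
            faults.append(
                (match.group("comp"), match.group("cause"), match.group("recovery"))
            )
    return faults


# The record from step 4, joined back into a single line.
sample = (
    "Sep 7 17:21:43 sofo-s2 osafamfnd[2187]: NO "
    "'safComp=CPD,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to "
    "'avaDown' : Recovery is 'nodeFailfast'"
)

if __name__ == "__main__":
    for comp, cause, recovery in find_faults([sample]):
        print(comp, cause, recovery)
```

An empty result for a syslog covering a failover window suggests the CPD component did not fault in that window.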