- **status**: accepted --> review
- **Comment**:

This appears to be a case where IMM returns BAD_HANDLE:
Feb 15  6:32:17.783637 osafclmd [3972:../../opensaf/src/imm/agent/imma_om_api.cc:7429] << search_init_common
Feb 15  6:32:17.784315 osafclmd [3972:../../opensaf/src/imm/agent/imma_mds.cc:0673] WA OpenSAF imm lib: Message loss detected for dest 565213425675031 service id:25
Feb 15  6:32:17.784330 osafclmd [3972:../../opensaf/src/imm/agent/imma_db.cc:0631] >> imma_mark_clients_stale
Feb 15  6:32:17.784338 osafclmd [3972:../../opensaf/src/imm/agent/imma_db.cc:0674] TR Search id 218 for handle d90002020f closed for stale imm-handle
Feb 15  6:32:17.784353 osafclmd [3972:../../opensaf/src/imm/agent/imma_db.cc:0683] WA marking handle as exposed
Feb 15  6:32:17.784359 osafclmd [3972:../../opensaf/src/imm/agent/imma_db.cc:0689] TR Stale marked client cl:217 node:2020f
Feb 15  6:32:17.784366 osafclmd [3972:../../opensaf/src/imm/agent/imma_db.cc:0775] >> isExposed
Feb 15  6:32:17.784371 osafclmd [3972:../../opensaf/src/imm/agent/imma_db.cc:0836] TR isExposed Returning Exposed:1
Feb 15  6:32:17.784377 osafclmd [3972:../../opensaf/src/imm/agent/imma_db.cc:0837] << isExposed
Feb 15  6:32:17.784383 osafclmd [3972:../../opensaf/src/imm/agent/imma_db.cc:0704] << imma_mark_clients_stale
Feb 15  6:32:17.785329 osafclmd [3972:../../opensaf/src/imm/agent/imma_om_api.cc:7630] T3 ERR_BAD_HANDLE: client is stale and exposed
Feb 15  6:32:17.785359 osafclmd [3972:../../opensaf/src/imm/agent/imma_om_api.cc:7807] >> saImmOmSearchFinalize
Feb 15  6:32:17.785367 osafclmd [3972:../../opensaf/src/imm/agent/imma_om_api.cc:7865] T1 IMM Handle d90002020f is stale
Feb 15  6:32:17.785376 osafclmd [3972:../../opensaf/src/imm/agent/imma_om_api.cc:7965] << saImmOmSearchFinalize
Feb 15  6:32:17.785384 osafclmd [3972:../../opensaf/src/imm/agent/imma_om_api.cc:0735] >> saImmOmFinalize
Feb 15  6:32:17.785391 osafclmd [3972:../../opensaf/src/imm/agent/imma_om_api.cc:0769] T1 Handle d90002020f is stale
Feb 15  6:32:17.785398 osafclmd [3972:../../opensaf/src/imm/agent/imma_om_api.cc:0867] T3 Handle d90002020f is stale
Feb 15  6:32:17.785405 osafclmd [3972:../../opensaf/src/imm/agent/imma_proc.cc:0147] >> imma_callback_ipc_destroy
Feb 15  6:32:17.785412 osafclmd [3972:../../opensaf/src/imm/agent/imma_proc.cc:0000] << imma_callback_ipc_destroy
Feb 15  6:32:17.785417 osafclmd [3972:../../opensaf/src/imm/agent/imma_proc.cc:0206] TR Deleting client node
Feb 15  6:32:17.785424 osafclmd [3972:../../opensaf/src/imm/agent/imma_init.cc:0326] >> imma_shutdown




---

**[tickets:#2325] clm: standby clmd crashed after failing to read node configuration from IMM.**

**Status:** review
**Milestone:** 5.0.2
**Created:** Fri Feb 24, 2017 09:32 AM UTC by Praveen
**Last Updated:** Fri Feb 24, 2017 10:01 AM UTC
**Owner:** Praveen


The issue is not reproducible.
While coming up as standby, CLMD successfully initializes with IMM and 
reads the cluster-related configuration. While reading the node-related 
configuration from IMM, CLMD calls saImmOmSearchNext_2(). This API 
could not send any message to IMMND and failed:
Feb 15 06:32:17 SC-2-2 osafclmd[3972]: WA OpenSAF imm lib: Message loss detected for dest 565213425675031 service id:25
Feb 15 06:32:17 SC-2-2 osafimmnd[3930]: WA IMMND - Client Node Get Failed for cli_hdl:932008034831
Feb 15 06:32:17 SC-2-2 osafclmd[3972]: WA OpenSAF imm lib: Message loss detected for dest 565213425675031 service id:25
Feb 15 06:32:17 SC-2-2 osafclmd[3972]: WA marking handle as exposed

CLMD does not explicitly check whether the node configuration read was 
successful. It comes up and completes the cold sync. When a payload joins the 
cluster, the active CLMD checkpoints runtime data for the node. Since the node 
is not present on the standby CLMD, it crashes:

Feb 15 06:33:26 SC-2-2 osafimmd[3915]: NO SBY: New Epoch for IMMND process at node 2020f old epoch: 22  new epoch:23
Feb 15 06:33:26 SC-2-2 osafclmd[3972]: ER Node is NULL,problem with the database.
Feb 15 06:33:26 SC-2-2 osafclmd[3972]: ../../opensaf/src/clm/clmd/clms_mbcsv.c:468: ckpt_proc_node_rec: Assertion '0' failed.
Feb 15 06:33:27 SC-2-2 osafamfnd[4002]: NO 'safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'nodeFailfast'
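
The missing check described above could look roughly like the sketch below. It is illustrative only and is not taken from the CLMD sources: the `demo_*` names, the status enum, and the callback type are all hypothetical stand-ins (real code would use `SaAisErrorT` and call saImmOmSearchNext_2() directly). The idea is that only a "no more objects" status counts as a successful end of the search; anything else, such as a stale handle, is reported to the caller so that start-up does not continue with a partial node database.

```c
/* Hypothetical stand-ins for the SA Forum status codes. */
typedef enum {
    DEMO_OK,
    DEMO_ERR_NOT_EXIST,  /* search exhausted: the normal end of iteration */
    DEMO_ERR_BAD_HANDLE  /* handle went stale, as in the trace above */
} demo_status_t;

/* Hypothetical iterator standing in for saImmOmSearchNext_2(). */
typedef demo_status_t (*demo_next_fn)(void *ctx, const char **name);

/* Reads all node objects. Returns DEMO_OK only if the search ran to
 * completion; any other status is propagated so the caller can retry or
 * abort start-up instead of continuing with an incomplete database. */
static demo_status_t demo_read_nodes(demo_next_fn next, void *ctx, int *count)
{
    const char *name = 0;
    demo_status_t rc;

    *count = 0;
    while ((rc = next(ctx, &name)) == DEMO_OK) {
        (*count)++;  /* real code would create the node record here */
    }
    return rc == DEMO_ERR_NOT_EXIST ? DEMO_OK : rc;
}

/* Test double: yields `remaining` names, then returns `final_status`. */
struct demo_source {
    int remaining;
    demo_status_t final_status;
};

static demo_status_t demo_fake_next(void *ctx, const char **name)
{
    struct demo_source *src = ctx;

    if (src->remaining > 0) {
        src->remaining--;
        *name = "safNode=demo";  /* placeholder object name */
        return DEMO_OK;
    }
    return src->final_status;
}
```

With a source that ends in `DEMO_ERR_BAD_HANDLE`, `demo_read_nodes()` returns that error rather than `DEMO_OK`, which is the signal the standby would need before proceeding to cold sync.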



