- **status**: accepted --> review
- **Comment**:
This appears to be a case where IMM returns ERR_BAD_HANDLE because the client handle is stale and exposed:
Feb 15 6:32:17.783637 osafclmd
[3972:../../opensaf/src/imm/agent/imma_om_api.cc:7429] << search_init_common
Feb 15 6:32:17.784315 osafclmd
[3972:../../opensaf/src/imm/agent/imma_mds.cc:0673] WA OpenSAF imm lib: Message
loss detected for dest 565213425675031 service id:25
Feb 15 6:32:17.784330 osafclmd
[3972:../../opensaf/src/imm/agent/imma_db.cc:0631] >> imma_mark_clients_stale
Feb 15 6:32:17.784338 osafclmd
[3972:../../opensaf/src/imm/agent/imma_db.cc:0674] TR Search id 218 for handle
d90002020f closed for stale imm-handle
Feb 15 6:32:17.784353 osafclmd
[3972:../../opensaf/src/imm/agent/imma_db.cc:0683] WA marking handle as exposed
Feb 15 6:32:17.784359 osafclmd
[3972:../../opensaf/src/imm/agent/imma_db.cc:0689] TR Stale marked client
cl:217 node:2020f
Feb 15 6:32:17.784366 osafclmd
[3972:../../opensaf/src/imm/agent/imma_db.cc:0775] >> isExposed
Feb 15 6:32:17.784371 osafclmd
[3972:../../opensaf/src/imm/agent/imma_db.cc:0836] TR isExposed Returning
Exposed:1
Feb 15 6:32:17.784377 osafclmd
[3972:../../opensaf/src/imm/agent/imma_db.cc:0837] << isExposed
Feb 15 6:32:17.784383 osafclmd
[3972:../../opensaf/src/imm/agent/imma_db.cc:0704] << imma_mark_clients_stale
Feb 15 6:32:17.785329 osafclmd
[3972:../../opensaf/src/imm/agent/imma_om_api.cc:7630] T3 ERR_BAD_HANDLE:
client is stale and exposed
Feb 15 6:32:17.785359 osafclmd
[3972:../../opensaf/src/imm/agent/imma_om_api.cc:7807] >> saImmOmSearchFinalize
Feb 15 6:32:17.785367 osafclmd
[3972:../../opensaf/src/imm/agent/imma_om_api.cc:7865] T1 IMM Handle d90002020f
is stale
Feb 15 6:32:17.785376 osafclmd
[3972:../../opensaf/src/imm/agent/imma_om_api.cc:7965] << saImmOmSearchFinalize
Feb 15 6:32:17.785384 osafclmd
[3972:../../opensaf/src/imm/agent/imma_om_api.cc:0735] >> saImmOmFinalize
Feb 15 6:32:17.785391 osafclmd
[3972:../../opensaf/src/imm/agent/imma_om_api.cc:0769] T1 Handle d90002020f is
stale
Feb 15 6:32:17.785398 osafclmd
[3972:../../opensaf/src/imm/agent/imma_om_api.cc:0867] T3 Handle d90002020f is
stale
Feb 15 6:32:17.785405 osafclmd
[3972:../../opensaf/src/imm/agent/imma_proc.cc:0147] >>
imma_callback_ipc_destroy
Feb 15 6:32:17.785412 osafclmd
[3972:../../opensaf/src/imm/agent/imma_proc.cc:0000] <<
imma_callback_ipc_destroy
Feb 15 6:32:17.785417 osafclmd
[3972:../../opensaf/src/imm/agent/imma_proc.cc:0206] TR Deleting client node
Feb 15 6:32:17.785424 osafclmd
[3972:../../opensaf/src/imm/agent/imma_init.cc:0326] >> imma_shutdown
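
One possible way to cope with this, sketched under assumptions: treat BAD_HANDLE during the read as "handle is gone", throw away whatever was read so far, take a fresh handle, and repeat the whole read a bounded number of times. The sketch below does not call the real IMM API. `fake_imm_t`, `fake_read_all_nodes()`, `read_nodes_with_retry()` and the `SKETCH_*` codes are hypothetical stand-ins so that it compiles without OpenSAF; it is not the actual CLMD code.

```c
/* Local stand-ins for the error codes; NOT the real SaAisErrorT values. */
typedef enum {
    SKETCH_OK = 0,
    SKETCH_ERR_BAD_HANDLE,
    SKETCH_ERR_OTHER
} sketch_err_t;

/* Hypothetical fake IMM used only to drive the retry logic. */
typedef struct {
    int stale_failures_left; /* reads that will still hit a stale handle */
    int objects_total;       /* node objects stored in the fake IMM */
    int inits;               /* how many times a handle was (re)initialized */
} fake_imm_t;

/* Simulates one complete search pass over the node objects. */
static sketch_err_t fake_read_all_nodes(fake_imm_t *imm, int *nodes_read)
{
    if (imm->stale_failures_left > 0) {
        imm->stale_failures_left--;
        *nodes_read = 0; /* nothing usable from a pass that failed */
        return SKETCH_ERR_BAD_HANDLE;
    }
    *nodes_read = imm->objects_total;
    return SKETCH_OK;
}

/* Re-initialize the handle and repeat the read while BAD_HANDLE is seen.
 * Any other result, success or failure, is handed back to the caller. */
sketch_err_t read_nodes_with_retry(fake_imm_t *imm, int max_attempts,
                                   int *nodes_read)
{
    sketch_err_t rc = SKETCH_ERR_BAD_HANDLE;

    *nodes_read = 0;
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        imm->inits++; /* stands in for finalize + initialize of the handle */
        rc = fake_read_all_nodes(imm, nodes_read);
        if (rc != SKETCH_ERR_BAD_HANDLE)
            break;
        *nodes_read = 0; /* discard partial results before retrying */
    }
    return rc;
}
```

The point of the sketch is that the final return code reaches the caller, so a read that never succeeds is not silently treated as a completed configuration read.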
---
** [tickets:#2325] clm: standby clmd crashed after failing to read node
configuration from IMM.**
**Status:** review
**Milestone:** 5.0.2
**Created:** Fri Feb 24, 2017 09:32 AM UTC by Praveen
**Last Updated:** Fri Feb 24, 2017 10:01 AM UTC
**Owner:** Praveen
The issue is not reproducible.
While coming up as standby, CLMD successfully initializes with IMM and
successfully reads the cluster-related configuration. While reading the
node-related configuration from IMM, CLMD calls saImmOmSearchNext_2(). This API
could not send any message to IMMND and failed:
Feb 15 06:32:17 SC-2-2 osafclmd[3972]: WA OpenSAF imm lib: Message loss
detected for dest 565213425675031 service id:25
Feb 15 06:32:17 SC-2-2 osafimmnd[3930]: WA IMMND - Client Node Get Failed for
cli_hdl:932008034831
Feb 15 06:32:17 SC-2-2 osafclmd[3972]: WA OpenSAF imm lib: Message loss
detected for dest 565213425675031 service id:25
Feb 15 06:32:17 SC-2-2 osafclmd[3972]: WA marking handle as exposed
CLMD does not explicitly check whether the node configuration read was
successful. It comes up and completes the cold sync. When a payload joins the
cluster, the active CLMD checkpoints runtime data for the node. Since the node
is not present on the standby CLMD, the standby crashes:
Feb 15 06:33:26 SC-2-2 osafimmd[3915]: NO SBY: New Epoch for IMMND process at
node 2020f old epoch: 22 new epoch:23
Feb 15 06:33:26 SC-2-2 osafclmd[3972]: ER Node is NULL,problem with the
database.
Feb 15 06:33:26 SC-2-2 osafclmd[3972]:
../../opensaf/src/clm/clmd/clms_mbcsv.c:468: ckpt_proc_node_rec: Assertion '0'
failed.
Feb 15 06:33:27 SC-2-2 osafamfnd[4002]: NO
'safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' :
Recovery is 'nodeFailfast'
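
The two missing checks described above can be sketched as follows. This is a minimal illustration under assumptions, not the actual CLMD code or a proposed patch: `standby_db_t`, `db_add_node()`, `standby_finish_init()` and `ckpt_apply_node()` are hypothetical names, and the node DNs used with them are made up. The standby refuses to proceed when the node configuration read failed, and the checkpoint handler reports an unknown node to its caller instead of asserting.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_NODES 8
#define SKETCH_NAME_LEN 64

/* Hypothetical, much reduced stand-in for a CLM node record. */
typedef struct {
    char name[SKETCH_NAME_LEN];
    int member;
} node_rec_t;

/* Hypothetical standby database: nodes read from IMM plus a load flag. */
typedef struct {
    node_rec_t nodes[SKETCH_MAX_NODES];
    size_t count;
    int config_loaded; /* set only after a complete, successful read */
} standby_db_t;

void db_init(standby_db_t *db)
{
    memset(db, 0, sizeof(*db));
}

int db_add_node(standby_db_t *db, const char *name)
{
    if (db->count >= SKETCH_MAX_NODES || strlen(name) >= SKETCH_NAME_LEN)
        return -1;
    strcpy(db->nodes[db->count].name, name); /* length checked above */
    db->nodes[db->count].member = 0;
    db->count++;
    return 0;
}

node_rec_t *db_find(standby_db_t *db, const char *name)
{
    for (size_t i = 0; i < db->count; i++)
        if (strcmp(db->nodes[i].name, name) == 0)
            return &db->nodes[i];
    return NULL;
}

/* Called after the node configuration read; read_ok is its outcome.
 * On failure the partial database is dropped and the caller is told
 * not to continue to cold sync. */
int standby_finish_init(standby_db_t *db, int read_ok)
{
    if (!read_ok) {
        db_init(db);
        return -1;
    }
    db->config_loaded = 1;
    return 0;
}

/* Applies a checkpointed membership update for one node. An unknown
 * node is reported as an error rather than hitting an assertion. */
int ckpt_apply_node(standby_db_t *db, const char *name, int member)
{
    node_rec_t *node = db->config_loaded ? db_find(db, name) : NULL;

    if (node == NULL)
        return -1;
    node->member = member;
    return 0;
}
```

What the caller does with the `-1` (retry the read, request a new cold sync, or exit in a controlled way) is a separate design decision that this sketch leaves open.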