In the new set of logfiles I cannot find the IMMND crash anywhere.
The syslog file for pl3 is also missing.
I think there was a mistake when creating the sub-tarfile for pl3,
see below.
The logs_new.tgz contains four files:
pl3_logs.tgz
pl4_logs.tgz
sc1_logs.tgz
sc2_logs.tgz
But pl3_logs.tgz contains:
pl4_logs.tgz
sc1_logs.tgz
sc2_logs.tgz
That seems like an error in bundling.
As a result there is no syslog for pl3 in pl3_logs.tgz.
I unpacked all three of these sub-tarfiles, but they only contain redundant
logfiles for pl4, sc1 and sc2.
So if the crash occurred on pl3 this time, there is no way for me
to see the implementer-name/id that was involved in the crash.
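
A bundling mistake like this one could be caught before uploading with a small check. The following is only a sketch: the function names `list_nested` and `missing_syslog` are made up for illustration, and it assumes each per-node sub-tarfile is expected to hold its syslog as `var/log/messages`, as pl4, sc1 and sc2 do here.

```python
import io
import tarfile

# Assumption: every per-node sub-tarfile should carry its syslog under this path.
SYSLOG = "var/log/messages"


def list_nested(outer_fileobj):
    """Map each inner *.tgz in the outer archive to the member names it holds."""
    listing = {}
    with tarfile.open(fileobj=outer_fileobj, mode="r:*") as outer:
        for member in outer.getmembers():
            if not (member.isfile() and member.name.endswith(".tgz")):
                continue
            # Read the inner archive into memory and open it as a tar file too.
            inner_bytes = outer.extractfile(member).read()
            with tarfile.open(fileobj=io.BytesIO(inner_bytes), mode="r:*") as inner:
                listing[member.name] = sorted(inner.getnames())
    return listing


def missing_syslog(listing):
    """Return the sub-tarfiles that do not contain a syslog file."""
    return sorted(
        name
        for name, members in listing.items()
        if not any(m.removeprefix("./") == SYSLOG for m in members)
    )
```

Opening logs_new.tgz in binary mode and passing the file object through `missing_syslog(list_nested(f))` should, for the bundle described above, report pl3_logs.tgz. `str.removeprefix` needs Python 3.9 or later.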
-------
pl4_logs.tgz contains:
var/log/messages
var/log/opensaf/osafamfnd
var/log/opensaf/osafimmnd
var/log/opensaf/osafsmfnd
which includes the syslog for pl4.
But I cannot find any IMMND crash in it.
So I assume the crash this time occurred on pl3 but the syslog for
it is missing.
----------
sc1_logs.tgz contains:
var/log/opensaf/osafamfd
var/log/opensaf/osafamfnd
var/log/opensaf/osafimmnd
var/log/opensaf/osafsmfd
var/log/opensaf/osafsmfnd
var/log/messages
which includes the syslog for sc1, but the crash
does not occur here either and in fact the ticket suggests that this
crash is seen only (so far) on payloads.
------
sc2_logs.tgz contains:
var/log/messages
var/log/opensaf/osafamfd
var/log/opensaf/osafamfnd
var/log/opensaf/osafimmnd
var/log/opensaf/osafsmfd
var/log/opensaf/osafsmfnd
This appears to be the correct syslog for sc2.
Same here: no IMMND crash observed.
So I need a set of syslogs that includes the PL where the crash occurs and the
SC where the IMMND coord (which generates the imm sync that causes the crash)
runs.
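
To check quickly whether a given syslog contains this crash, and to pull out the implementer name and ids involved, something along these lines could be used. It is a sketch only: the pattern is derived from the LOG_ER message quoted in the ticket, the function name `find_sync_verify_errors` is hypothetical, and it assumes each syslog entry sits on a single line (not wrapped as in this mail).

```python
import re

# Pattern based on the "Sync-verify" error text quoted in the ticket.
# Assumption: the entry is on one line in the actual syslog file.
PATTERN = re.compile(
    r"(?P<host>\S+) osafimmnd\[\d+\]: ER Sync-verify: Established node has "
    r"different Implementer-id: (?P<local_id>\d+) for name: (?P<name>[^,]+), "
    r"sync says (?P<sync_id>\d+)"
)


def find_sync_verify_errors(lines):
    """Return (host, implementer name, local id, id from sync) per matching line."""
    hits = []
    for line in lines:
        m = PATTERN.search(line)
        if m:
            hits.append(
                (m["host"], m["name"], int(m["local_id"]), int(m["sync_id"]))
            )
    return hits
```

Passing an open `var/log/messages` file object as `lines` yields one tuple per occurrence; an empty list means the crash signature is not present in that file.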
---
**[tickets:#855] immnd crash on payload node**
**Status:** accepted
**Milestone:** 4.3.3
**Created:** Tue Apr 15, 2014 10:11 AM UTC by surender khetavath
**Last Updated:** Fri Apr 25, 2014 03:56 PM UTC
**Owner:** Anders Bjornerstedt
case:
1) A component calls exit() when active_cbk is received.
Because of the continuous faults, all the components on all the nodes received
active-cbk and called exit() within the comp, and the cluster went for reboot.
That is expected. But immnd crashed on PL-4 and PL-5.
/var/log/messages on PL-4 shows:
Apr 15 15:12:01 PL-4 osafimmnd[7650]: ER Sync-verify: Established node has different Implementer-id: 0 for name: @COMP2SU1TWONAPP, sync says 109.
Apr 15 15:12:01 PL-4 osafamfnd[7668]: NO 'safComp=IMMND,safSu=PL-4,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'componentRestart'
(gdb) fr 2
#2 0x000000000043b7ec in ImmModel::finalizeSync (this=0x6baa00, req=0x7fffcb25d160, isCoord=false, isSyncClient=false) at ImmModel.cc:14482
14482 abort();
(gdb) l
14477
14478 if(!explained) {
14479 LOG_ER("Sync-verify: Established node has different "
14480 "Implementer-id: %u for name: %s, sync says %u. ",
14481 info->mId, implName.c_str(), ii->id);
14482 abort();
14483 }
14484
14485 } else if(info->mNodeId != ii->nodeId) {
14486 LOG_ER("Sync-verify: Missmatch on node-id "
(gdb) p explained
$1 = false
(gdb) q
logs attached and gdb output attached.