---
**[tickets:#1157] IMMD coredump**
**Status:** unassigned
**Milestone:** 4.6.FC
**Created:** Tue Oct 07, 2014 12:57 AM UTC by Adrian Szwej
**Last Updated:** Tue Oct 07, 2014 12:57 AM UTC
**Owner:** nobody
Changeset: **4.6.M0 - 6009:b2ddaa23aae4**
When starting ~50 Linux containers, IMMD coredumps, resulting in a cluster reset.
Communication is TCP.
dtmd.conf configuration is:
DTM_SOCK_SND_RCV_BUF_SIZE=65536
DTM_CLUSTER_ID=1
DTM_NODE_IP=172.17.1.42
DTM_MCAST_ADDR=224.0.0.6
BatchSize reduced to 4096
opensafImm=opensafImm,safApp=safImmService
Name                                               Type         Value(s)
========================================================================
opensafImmSyncBatchSize                            SA_UINT32_T  4096 (0x1000)
When node PL-51 joins the cluster, the following messages are seen in the syslog:
Oct 6 00:35:57 SC-1 osafdtmd[1028]: NO Established contact with 'PL-51'
Oct 6 00:35:57 SC-1 osafimmd[1063]: NO Extended intro from node 2330f
Oct 6 00:35:57 SC-1 osafimmd[1063]: NO Node 2330f request sync sync-pid:79 epoch:0
Oct 6 00:35:58 SC-1 osafimmnd[1072]: NO Announce sync, epoch:292
Oct 6 00:35:58 SC-1 osafimmnd[1072]: NO SERVER STATE: IMM_SERVER_READY --> IMM_SERVER_SYNC_SERVER
Oct 6 00:35:58 SC-1 osafimmnd[1072]: NO NODE STATE-> IMM_NODE_R_AVAILABLE
Oct 6 00:35:58 SC-1 osafimmd[1063]: NO Successfully announced sync. New ruling epoch:292
Oct 6 00:35:58 SC-1 osafimmloadd: NO Sync starting
Oct 6 00:36:00 SC-1 osafimmd[1063]: MDTM unsent message is more!=200
Oct 6 00:36:00 SC-1 osafimmnd[1072]: WA Director Service in NOACTIVE state - fevs replies pending:9 fevs highest processed:20037
Oct 6 00:36:00 SC-1 osafamfnd[1143]: NO 'safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'nodeFailfast'
Oct 6 00:36:00 SC-1 osafamfnd[1143]: ER safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery is:nodeFailfast
Oct 6 00:36:00 SC-1 osafamfnd[1143]: Rebooting OpenSAF NodeId = 131343 EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 131343, SupervisionTime = 60
Oct 6 00:36:00 SC-1 opensaf_reboot: Rebooting local node; timeout=60
Oct 6 00:36:00 SC-1 osafimmnd[1072]: NO No IMMD service => cluster restart, exiting
There is a coredump generated:
core_1412555760.osafimmd.1063
---