Hi AndersBj,
Reviewed the patch.
Ack.
/Neel.
On Tuesday 03 December 2013 05:49 PM, Anders Bjornerstedt wrote:
> osaf/services/saf/immsv/immnd/immnd_evt.c | 7 ++++++-
> 1 files changed, 6 insertions(+), 1 deletions(-)
>
>
> If an IMMND receives a discard-node message and the node to be discarded
> is the node the IMMND is executing on, then this is a clear indication of
> cluster network partitioning, or "split brain". The action taken by the IMMND
> was to osafassert if this happened. But such an assert generates a coredump
> resulting in unnecessary tickets on and troubleshooting of the IMM.
>
> With this patch, the error is instead logged to the syslog and then IMMND
> exits.
> It should then be clear why the IMMND restarts and that it was not due to an
> error in the IMMND.
>
> diff --git a/osaf/services/saf/immsv/immnd/immnd_evt.c b/osaf/services/saf/immsv/immnd/immnd_evt.c
> --- a/osaf/services/saf/immsv/immnd/immnd_evt.c
> +++ b/osaf/services/saf/immsv/immnd/immnd_evt.c
> @@ -8407,7 +8407,12 @@ static void immnd_evt_proc_discard_node(
> SaUint32T arrSize = 0;
> TRACE_ENTER();
> osafassert(evt);
> - osafassert(evt->info.ctrl.nodeId != cb->node_id);
> + if(evt->info.ctrl.nodeId == cb->node_id) {
> + LOG_ER("immnd_evt_proc_discard_node for *this* node %u => "
> + "Cluster partitioned (\"split brain\") - exiting",
> + cb->node_id);
> + exit(1);
> + }
> LOG_NO("Global discard node received for nodeId:%x pid:%u",
> evt->info.ctrl.nodeId, evt->info.ctrl.ndExecPid);
> /* We should remember the nodeId/pid pair to avoid a redundant message
> causing a newly reattached node being discarded.
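
For reference, the pattern in the patch (log the condition and exit instead of asserting) can be sketched standalone. This is only an illustration under assumptions: the struct names, `discard_targets_self()` and `handle_discard_node()` are hypothetical stand-ins, and plain POSIX `syslog()` is used in place of the OpenSAF `LOG_ER`/`LOG_NO` macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <syslog.h>

/* Hypothetical stand-ins for the IMMND event and control block. */
struct discard_evt { uint32_t node_id; };
struct node_cb { uint32_t node_id; };

/* True when the discard message names the node we are running on. */
static bool discard_targets_self(const struct discard_evt *evt,
                                 const struct node_cb *cb)
{
    return evt->node_id == cb->node_id;
}

static void handle_discard_node(const struct node_cb *cb,
                                const struct discard_evt *evt)
{
    /* A NULL pointer here is a programming error, so it stays an assert. */
    assert(cb != NULL && evt != NULL);

    if (discard_targets_self(evt, cb)) {
        /* Network partition is an environmental fault, not a bug:
         * log the reason and exit without producing a coredump. */
        syslog(LOG_ERR, "discard-node received for *this* node %x => "
               "cluster partitioned (\"split brain\") - exiting",
               (unsigned)cb->node_id);
        exit(EXIT_FAILURE);
    }

    syslog(LOG_NOTICE, "Global discard node received for nodeId:%x",
           (unsigned)evt->node_id);
}
```

The split keeps the assert for conditions that can only arise from a coding error, while the externally triggered condition gets a syslog entry that explains the restart.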