On Feb 21, 2008, at 2:05 PM, Andreas Kurz wrote:
On Thu, Feb 21, 2008 at 12:22 PM, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote:
Hi,
On Thu, Feb 21, 2008 at 12:19:55AM +0100, Johan Hoeke wrote:
LS,
Running a 2 node cluster, heartbeat-2.1.3-3 centos rpms, RH AS 4.6
While testing a "maintenance scenario" for the cluster I set all
resources to is_managed=false,
Feb 20 21:09:41 sierpinski pengine: [15725]: notice: native_print: R_BB10PRD_DB (heartbeat::ocf:oracle): Started sierpinski.uvt.nl (unmanaged)
and proceeded to shut down Oracle by hand, Oracle being one of the
resources.
Feb 20 21:12:03 sierpinski oracle[23120]: [23145]: INFO: Oracle instance BB10PRD is down
Within minutes, the node was stonithed. The log shows that this was
right after the monitor operation for the oracle resource came back
with return code 7:
Feb 20 21:12:03 sierpinski crmd: [4584]: info: process_lrm_event: LRM operation R_BB10PRD_DB_monitor_120000 (call=31, rc=7) complete
Feb 20 21:12:03 mandelbrot stonithd: [4580]: info: stonith_operate_locally::2375: sending fencing op (RESET) for sierpinski.uvt.nl to device external (rsc_id=R_ilo_sierpinski:0, pid=5414)
Feb 20 21:12:03 mandelbrot stonithd: [4580]: info: Node mandelbrot.uvt.nl try to help node sierpinski.uvt.nl to fence node sierpinski.uvt.nl.
Conclusion: the monitor operation was still running even though the
resource was unmanaged, and it forced a fencing action.
Oops. So there's an on_fail=fence for this monitor operation. Is
that necessary?
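One way to check which monitor operations carry on_fail=fence is to scan
a saved copy of the resources section. The following is only a rough
sketch: the XML fragment, the op ids, the interval and the timeout values
are made up for illustration, and the layout is assumed to follow the
usual primitive/operations/op nesting. The helper name fencing_monitors
is hypothetical too.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified resources fragment. The op ids, interval and
# timeout values are assumptions, not taken from the real configuration.
SAMPLE = """
<resources>
  <primitive id="R_BB10PRD_DB" class="ocf" provider="heartbeat" type="oracle">
    <operations>
      <op id="R_BB10PRD_DB_mon" name="monitor" interval="120s" timeout="60s" on_fail="fence"/>
      <op id="R_BB10PRD_DB_start" name="start" timeout="120s"/>
    </operations>
  </primitive>
</resources>
"""


def fencing_monitors(xml_text):
    """Return (resource id, op id) pairs for monitor ops with on_fail=fence."""
    root = ET.fromstring(xml_text)
    hits = []
    for prim in root.iter("primitive"):
        for op in prim.iter("op"):
            if op.get("name") == "monitor" and op.get("on_fail") == "fence":
                hits.append((prim.get("id"), op.get("id")))
    return hits


if __name__ == "__main__":
    for rsc, op in fencing_monitors(SAMPLE):
        print(f"{rsc}: monitor op {op} has on_fail=fence")
```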
I then made a script which, in addition to changing the resources to
is_managed=false, also sets the monitor operations to disabled=true.
This worked: I am now able to shut down Oracle by hand without
triggering a fencing action.
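For illustration, the monitor-disabling half of such a script might look
roughly like the sketch below. It is not the actual script; it only
rewrites a saved XML fragment in memory and does not talk to the cluster.
The fragment, the op ids, and the helper name disable_monitors are
assumptions, as is the premise that the op element accepts a "disabled"
attribute in this form.

```python
import xml.etree.ElementTree as ET

# Hypothetical resources fragment; ids, interval and timeout are made up.
FRAGMENT = """
<resources>
  <primitive id="R_BB10PRD_DB" class="ocf" provider="heartbeat" type="oracle">
    <operations>
      <op id="R_BB10PRD_DB_mon" name="monitor" interval="120s" timeout="60s" on_fail="fence"/>
      <op id="R_BB10PRD_DB_start" name="start" timeout="120s"/>
    </operations>
  </primitive>
</resources>
"""


def disable_monitors(xml_text):
    """Set disabled="true" on every monitor op.

    Returns the modified XML as a string plus the ids of the ops changed.
    Non-monitor operations are left untouched.
    """
    root = ET.fromstring(xml_text)
    changed = []
    for op in root.iter("op"):
        if op.get("name") == "monitor":
            op.set("disabled", "true")
            changed.append(op.get("id"))
    return ET.tostring(root, encoding="unicode"), changed


if __name__ == "__main__":
    new_xml, changed = disable_monitors(FRAGMENT)
    print("disabled monitor ops:", ", ".join(changed))
    print(new_xml)
```

The modified XML would still have to be loaded back into the cluster
configuration by whatever means the script already uses, and the change
reverted once maintenance is over.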
Questions:
Is this expected behavior? Should monitor operations keep running even
though the resources are set to is_managed=false?
Yes. There was some discussion about it and the majority of
votes went this way, i.e. that monitoring should continue even
for the unmanaged resources.
I also agree that it is a good idea to continue monitoring unmanaged
resources, but I would see this behaviour as a bug if the "on_fail"
action is executed when it is "fence".
right, unmanaged resources probably shouldn't have their on_fail
action executed when they fail.
did someone log a bug for this yet?
Regards,
Andreas
Is explicitly setting
the monitor operations to disabled=true the "right way" to prevent
unwanted fencing actions during cluster maintenance?
I'd say yes. But note that I was also in favour of having
monitoring disabled by default.
Thanks,
Dejan
tia,
Johan
(happy to post hb_reports if requested)
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems