Andrew Beekhof wrote:
>
> On Feb 21, 2008, at 2:05 PM, Andreas Kurz wrote:
>
>> On Thu, Feb 21, 2008 at 12:22 PM, Dejan Muhamedagic
>> <[EMAIL PROTECTED]> wrote:
>>> Hi,
>>>
>>>
>>> On Thu, Feb 21, 2008 at 12:19:55AM +0100, Johan Hoeke wrote:
>>>> LS,
>>>>
>>>> Running a 2 node cluster, heartbeat-2.1.3-3 centos rpms, RH AS 4.6
>>>>
>>>> While testing a "maintenance scenario" for the cluster I set all
>>>> resources to is_managed is false,
>>>>
>>>> Feb 20 21:09:41 sierpinski pengine: [15725]: notice: native_print:
>>>> R_BB10PRD_DB (heartbeat::ocf:oracle): Started
>>>> sierpinski.uvt.nl (unmanaged)
>>>>
>>>>
>>>> and proceeded to shut oracle by hand, oracle being one of the
>>>> resources.
>>>>
>>>> Feb 20 21:12:03 sierpinski oracle[23120]: [23145]: INFO: Oracle
>>>> instance
>>>> BB10PRD is down
>>>>
>>>>
>>>> Within minutes, the node was stonithed. The log shows that this was
>>>> right after the monitor operation for the oracle resource came back
>>>> with
>>>> return code 7:
>>>>
>>>> Feb 20 21:12:03 sierpinski crmd: [4584]: info: process_lrm_event: LRM
>>>> operation R_BB10PRD_DB_monitor_120000 (call=31, rc=7) complete
>>>>
>>>> Feb 20 21:12:03 mandelbrot stonithd: [4580]: info:
>>>> stonith_operate_locally::2375: sending fencing op (RESET) for
>>>> sierpinski.uvt.nl to device external (rsc_id=R_ilo_sierpinski:0,
>>>> pid=5414)
>>>> Feb 20 21:12:03 mandelbrot stonithd: [4580]: info: Node
>>>> mandelbrot.uvt.nl try to help node sierpinski.uvt.nl to fence node
>>>> sierpinski.uvt.nl.
>>>>
>>>> Conclusion: the monitor operation was still running even though the
>>>> resource was unmanaged, and it forced a fencing action.
>>>
>>> Oops. So there's an on_fail=fence for this monitor operation. Is
>>> that necessary?
>>>
>>>
>>>> I then made a script which in addition to changing the resources to
>>>> is_managed = false also set the monitor operations to disabled=true.
>>>> This worked, now I am able to shutdown oracle by hand without a fencing
>>>> action starting up.
>>>>
>>>> Questions:
>>>>
>>>> Is this expected behavior? Should monitor operations keep running even
>>>> though the resources are set to is_managed=false?
>>>
>>> Yes. There was some discussion about it and the majority of
>>> votes went this way, i.e. that monitoring should continue even
>>> for the unmanaged resources.
>>
>> I also agree, that it is a good idea to continue monitoring for
>> unmanaged resources but I would see this behaviour as a bug if the
>> "on_fail" action is executed if it's "fence".

I wouldn't want an on_fail=restart to be executed either

>
> right, unmanaged resources probably shouldn't have their on_fail action
> executed when they fail.
> did someone log a bug for this yet?

not me, sorry :|
I did reopen bug 1768 as mentioned in this thread.
Have changed my clusters since then so I can't easily recreate the error
here I'm afraid.

Hope you enjoyed your vacation Andrew, and welcome back!

regards,

Johan
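
For reference, the behaviour the thread converges on (keep monitoring unmanaged resources, but do not run their on_fail action) could be sketched roughly as below. This is only an illustration under assumptions: the Resource class, the function name and the "report-only" / "none" return values are made up for the example and are not Heartbeat code or its API. The only value taken from the thread is rc=7, which is the OCF "not running" return code.

```python
from dataclasses import dataclass

OCF_SUCCESS = 0
OCF_NOT_RUNNING = 7  # the rc=7 reported by process_lrm_event in the log


@dataclass
class Resource:
    # Hypothetical stand-in for a cluster resource definition.
    name: str
    is_managed: bool
    on_fail: str  # e.g. "fence", "restart", "ignore"


def action_for_monitor_result(rsc: Resource, rc: int) -> str:
    """Decide what to do with a monitor result (illustrative policy only)."""
    if rc == OCF_SUCCESS:
        return "none"
    if not rsc.is_managed:
        # Unmanaged: the failure is still seen and logged,
        # but the on_fail action is deliberately not executed.
        return "report-only"
    return rsc.on_fail


if __name__ == "__main__":
    db = Resource("R_BB10PRD_DB", is_managed=False, on_fail="fence")
    print(action_for_monitor_result(db, OCF_NOT_RUNNING))
```

With a policy like this, shutting Oracle down by hand while the resource is unmanaged would still show up as a failed monitor, but would not lead to a fencing request.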
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
