On 04/07/11 23:16, Ulrich Windl wrote:
> Tim Serong <[email protected]> wrote on 04.07.2011 at 13:34 in message
> <[email protected]>:
>> On 04/07/11 19:48, Ulrich Windl wrote:
>>> Hi!
>>>
>>> This was found in SLES11 SP1 (Version:
>>> 1.1.5-5bd2b9154d7d9f86d7f56fe0a74072a5a6590c60): A resource is being
>>> displayed as "(unmanaged) FAILED".
>>> I used "crm resource manage prm" to set the resource back to managed mode.
>>> However the resource is still displayed as "unmanaged" by "crm_mon". When
>>> inspecting the resource with "crm configure", the attribute is there as 'meta
>>> is-managed="true"'. So I guess the change in the CIB did not make its way to
>>> crm_mon. Don't ask me how or why; I'm asking you.
>>
>> I'd guess the cluster attempted to stop the resource for some reason,
>> but the stop failed, and STONITH is not configured. In this situation,
>> the cluster can't manage the resource (it's not safely/cleanly stopped,
>> and there's no way to kill the node it was running on to be sure).
>
> Hi Tim!
>
> You are correct: When I had STONITH enabled both nodes were periodically
> rebooting. That was not fun. I'm trying to find out what's going on. Not as
> easy as I'd wish...
>
> I feel CRM is in "insulted mode": It does very little with failed resources.
> Do I really have to reboot the node to enable resource management?

If "stop" fails, there's not much it can do, because in the worst case,
there's no safe way to recover from that situation. On that note, you
might find http://ourobengr.com/ha useful.

That being said, if *you* are looking at the system and you know the
resource is cleanly stopped (even though the cluster failed to stop it
for some reason), try "crm resource cleanup prm" and see if it comes
good again. Or, restart corosync/openais on that node.

But! Check the logs to see why the stop failed in the first place, and
fix that :)

Regards,

Tim
-- 
Tim Serong <[email protected]>
Senior Clustering Engineer, OPS Engineering, Novell Inc.
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
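
For reference, the order of the steps suggested above (look at the logs, confirm by hand that the resource is stopped, then clean up) could be sketched as a small shell script. This is only an illustration, not a tested procedure: the helper names "run" and "recover_resource" are made up for this example, the log path /var/log/messages is an assumption about the node's syslog setup, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to really run them on a cluster node.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper wrapping the steps from the mail above.
recover_resource() {
    rsc="$1"
    # 1. Look for the reason the stop failed (log path is an assumption).
    run grep "$rsc" /var/log/messages
    # 2. Only once you have verified by hand that the resource is really
    #    stopped: clear its failed state so the cluster re-evaluates it.
    run crm resource cleanup "$rsc"
    # 3. One-shot status to see whether it still shows "(unmanaged) FAILED".
    run crm_mon -1
}

recover_resource "${1:-prm}"
```

The dry-run default is deliberate: a cleanup on a resource that is not actually stopped is exactly the unsafe case described above, so the commands should only be executed after the manual check.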
