On Thu, Aug 17, 2006 at 05:53:36PM +0200, Andrew Beekhof wrote:
> On 8/17/06, Andrew Beekhof <[EMAIL PROTECTED]> wrote:
> >On 8/17/06, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote:
> >> Hello,
> >>
> >> Don't know if that's the way it's supposed to be, but doesn't look
> >> like it to me. Here's the sequence of events as observed:
> >>
> >> - a group is put into the unmanaged mode
> >> - one of the resources from that group is stopped manually
> >> - a monitor operation runs on this resource and crm_mon prints
> >>   FAILED next to it
> >>
> >> This particular resource is configured to be monitored every 2
> >> minutes and if I wait long enough the monitor operation will be
> >> run. It seems like that these operations are scheduled by the LRM,
> >> right?
> >>
> >> Is there a way to tell LRM not to run any operations on a resource
> >> which is not managed?
> >
> > [...]
> >
> >example: apache needs the disk.
> >
> >if you put the disk into unmanaged mode and stop it (without the CRM
> >knowing) then apache monitoring is going to fail. shortly after we'll
> >try and restart it which will fail too and prevent apache from ever
> >running on that node again until the failure is cleared.
> >
> >if the CRM knows the disk is stopped, then apache fail to start
>
> that should read "apache WONT fail to start"
>
> >because we wont try to... that would require starting the disk first
> >which is something we're not allowed to do.

Yes, I completely agree with you. But you should take into account
that resources going into the unmanaged mode means that a living
human being requested it. We should assume that they know what they
are doing (though positive experience tells us otherwise). I think
that the cluster should relinquish control in that case. Anyway, one
can probably screw the cluster by indiscriminately managing
resources without prior notice.

On Thu, Aug 17, 2006 at 08:38:26PM +0200, Lars Marowsky-Bree wrote:
> On 2006-08-17T17:52:43, Andrew Beekhof <[EMAIL PROTECTED]> wrote:
>
> > the reason i've not done this in the past is because although we're
> > not managing the resource, its doesn't necessarily follow that we
> > don't care if its running or not (more specifically the resources that
> > depend on it and *are* managed probably do care).
> >
> > example: apache needs the disk.
>
> "unmanaged" might be used as a maintenance mode thingy, during which the
> resource might appear to be failed to a monitor passing by. No action
> should come from this, though.

That's why I'm actually using it: to do management without the
cluster interfering. My idea was that once a resource is put into
the "unmanaged" mode, the cluster should forget about it. Once it
gets back to being managed, the cluster should not presume anything
about its state, but establish it through probing. In case the state
of affairs is such that the thing becomes unrecoverable, well, we
can just cry foul. After all, somebody was messing with the resource
and failed to put things back into a meaningful state.

> _Probably_ this means that unmanaged=1 needs to propagate upwards. Hrm.
> I'm not sure.

In order to satisfy constraints? I guess that it's not easy to
answer this, but probably the only right way would be to recalculate
everything.
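
To make the "forget about it, then probe" idea above a bit more
concrete, here is a rough Python sketch of the behaviour I have in
mind. It is only a toy model and not CRM or LRM code: the names
Resource, Cluster, probe, set_managed and on_monitor_result are all
made up for illustration, and the real thing would have to deal
with dependencies and constraints, which this ignores.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Resource:
    name: str
    managed: bool = True
    # Last state the cluster believes in; None means "unknown, must probe".
    known_state: Optional[str] = "started"


class Cluster:
    """Toy model of the proposal; none of this is a real CRM/LRM API."""

    def __init__(self, probe: Callable[[str], str]) -> None:
        # Hypothetical probe callback: returns "started" or "stopped".
        self.probe = probe
        self.resources: Dict[str, Resource] = {}
        self.failures: List[str] = []

    def add(self, name: str) -> None:
        self.resources[name] = Resource(name)

    def set_managed(self, name: str, managed: bool) -> None:
        res = self.resources[name]
        res.managed = managed
        if not managed:
            # Relinquish control: forget whatever we thought we knew.
            res.known_state = None
        else:
            # Back under management: presume nothing, establish the
            # state by probing.
            res.known_state = self.probe(name)

    def on_monitor_result(self, name: str, observed: str) -> None:
        res = self.resources[name]
        if not res.managed:
            # A human is working on it; do not record a failure.
            return
        if observed != res.known_state:
            self.failures.append(name)
        res.known_state = observed
```

With this model, stopping an unmanaged resource by hand and letting
a monitor pass by records nothing, and switching the resource back
to managed simply picks up whatever state the probe reports.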

Cheers,

Dejan
