On Wed, 12 Feb 2020 15:33:08 -0600 Ken Gaillot <[email protected]> wrote:
> On Wed, 2020-02-12 at 16:17 +0100, Jehan-Guillaume de Rorthais wrote:
> > Last mail for today :)
> >
> > At least three actions exist in the OCF specs and are not fully
> > supported in Pacemaker.
> >
> > 1. recovery
> >
> > Today, Pacemaker replaces this action with a stop/start or
> > demote/stop/start/promote transition. I bet some RAs (at least PAF)
> > would be able to deal with a recovery themselves faster in one action
> > than with 2 or 4 actions (without counting notify).
>
> In principle it should be possible to support "recover" when the stop
> and start are scheduled on the same node. It would be similar to how
> pacemaker currently changes stop+start to live migration only when
> certain conditions are met.

Sounds good! In fact, that's how we detect recovery during notify actions in
PAF: the resource is stopped and started in the same transition.

> One question would be how to handle "recover" failures. My first
> instinct is that if recover fails, the cluster should switch to
> stop+start, similar to a failed live migration. An alternative would be
> to retry the recover action up to the migration-threshold then switch
> to stop+start.

If live migration already behaves like that, then the first instinct seems
more coherent. But I have no strong opinion.

> > 2. migration-to and migration-from
> >
> > These two actions are only available for non-clone resources today.
> >
> > I would really appreciate having them for multi-state resources.
> > Think of switching over roles between primary and secondaries.
>
> I don't follow how using that to switch roles would be different from
> demote/promote with notifications.

When switching over roles between a primary and a secondary, there might be
some additional steps the resource needs to handle. Today, PAF handles this
during notify actions. But:

1. we need to detect the switchover by ourselves
2. as you know, the notify action return code is ignored. Should the
   switchover fail, we have to set a flag so the next action fails.

This adds a lot of code that is not really welcome in an RA :)

> > How hard would it be to add these actions? Is it something that could
>
> Most changes in pacemaker are big projects; these certainly would be.
> Anything that touches the scheduler tends to involve a lot of work.
> There are 35K lines of scheduler-related code and it's difficult to
> predict how a change in one part affects another.

OK. Thankfully, there are regression tests to at least test known behaviors.

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/
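
For readers unfamiliar with the recovery detection mentioned above, here is a minimal sketch of the idea, not PAF's actual code (PAF is written in Perl). It assumes the notify action receives the space-separated node lists Pacemaker exports as `OCF_RESKEY_CRM_meta_notify_stop_uname` and `OCF_RESKEY_CRM_meta_notify_start_uname`. The function name `is_recovery` and the node names are hypothetical.

```python
def is_recovery(env, node):
    """Return True when `node` is in both the stop and the start list of
    the current transition, i.e. the resource is restarted in place."""
    # Assumption: both variables hold space-separated node names and may be
    # missing or empty when nothing is being stopped or started.
    stopping = env.get("OCF_RESKEY_CRM_meta_notify_stop_uname", "").split()
    starting = env.get("OCF_RESKEY_CRM_meta_notify_start_uname", "").split()
    return node in stopping and node in starting


# Hypothetical transition: srv1 is stopped and started again, srv2 only stopped.
example_env = {
    "OCF_RESKEY_CRM_meta_notify_stop_uname": "srv1 srv2",
    "OCF_RESKEY_CRM_meta_notify_start_uname": "srv1",
}
print(is_recovery(example_env, "srv1"))
print(is_recovery(example_env, "srv2"))
```

In a real agent the dictionary would be the process environment (`os.environ`), and the local node name would come from the cluster rather than being hard-coded.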
