Re: [ClusterLabs Developers] Opinions wanted: OCF agent types
On Thu, Aug 18, 2022 at 1:06 AM Reid Wahl wrote:
>
> On Wed, Aug 17, 2022 at 1:01 PM Ken Gaillot wrote:
> >
> > "Regular" OCF agents would be (for example)
> > ocf:service:heartbeat:apache in full, but for backward compatibility
> > "service" would be the default, and ocf:heartbeat:apache would continue
> > to work.
>
> It sounds like a good idea.
>
> With regard to "service" as the default OCF resource agent type, this
> may be confusing since we already have a "service" standard.

Stumbled over that as well ... maybe simply 'resource'?

Klaus

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs Developers] Opinions wanted: OCF agent types
On Wed, Aug 17, 2022 at 1:01 PM Ken Gaillot wrote:
>
> "Regular" OCF agents would be (for example)
> ocf:service:heartbeat:apache in full, but for backward compatibility
> "service" would be the default, and ocf:heartbeat:apache would continue
> to work.

It sounds like a good idea.

With regard to "service" as the default OCF resource agent type, this
may be confusing since we already have a "service" standard.

--
Regards,

Reid Wahl (He/Him)
Senior Software Engineer, Red Hat
RHEL High Availability - Pacemaker
[ClusterLabs Developers] Opinions wanted: OCF agent types
Hi all,

OCF 1.1 hasn't been out that long, but I'm already looking ahead to OCF
1.2 (which would remain backward-compatible).

One big addition I'm contemplating is defining OCF resource agent
types, to address these problems:

* Fence agents have a completely different standard from OCF resource
  agents, and lack some of the features available to OCF agents (such as
  meaningful error statuses and exit reasons for failures).

* Pacemaker's node health feature uses OCF agents to monitor node
  conditions, but there are some user pain points involved since they are
  indistinguishable from regular OCF agents.

* In the past there has been discussion of implementing "storage
  agents" to help manage replication of external storage devices,
  primarily for disaster recovery purposes.

Visually, the agent type would be another field in the agent
specification, for example ocf:fence:heartbeat:iscsi or
ocf:health:pacemaker:cpu.

"Regular" OCF agents would be (for example)
ocf:service:heartbeat:apache in full, but for backward compatibility
"service" would be the default, and ocf:heartbeat:apache would continue
to work.

Alternatively, if we want to keep it to three fields, we could do
something like ocf-fence:heartbeat:iscsi and ocf-health:pacemaker:cpu.

The OCF standard would have a shared section that all agent types would
be required to support. This could include things like exit status
codes, environment variables, and the meta-data action. Each agent type
would then have its own section with anything specific to that type --
for example, service agents need to support start and stop actions,
while fence agents need to support off and optionally reboot.

The benefits would include:

* Agent writers would have fewer differences to worry about and
  libraries to learn.

* Pacemaker and higher-level tools could easily distinguish agent types
  and respond intelligently. For example, higher-level shells could list
  all health agents and clone them automatically when used, and Pacemaker
  could automatically exempt health agents from health restrictions so
  that the agent can automatically detect when the node becomes healthy
  again.

* We would have a framework for adding new types if the need arises.

Thoughts?

--
Ken Gaillot
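
To make the naming schemes in the proposal concrete, here is a minimal sketch of how a higher-level tool might parse them. It assumes the field count alone distinguishes the legacy three-field form from the typed four-field form, and that "service" is the default type as proposed. The names AgentSpec, parse_agent_spec, and required_actions are hypothetical and not part of OCF or Pacemaker, and treating meta-data as a shared required action is only one possible reading of the "shared section" idea.

```python
from dataclasses import dataclass

# Agent types named in the proposal. "storage" is only mentioned as a past
# discussion, so it is left out of this illustrative set.
KNOWN_TYPES = {"service", "fence", "health"}
DEFAULT_TYPE = "service"  # proposed default for backward compatibility

# Assumption: the shared section would require meta-data from every type.
COMMON_ACTIONS = {"meta-data"}

# Type-specific actions taken from the examples in the message.
# "reboot" is optional for fence agents, so it is not listed as required.
TYPE_ACTIONS = {
    "service": {"start", "stop"},
    "fence": {"off"},
}


@dataclass(frozen=True)
class AgentSpec:
    standard: str
    agent_type: str
    provider: str
    name: str


def parse_agent_spec(spec: str) -> AgentSpec:
    """Parse the four-field, legacy three-field, or ocf-TYPE three-field form."""
    fields = spec.split(":")
    standard = fields[0]

    if standard == "ocf" and len(fields) == 4:
        # ocf:fence:heartbeat:iscsi
        _, agent_type, provider, name = fields
    elif standard == "ocf" and len(fields) == 3:
        # ocf:heartbeat:apache (legacy form, type defaults to "service")
        _, provider, name = fields
        agent_type = DEFAULT_TYPE
    elif standard.startswith("ocf-") and len(fields) == 3:
        # ocf-fence:heartbeat:iscsi (three-field alternative)
        _, provider, name = fields
        agent_type = standard[len("ocf-"):]
        standard = "ocf"
    else:
        raise ValueError(f"unrecognized agent specification: {spec!r}")

    if agent_type not in KNOWN_TYPES:
        raise ValueError(f"unknown agent type {agent_type!r} in {spec!r}")
    return AgentSpec(standard, agent_type, provider, name)


def required_actions(spec: str) -> set:
    """Return the shared actions plus any required by the agent's type."""
    agent = parse_agent_spec(spec)
    return COMMON_ACTIONS | TYPE_ACTIONS.get(agent.agent_type, set())
```

With this sketch, ocf:heartbeat:apache and ocf:service:heartbeat:apache parse to the same AgentSpec, and ocf:fence:heartbeat:iscsi and ocf-fence:heartbeat:iscsi do as well, so a tool could accept either scheme while filtering agents by type.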