Hi Lars and Andrew,
I have been thinking about how to tell tengine how much it should lengthen
its timeout without telling it the ids of the STONITH resources.
My idea is as follows.
(1) Add a "stonith" op to the <operations> of each STONITH resource.
For example:
<clone id="clnStonith">
[...snip...]
<group id="grpStonith">
<primitive id="cgpStonith-kdumpcheck" class="stonith"
type="external/kdumpcheck">
[...snip...]
<operations>
<op name="stonith" interval="0" id="cgpStonith-kdumpcheck-stonith"
timeout="60s"/>
</operations>
</primitive>
<primitive id="cgpStonith-ssh" class="stonith" type="external/ssh">
[...snip...]
<operations>
<op name="stonith" interval="0" id="cgpStonith-ssh-stonith"
timeout="20s"/>
</operations>
</primitive>
</group>
</clone>
(2) Add 3 items to the action graph.
i) CRM_meta_plugin_num: the number of STONITH plugins running in the cluster.
ii) CRM_meta_stonith_plugin_dataset: the id and timeout of each STONITH
plugin. Each entry has the format "resource_id=timeout_value(ms)", and
entries are separated by " ".
iii) CRM_meta_total_plugin_timeout: the sum of all STONITH plugins'
timeout values (ms).
For example:
<crm_event id="22" operation="stonith" operation_key="stonith"
on_node="node1" on_node_uuid="c064967c-147b-4a28-a3f8-a3f23d637edd">
<attributes CRM_meta_on_node="node1"
CRM_meta_on_node_uuid="ebe5a7cb-608e-4df1-b2c1-5955c5083c2a"
CRM_meta_plugin_num="4"
CRM_meta_stonith_action="reboot"
CRM_meta_stonith_plugin_dataset="cgpStonith-kdumpcheck:0=60000
cgpStonith-ssh:0=20000 cgpStonith-kdumpcheck:1=60000 cgpStonith-ssh:1=20000"
CRM_meta_total_plugin_timeout="160000"
crm_feature_set="3.0" />
</crm_event>
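To show how the three values relate, here is a minimal sketch (in Python, not
the prototype's code) that parses the dataset string from the example above
and derives the other two values from it. The function name
parse_plugin_dataset is hypothetical and used only for illustration.

```python
def parse_plugin_dataset(dataset):
    """Parse space-separated "resource_id=timeout_ms" entries into a dict."""
    timeouts = {}
    for entry in dataset.split():
        # Split on the last "=" so the id part is kept whole.
        resource_id, sep, value = entry.rpartition("=")
        if not sep or not resource_id:
            raise ValueError("malformed entry: %r" % entry)
        timeouts[resource_id] = int(value)
    return timeouts


# Value of CRM_meta_stonith_plugin_dataset from the example above.
dataset = ("cgpStonith-kdumpcheck:0=60000 cgpStonith-ssh:0=20000 "
           "cgpStonith-kdumpcheck:1=60000 cgpStonith-ssh:1=20000")

timeouts = parse_plugin_dataset(dataset)
plugin_num = len(timeouts)                 # CRM_meta_plugin_num
total_timeout_ms = sum(timeouts.values())  # CRM_meta_total_plugin_timeout
```

With the example data, the number of entries and the sum of the timeouts
agree with CRM_meta_plugin_num and CRM_meta_total_plugin_timeout.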
(3) In tengine, lengthen transition_timeout based on
CRM_meta_total_plugin_timeout.
tengine doesn't need to know which STONITH device is going to be used.
In addition, lengthen the timeout value that tengine passes to stonithd.
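As a rough sketch of step (3), written in Python only for illustration: I
assume here that "lengthen" means adding CRM_meta_total_plugin_timeout to the
existing timeout, and that the timeout is left unchanged when the attribute
is absent. The function name and the base value of 60000 ms are hypothetical.

```python
def lengthened_timeout_ms(base_timeout_ms, attributes):
    """Return the base timeout extended by the total plugin timeout, if given."""
    extra = attributes.get("CRM_meta_total_plugin_timeout")
    if extra is None:
        # No STONITH timeout information in the action: keep the old value.
        return base_timeout_ms
    # Assumption: the extension is additive.
    return base_timeout_ms + int(extra)


# Attributes as they appear on the stonith action in the example above.
attrs = {
    "CRM_meta_stonith_action": "reboot",
    "CRM_meta_total_plugin_timeout": "160000",
}
transition_timeout_ms = lengthened_timeout_ms(60000, attrs)
```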
(4) In stonithd, parse CRM_meta_stonith_plugin_dataset, using
CRM_meta_plugin_num, when it performs a fence operation.
Then set the timeout for the plugin it is about to execute
with SetTrackedProcTimeouts(), as lrmd does.
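The lookup in step (4) could look roughly like the sketch below (Python, for
illustration only, not the stonithd code). I assume CRM_meta_plugin_num is
used to check the number of entries, and that a default is used when the
plugin is not listed. The function name and the default value are
hypothetical, and the call to SetTrackedProcTimeouts() itself is not shown.

```python
def plugin_timeout_ms(dataset, plugin_num, resource_id, default_ms):
    """Look up the timeout (ms) for one plugin in the dataset string."""
    entries = dataset.split()
    # Assumption: plugin_num is a consistency check on the entry count.
    if len(entries) != plugin_num:
        raise ValueError("expected %d entries, got %d"
                         % (plugin_num, len(entries)))
    for entry in entries:
        rid, sep, value = entry.rpartition("=")
        if sep and rid == resource_id:
            return int(value)
    # Plugin not listed: fall back to the caller's default.
    return default_ms


dataset = ("cgpStonith-kdumpcheck:0=60000 cgpStonith-ssh:0=20000 "
           "cgpStonith-kdumpcheck:1=60000 cgpStonith-ssh:1=20000")
kdump_timeout = plugin_timeout_ms(dataset, 4, "cgpStonith-kdumpcheck:1", 30000)
unknown_timeout = plugin_timeout_ms(dataset, 4, "cgpStonith-other:0", 30000)
```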
Honestly, I would like to get the information about the STONITH plugins
running on the node that is going to perform the STONITH operation.
But I have no idea how to get that in pengine.
I implemented a prototype, and it seems to work well.
I would like to hear your opinions.
Best Regards,
Satomi Taniguchi