Hi Lars and Andrew,

I have been thinking about how to tell tengine how much it should lengthen
its timeout without telling it the ids of the STONITH resources.

My idea is the following.

(1) add stonith op in <operations>.
    For example:
    <clone id="clnStonith">
      [...snip...]
      <group id="grpStonith">
        <primitive id="cgpStonith-kdumpcheck" class="stonith" type="external/kdumpcheck">
          [...snip...]
          <operations>
            <op name="stonith" interval="0" id="cgpStonith-kdumpcheck-stonith" timeout="60s"/>
          </operations>
        </primitive>
        <primitive id="cgpStonith-ssh" class="stonith" type="external/ssh">
          [...snip...]
          <operations>
            <op name="stonith" interval="0" id="cgpStonith-ssh-stonith" timeout="20s"/>
          </operations>
        </primitive>
      </group>
    </clone>
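
As a rough illustration of step (1), and not code from the prototype, the
per-plugin timeouts could be read from such a configuration as sketched
below. The helper names timeout_to_ms and stonith_op_timeouts are
hypothetical, the "[...snip...]" parts are simply left out, and only the "s"
and "ms" suffixes are handled.

```python
import xml.etree.ElementTree as ET

# Same layout as the example above, with the snipped parts omitted.
CIB_FRAGMENT = """
<clone id="clnStonith">
  <group id="grpStonith">
    <primitive id="cgpStonith-kdumpcheck" class="stonith" type="external/kdumpcheck">
      <operations>
        <op name="stonith" interval="0" id="cgpStonith-kdumpcheck-stonith" timeout="60s"/>
      </operations>
    </primitive>
    <primitive id="cgpStonith-ssh" class="stonith" type="external/ssh">
      <operations>
        <op name="stonith" interval="0" id="cgpStonith-ssh-stonith" timeout="20s"/>
      </operations>
    </primitive>
  </group>
</clone>
"""


def timeout_to_ms(value):
    """Convert "60s" or "500ms" to milliseconds (simplified assumption)."""
    value = value.strip()
    if value.endswith("ms"):
        return int(value[:-2])
    if value.endswith("s"):
        return int(value[:-1]) * 1000
    raise ValueError("unsupported timeout format: %r" % value)


def stonith_op_timeouts(xml_text):
    """Map each stonith-class primitive id to its "stonith" op timeout in ms."""
    root = ET.fromstring(xml_text)
    result = {}
    for prim in root.iter("primitive"):
        if prim.get("class") != "stonith":
            continue
        for op in prim.iter("op"):
            if op.get("name") == "stonith":
                result[prim.get("id")] = timeout_to_ms(op.get("timeout"))
    return result


timeouts = stonith_op_timeouts(CIB_FRAGMENT)
print(timeouts)
```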

(2) add 3 items in action graph.
    i) CRM_meta_plugin_num: the number of STONITH plugins running in the
       cluster.
    ii) CRM_meta_stonith_plugin_dataset: the id and timeout of each STONITH
        plugin. The format is "resource_id=timeout_value(ms)", and the
        delimiter is " ".
    iii) CRM_meta_total_plugin_timeout: the sum of all STONITH plugins'
         timeout values.
    For example:
    <crm_event id="22" operation="stonith" operation_key="stonith" on_node="node1" on_node_uuid="c064967c-147b-4a28-a3f8-a3f23d637edd">
      <attributes CRM_meta_on_node="node1"
                  CRM_meta_on_node_uuid="ebe5a7cb-608e-4df1-b2c1-5955c5083c2a"
                  CRM_meta_plugin_num="4"
                  CRM_meta_stonith_action="reboot"
                  CRM_meta_stonith_plugin_dataset="cgpStonith-kdumpcheck:0=60000 cgpStonith-ssh:0=20000 cgpStonith-kdumpcheck:1=60000 cgpStonith-ssh:1=20000"
                  CRM_meta_total_plugin_timeout="160000"
                  crm_feature_set="3.0" />
    </crm_event>
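
A minimal sketch of how the three values in step (2) relate to each other,
assuming the list of running plugin instances and their timeouts in
milliseconds is already known. The function name build_stonith_meta is
hypothetical and is not part of pengine.

```python
def build_stonith_meta(plugins):
    """Build the three proposed attributes from (instance_id, timeout_ms) pairs."""
    dataset = " ".join("%s=%d" % (rid, ms) for rid, ms in plugins)
    return {
        "CRM_meta_plugin_num": str(len(plugins)),
        "CRM_meta_stonith_plugin_dataset": dataset,
        "CRM_meta_total_plugin_timeout": str(sum(ms for _, ms in plugins)),
    }


# The four clone instances from the example above.
meta = build_stonith_meta([
    ("cgpStonith-kdumpcheck:0", 60000),
    ("cgpStonith-ssh:0", 20000),
    ("cgpStonith-kdumpcheck:1", 60000),
    ("cgpStonith-ssh:1", 20000),
])
for name, value in meta.items():
    print("%s=%s" % (name, value))
```

With these inputs the total is 60000 + 20000 + 60000 + 20000 = 160000 ms,
which is the value shown in the example.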

(3) in tengine, lengthen transition_timeout based on
    CRM_meta_total_plugin_timeout.
    tengine doesn't need to know which STONITH device is going to be used.
    In addition, also lengthen the timeout value that it passes to stonithd.

(4) in stonithd, parse CRM_meta_stonith_plugin_dataset, making use of
    CRM_meta_plugin_num, when it performs a fence operation.
    Then set a timeout for the plugin it is about to execute
    with SetTrackedProcTimeouts(), as lrmd does.
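
The parsing in step (4) might look roughly like the sketch below. It is
written in Python only for illustration (stonithd itself is C), the name
parse_plugin_dataset is hypothetical, and using CRM_meta_plugin_num as a
consistency check on the number of entries is an assumption about how the
count would be used.

```python
def parse_plugin_dataset(dataset, plugin_num):
    """Parse "id=timeout_ms id=timeout_ms ..." into a dict of id -> timeout in ms.

    plugin_num is the expected number of entries; a mismatch is an error.
    """
    entries = {}
    for item in dataset.split():
        # Split on the last "=" so the resource id is kept whole.
        rid, sep, ms = item.rpartition("=")
        if not sep or not rid:
            raise ValueError("malformed entry: %r" % item)
        entries[rid] = int(ms)
    if len(entries) != plugin_num:
        raise ValueError("expected %d plugins, found %d" % (plugin_num, len(entries)))
    return entries


dataset = ("cgpStonith-kdumpcheck:0=60000 cgpStonith-ssh:0=20000 "
           "cgpStonith-kdumpcheck:1=60000 cgpStonith-ssh:1=20000")
plugin_timeouts = parse_plugin_dataset(dataset, 4)

# Timeout to apply to the plugin instance that is about to be executed.
print(plugin_timeouts["cgpStonith-ssh:0"])
```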



Honestly, I want to get the information of the STONITH plugins that are
running on the node that is going to perform the STONITH operation.
But I have no idea how to get it in pengine.

I implemented a prototype, and it seems to work well.
I would like to hear your opinions.


Best Regards,
Satomi Taniguchi

_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
