What do the logs say?
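
Also, looking at the posted cib.xml: with default-resource-failure-stickiness 
at 0 and no migration-threshold set, Pacemaker 1.0 will happily keep retrying 
the failed resource on the same node instead of moving it. One way to get a 
failover after a fixed number of failures is a migration-threshold in 
rsc_defaults. A sketch (the nvpair ids below are placeholders, pick your own; 
the threshold/timeout values are just examples):

```xml
<rsc_defaults>
  <meta_attributes id="rsc-options">
    <!-- move the resource to the other node after 3 failed operations -->
    <nvpair id="rsc-options-migration-threshold"
            name="migration-threshold" value="3"/>
    <!-- optional: expire the fail count after 10 minutes so the resource
         is allowed back on the original node -->
    <nvpair id="rsc-options-failure-timeout"
            name="failure-timeout" value="600s"/>
  </meta_attributes>
</rsc_defaults>
```

This fragment would go in place of the empty <rsc_defaults/> element in the 
posted configuration. Accumulated fail counts can then be inspected with 
crm_failcount and cleared with crm_resource --cleanup once the node is healthy 
again.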
jc

mike wrote:
> Thank you Jakob,
> I put them in a resource group as indicated but I am still seeing the 
> same behavior, i.e. if I stop httpd manually and then stop it from 
> restarting (by editing out the contents of /etc/init.d/httpd) the 
> cluster simply sits there and spins its wheels trying to restart httpd 
> on the primary node over and over and over again. At no point is a 
> failover initiated. Anyone know why stopping httpd in this manner will 
> not result in a failover?
>
> Here is my cib.xml
> <cib crm_feature_set="3.0.1" 
> dc-uuid="86b5c3f4-8202-45f7-91a8-64e17163bb7a" have-quorum="1" 
> remote-tls-port="0" validate-with="pacemaker-1.0"
> epoch="9" admin_epoch="0" num_updates="0" cib-last-written="Wed Apr  7 
> 15:11:06 2010">
>  <configuration>
>    <crm_config>
>      <cluster_property_set id="cib-bootstrap-options">
>        <nvpair id="nvpair.id17897268" name="symmetric-cluster" 
> value="true"/>
>        <nvpair id="nvpair.id17897737" name="no-quorum-policy" 
> value="stop"/>
>        <nvpair id="nvpair.id17897746" 
> name="default-resource-stickiness" value="0"/>
>        <nvpair id="nvpair.id17897755" 
> name="default-resource-failure-stickiness" value="0"/>
>        <nvpair id="nvpair.id17897413" name="stonith-enabled" 
> value="false"/>
>        <nvpair id="nvpair.id17897422" name="stonith-action" 
> value="reboot"/>
>        <nvpair id="nvpair.id17897431" name="startup-fencing" 
> value="true"/>
>        <nvpair id="nvpair.id17897704" name="stop-orphan-resources" 
> value="true"/>
>        <nvpair id="nvpair.id17897714" name="stop-orphan-actions" 
> value="true"/>
>        <nvpair id="nvpair.id17897723" name="remove-after-stop" 
> value="false"/>
>        <nvpair id="nvpair.id17898021" name="short-resource-names" 
> value="true"/>
>        <nvpair id="nvpair.id17898030" name="transition-idle-timeout" 
> value="5min"/>
>        <nvpair id="nvpair.id17898040" name="default-action-timeout" 
> value="20s"/>
>        <nvpair id="nvpair.id17897626" name="is-managed-default" 
> value="true"/>
>        <nvpair id="nvpair.id17897635" name="cluster-delay" value="60s"/>
>        <nvpair id="nvpair.id17897643" name="pe-error-series-max" 
> value="-1"/>
>        <nvpair id="nvpair.id17897653" name="pe-warn-series-max" 
> value="-1"/>
>        <nvpair id="nvpair.id17897329" name="pe-input-series-max" 
> value="-1"/>
>        <nvpair id="nvpair.id17897338" name="dc-version" 
> value="1.0.8-5443ff1ab132449ad5b236169403c6a23cf4168b"/>
>        <nvpair id="nvpair.id17897347" name="cluster-infrastructure" 
> value="Heartbeat"/>
>      </cluster_property_set>
>    </crm_config>
>    <nodes>
>      <node id="86b5c3f4-8202-45f7-91a8-64e17163bb7a" 
> uname="apauat1b.intranet.mydomain.com" type="normal"/>
>      <node id="dbd6016a-aab6-4130-87fb-80e954353b3b" 
> uname="apauat1a.intranet.mydomain.com" type="normal"/>
>    </nodes>
>    <resources>
>      <group id="web_cluster">
>        <primitive class="ocf" id="failover-ip" provider="heartbeat" 
> type="IPaddr">
>          <instance_attributes id="failover-ip-instance_attributes">
>            <nvpair id="failover-ip-instance_attributes-ip" name="ip" 
> value="172.28.185.55"/>
>          </instance_attributes>
>          <operations>
>            <op id="failover-ip-monitor-10s" interval="10s" 
> name="monitor"/>
>          </operations>
>        </primitive>
>        <primitive class="lsb" id="failover-apache" type="httpd">
>          <operations>
>            <op id="failover-apache-monitor-15s" interval="15s" 
> name="monitor"/>
>          </operations>
>        </primitive>
>      </group>
>    </resources>
>    <constraints/>
>    <rsc_defaults/>
>    <op_defaults/>
>  </configuration>
> </cib>
>
> Jakob Curdes wrote:
>> mike wrote:
>>> Thank you Jakob,
>>> I did as you suggested (good idea btw) and what I saw was that 
>>> LinuxHA continually tried to restart it on the primary node. Is 
>>> there a setting that I can say "After X number of times trying to 
>>> restart, fail over" ?
>>>   
>> I think you need to read further down that page and use the settings in
>>
>> "Failover IP Service in a Group"
>>
>> What you probably actually want is to have the IP and the service 
>> always running on the same node
>> (plus, as a last step, on the node with the best connectivity).
>>
>> HTH,
>> Jakob Curdes
>>
>>

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems