Hi,
I am trying to use Pacemaker with corosync and am facing the following issues.
I would like to know whether these are due to misconfiguration or whether they
are known issues.
I have two nodes in the cluster: VIP-1 and VIP-2.
The corosync version is:
Corosync Cluster Engine, version '1.2.7' SVN revision '3008'
==================================================================
The crm_mon output is:
============
Last updated: Thu Feb 24 17:44:33 2011
Stack: openais
Current DC: VIP-1 - partition with quorum
Version: 1.0.10-da7075976b5ff0bee71074385f8fd02f296ec8a3
2 Nodes configured, 2 expected votes
3 Resources configured.
============
Online: [ VIP-1 VIP-2 ]
ClusterIP (ocf::heartbeat:IPaddr2): Started VIP-1
WebSite (ocf::heartbeat:apache): Started VIP-1
My_Tomcat (ocf::heartbeat:tomcat): Started VIP-1
==================================================================
My configuration is:
[root@VIP-1 local]# crm configure show
node VIP-1
node VIP-2
primitive ClusterIP ocf:heartbeat:IPaddr2 \
params ip="172.16.201.23" cidr_netmask="32" \
op monitor interval="5s"
primitive My_Tomcat ocf:heartbeat:tomcat \
params catalina_home="/root/Softwares/apache-tomcat-6.0.26" \
java_home="/root/Softwares/Java/linux/jdk1.6.0_21" \
op monitor interval="5s"
primitive WebSite ocf:heartbeat:apache \
params configfile="/etc/httpd/conf/httpd.conf" \
op monitor interval="5s"
property $id="cib-bootstrap-options" \
dc-version="1.0.10-da7075976b5ff0bee71074385f8fd02f296ec8a3" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false" \
no-quorum-policy="ignore" \
last-lrm-refresh="1298547656"
rsc_defaults $id="rsc-options" \
resource-stickiness="2"
=======================================================================
Issue 1)
I observed that if any service is manually shut down on VIP-1, Pacemaker
restarts it on the same node.
In the logs, I can see this:
=================================================================================================================
Feb 24 18:14:32 VIP-1 pengine: [28098]: info: get_failcount: My_Tomcat has
failed 35 times on VIP-1
Feb 24 18:14:32 VIP-1 pengine: [28098]: notice: common_apply_stickiness:
My_Tomcat can fail 999965 more times on VIP-1 before being forced off
==================================================================================================================
I have not configured the service to be restarted an INFINITY number of times
on VIP-1, so is this the default behavior?
Is there any configuration to tell Pacemaker to restart the service only twice
on VIP-1 and, if it still does not start, to start it on VIP-2 instead?
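From what I have read, the migration-threshold resource meta attribute might be
the relevant setting, but I am not sure. An untested sketch, assuming
migration-threshold is indeed the right knob here:

# Cluster-wide default: force a resource off a node after 2 failures there
crm configure rsc_defaults migration-threshold="2"
# Or per resource, e.g. only for My_Tomcat:
crm resource meta My_Tomcat set migration-threshold 2

Would either of these give the behavior I described?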
Issue 2)
I have changed the exit codes in the Apache and Tomcat RA scripts so that they
return error code 2 if the monitor action fails. The change is roughly like the
sketch below.
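This is paraphrased, not the exact RA code; $STATUSURL, $OCF_SUCCESS, and
$OCF_ERR_GENERIC are names the stock apache RA and ocf-shellfuncs already
define:

apache_monitor() {
    # Simplified: probe the status URL; report failure with exit code 2.
    if ! wget -q -O /dev/null "$STATUSURL"; then
        return 2    # was: return $OCF_ERR_GENERIC
    fi
    return $OCF_SUCCESS
}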
Now, if I manually stop the service, it is not restarted on VIP-1 but is
started on VIP-2.
The fail count of that service on VIP-1 shows as 1.
Now, if I manually take the service down on VIP-2, it does not get started on
VIP-1 until I clean up the resource.
So, is this known behavior, or have I missed some configuration?
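At the moment I recover by hand with:

crm resource cleanup My_Tomcat

I also came across the failure-timeout meta attribute and wonder whether it
would make the fail count expire on its own; another untested sketch:

# Guess: expire recorded failures after 60 seconds
crm configure rsc_defaults failure-timeout="60s"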
Let me know if you need more information.
Thanks,
Amit