On Fri, Nov 23, 2012 at 3:08 AM, Rafał Radecki <[email protected]> wrote:
> Hi all.
>
> I am currently building a Pacemaker/Corosync cluster which serves a Tomcat
> resource in master/slave mode. This Tomcat serves a Solr Java application.
> My configuration is:
>
> node storage1
> node storage2
>
> primitive TSVIP ocf:heartbeat:IPaddr2 \
>         params ip="192.168.100.204" cidr_netmask="32" nic="eth0" \
>         op monitor interval="30s"
>
> primitive TomcatSolr ocf:polskapresse:tomcat6 \
>         op start interval="0" timeout="60" on-fail="stop" \
>         op stop interval="0" timeout="60" on-fail="stop" \
>         op monitor interval="31" role="Slave" timeout="60" on-fail="stop" \
>         op monitor interval="30" role="Master" timeout="60" on-fail="stop"
>
> ms TomcatSolrClone TomcatSolr \
>         meta master-max="1" master-node-max="1" clone-max="2" \
>         clone-node-max="1" notify="false" globally-unique="true" ordered="false" \
>         target-role="Master"
>
> colocation TomcatSolrClone_with_TSVIP inf: TomcatSolrClone:Master \
>         TSVIP:Started
> order TomcatSolrClone_after_TSVIP inf: TSVIP:start TomcatSolrClone:promote
>
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="4" \
>         stonith-enabled="false" \
>         no-quorum-policy="ignore" \
>         symmetric-cluster="true" \
>         default-resource-stickiness="1" \
>         last-lrm-refresh="1353594420"
> rsc_defaults $id="rsc-options" \
>         resource-stickiness="10" \
>         migration-threshold="1000000"
>
> So logically I have:
> - one node with TSVIP and TomcatSolrClone Master;
> - one node with TomcatSolrClone Slave.
> I have set up replication between Solr on the TomcatSolrClone Master and Slave
> and written an OCF agent (attached).
> A few moments ago, when I killed the Slave resource with 'pkill java', the
> resource was restarted on the same node, even though the monitor
> action returned $OCF_ERROR_GENERIC and I have on-fail="stop" set for
> TomcatSolr (I have also tried "block", with the same effect).
>
> Then I have added a migration threshold:
>
> ms TomcatSolrClone TomcatSolr \
>         meta master-max="1" master-node-max="1" clone-max="2" \
>         clone-node-max="1" notify="false" globally-unique="true" ordered="false" \
>         target-role="Started" \
>         params migration-threshold="1"
>
> and now, when I kill java on the Slave, it is not started again (the Master
> is ok). But when I then kill java on the Master (so no resource is running
> on either node), everything gets restarted by the cluster, and both Master
> and Slave are running afterwards.
> How can I prevent this restart when both Slave and Master fail?

Could you file a bug (https://bugs.clusterlabs.org) for this and
include a crm_report for your testcase?
It's likely that you've hit a bug.
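
In case it helps, here is a minimal sketch of how the report could be
collected. The time window and the output name are placeholders, not values
taken from your test, so adjust them to cover the moment you ran 'pkill java'.
It only prints the crm_report command line (-f and -t bound the time window,
the last argument is the destination name) so it can be checked before being
run on one of the cluster nodes.

```shell
#!/bin/sh
# Placeholder time window around the 'pkill java' test (assumed values).
FROM="2012-11-22 14:00:00"
TO="2012-11-22 15:00:00"
# Placeholder destination name for the collected report (assumed value).
DEST="/tmp/tomcatsolr-restart"

# Print the crm_report invocation instead of running it, so the window
# and destination can be reviewed first.
report_cmd() {
    printf 'crm_report -f "%s" -t "%s" %s\n' "$FROM" "$TO" "$DEST"
}

report_cmd
```

Running the printed command on a cluster node should produce an archive
named after the destination, which can then be attached to the bug.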

>
> Best regards,
> Rafal.
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems