Re: [Linux-HA] R2 Two-node apache cluster with STONITH

Bjorn Oglefjorn Fri, 20 Apr 2007 08:26:18 -0700

If it seems counterintuitive, think of it like this:
   * test-1_DRAC is the DRAC installed in the chassis of test-1.domain,
     which has an address of test-1.drac.domain


Then look here:
      <rsc_location id="test-1_DRAC_location" rsc="test-1_DRAC">
        <rule id="no_self_run_test-1_DRAC" score="-INFINITY">
          <expression attribute="#uname" id="no_self_run_test-1_DRAC_expr_1"
                      operation="eq" value="test-1.domain"/>
        </rule>
      </rsc_location>
      <rsc_location id="test-2_DRAC_location" rsc="test-2_DRAC">
        <rule id="no_self_run_test-2_DRAC" score="-INFINITY">
          <expression attribute="#uname" id="no_self_run_test-2_DRAC_expr_1"
                      operation="eq" value="test-2.domain"/>
        </rule>
      </rsc_location>

In other words, test-1_DRAC should never run on test-1.domain and
test-2_DRAC should never run on test-2.domain.
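
The two constraints above can be read back mechanically. Here is a minimal sketch in Python, assuming the constraints are pasted into a string and wrapped in a root element so the fragment parses. The function name banned_nodes is made up for illustration and is not part of heartbeat.

```python
import xml.etree.ElementTree as ET

# Constraints copied from the CIB excerpt above. The <constraints> wrapper
# is only there so the fragment has a single root element.
CONSTRAINTS = """
<constraints>
  <rsc_location id="test-1_DRAC_location" rsc="test-1_DRAC">
    <rule id="no_self_run_test-1_DRAC" score="-INFINITY">
      <expression attribute="#uname" id="no_self_run_test-1_DRAC_expr_1"
                  operation="eq" value="test-1.domain"/>
    </rule>
  </rsc_location>
  <rsc_location id="test-2_DRAC_location" rsc="test-2_DRAC">
    <rule id="no_self_run_test-2_DRAC" score="-INFINITY">
      <expression attribute="#uname" id="no_self_run_test-2_DRAC_expr_1"
                  operation="eq" value="test-2.domain"/>
    </rule>
  </rsc_location>
</constraints>
"""


def banned_nodes(xml_text):
    """Map each resource id to the node names it may never run on."""
    banned = {}
    root = ET.fromstring(xml_text)
    for loc in root.iter("rsc_location"):
        for rule in loc.iter("rule"):
            # Only rules with score -INFINITY forbid a node outright.
            if rule.get("score") != "-INFINITY":
                continue
            for expr in rule.iter("expression"):
                if expr.get("attribute") == "#uname" and expr.get("operation") == "eq":
                    banned.setdefault(loc.get("rsc"), []).append(expr.get("value"))
    return banned


if __name__ == "__main__":
    for rsc, nodes in banned_nodes(CONSTRAINTS).items():
        print(rsc, "never runs on", ", ".join(nodes))
```

This only inspects the rules; it says nothing about where the cluster actually places the resources at runtime.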

Once again:

[EMAIL PROTECTED] ~]# stonith -t external/drac4 DRAC_ADDR=test-2.drac.domain DRAC_LOGIN=root DRAC_PASSWD=******** -lS
stonith: external/drac4 device OK.
test-2.drac.domain

[EMAIL PROTECTED] ~]# stonith -t external/drac4 DRAC_ADDR=test-1.drac.domain DRAC_LOGIN=root DRAC_PASSWD=******** -lS
stonith: external/drac4 device OK.
test-1.drac.domain
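
Alan's remark, quoted further down, is that the stonith daemon may not think the listed names match the host names it is asked to reset. A rough way to compare the two by hand is sketched below in Python. It assumes an exact string comparison, which may not be how stonithd matches names, and the function name unmatched_nodes is hypothetical.

```python
def unmatched_nodes(node_names, listed_hosts):
    """Return the node names that do not appear verbatim in a host list.

    Assumption: names are compared as exact strings after stripping
    surrounding whitespace. The real matching rule in stonithd may differ.
    """
    listed = {host.strip() for host in listed_hosts}
    return [name for name in node_names if name not in listed]


# Values taken from this thread: the node name used in the location rules,
# and the host printed by the -lS run against test-2.drac.domain.
print(unmatched_nodes(["test-2.domain"], ["test-2.drac.domain"]))
```

A non-empty result only shows that the strings differ; whether that difference is what keeps the reset from being requested is not established here.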

--BO

On 4/19/07, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote:


On Tue, Apr 17, 2007 at 03:55:07PM -0400, Bjorn Oglefjorn wrote:
> Alan, what is the list operation?  The node names are always FQDNs and
> always match.

Do they?

From your CIB:

       <primitive id="test-1_DRAC" class="stonith" type="external/drac4"
                  provider="heartbeat">
         <operations>
           <op id="test-1_DRAC_reset" name="reset" timeout="3min"
               prereq="nothing"/>
         </operations>
         <instance_attributes id="test-1_DRAC_inst_attr">
           <attributes>
             <nvpair id="test-1_DRAC_attr_0" name="DRAC_ADDR"
                     value="test-1.drac.domain"/>
             <nvpair id="test-1_DRAC_attr_1" name="DRAC_LOGIN"
                     value="root"/>
             <nvpair id="test-1_DRAC_attr_2" name="DRAC_PASSWD"
                     value="********"/>
           </attributes>
         </instance_attributes>
       </primitive>

Shouldn't attr_0 specify test-2.drac instead of test-1?

BTW, pulling network cables from a host results in a split brain, and
when that happens both nodes will try to kill each other. That's to be
avoided at all costs, which is why you have redundant connections
within the cluster. A better way to test for a dead node is:

# killall -9 heartbeat

(that's for Linux; use kill -9 <pid> to kill the master process on
other platforms)

> --BO
>
> On 4/17/07, Alan Robertson <[EMAIL PROTECTED]> wrote:
> >
> >Andrew Beekhof wrote:
> >> On 4/17/07, Bjorn Oglefjorn <[EMAIL PROTECTED]> wrote:
> >>> I know that my plugin is getting called because of the logging that
> >>> the plugin does.
> >>
> >> do we get to see that logging at all?  preferably in the context of
> >> the other log messages
> >>
> >>> That said, I also know my plugin is not receiving any 'reset'
> >>> operation request from heartbeat.  If you see below, request actions
> >>> are logged.  The only actions logged when node failure is simulated are:
> >>> getconfignames, status, and gethosts, in that order.  We should also
> >>> see getinfo-devid and reset operations logged, but they are never
> >>> present.
> >
> >I would assume that you're getting called with the list operation, but
> >not afterwards.
> >
> >If that's the case, then that means that for some reason not obvious to
> >me the stonith daemon doesn't think the names you gave it match
> >the host names it is being asked to reset.
> >
> >--
> >    Alan Robertson <[EMAIL PROTECTED]>
> >
> >"Openness is the foundation and preservative of friendship...  Let me
> >claim from you at all times your undisguised opinions." - William
> >Wilberforce
> >_______________________________________________
> >Linux-HA mailing list
> >[email protected]
> >http://lists.linux-ha.org/mailman/listinfo/linux-ha
> >See also: http://linux-ha.org/ReportingProblems
> >

--
Dejan