Hi,

On Tue, Feb 26, 2008 at 10:59:26AM -0500, Doug Lochart wrote:
> On Mon, Feb 25, 2008 at 6:10 PM, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> >
> >  On Mon, Feb 25, 2008 at 03:36:31PM -0500, Doug Lochart wrote:
> >  > heartbeat 2.1.3_3 and drbd 8.0.8 (dopd and STONITH ipmi in use)
> >  >
> >  > I successfully was able to test my 2 node cluster simply by powering
> >  > the nodes off and on in varying order and the HA resources
> >  > successfully moved in each case (hurray).
> >  > Now I went back to my original test of previous frustration.  I yanked
> >  > all the ethernet cables from the primary machine (both LAN and
> >  > crossover)
> >  >
> >  > On the Secondary (unaffected) machine I see that STONITH tried to
> >  > shoot the other node for about 20 minutes before giving up.  Right now
> >  > my secondary node says Secondary/Unknown and the Primary Node says
> >  > Primary/Unknown.
> >  >
> >  > First off, is there a configurable parameter for how long STONITH
> >  > keeps trying?
> >
> >  No. It should be trying forever. That's what is in the cluster
> >  configuration, i.e. protect resources using the stonith, and the
> >  cluster shouldn't move until there was a successful reset
> >  operation.
> >
> >
> >  > When I plugged the network back into the Primary, it immediately
> >  > rebooted (not sure why)
> >
> >  Either stonith or fastfail. The logs would say.
> >
> >
> >  > and when it came back up I was in split brain again.
> >  >
> >  > So whenever you have 2 nodes in a cluster and all redundant
> >  > communication paths have been severed, then by default you will have a
> >  > Split Brain that needs to be manually corrected.  Am I understanding
> >  > this right?
> >
> >  No, it should recover automatically. Please take a look at the
> >  logs or post them.
> 
> Dejan,  I plan to rerun the tests this morning.  Do I need to have any
> specific settings in drbd.conf in order for it to recover
> automatically?  If I did not say before I am using version 1 config
> files under heartbeat 2.1.3_3.

Hmm, I thought you were referring to heartbeat. If you have a
drbd split brain, then I'm not sure if I can help. If it's
something that happened just by switching from v1 to v2 then it
must be wrong usage. I suppose that you read the drbd howto?

Thanks,

Dejan
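
For reference on the drbd.conf question: DRBD 8.0 has per-resource
policies for automatic split-brain recovery, set in the net section.
A minimal sketch, assuming a resource named r0 (a hypothetical name)
and policies picked only for illustration. The automatic policies
discard one node's changes, so check them against the drbd.conf man
page for your version before relying on them:

```
resource r0 {                 # hypothetical resource name
  net {
    # No node is Primary when the split brain is detected:
    # resync automatically only if one side made no changes.
    after-sb-0pri discard-zero-changes;
    # Exactly one node is Primary: discard the Secondary's changes.
    after-sb-1pri discard-secondary;
    # Both nodes are Primary: no automatic recovery, stay disconnected.
    after-sb-2pri disconnect;
  }
}
```

With after-sb-2pri left at disconnect, a split brain in which both
nodes were Primary still has to be resolved by hand.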

> thanks
> 
> Doug
> 
> 
> >
> >  Thanks,
> >
> >  Dejan
> >
> >
> >  > I am not complaining I am just trying to determine what I am to expect
> >  > so I can write up procedures and what not.  The failover worked great
> >  > with other tests.
> >  >
> >  > regards,
> >  >
> >  > Doug
> >  >
> >  >
> >  >
> >  > --
> >  > What profits a man if he gains the whole world yet loses his soul?
> >  > _______________________________________________
> >  > Linux-HA mailing list
> >  > [email protected]
> >  > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> >  > See also: http://linux-ha.org/ReportingProblems
> >
> 
> 
> 
