Re: [Linux-HA] OCFS2 DRBD Heartbeat and high availability did not work out so well

Andrew Beekhof Wed, 08 Aug 2007 04:29:10 -0700

On 8/8/07, Eddie C wrote:
> A shared FireWire disk and a shared storage array are both options, but then
> you still have a single point of failure. In theory a good disk array is very
> resilient, but it is still an SPOF.
>
> The great thing about DRBD active/active and OCFS2 is that you eliminate the
> SPOF.
>
> Also, the other options (SCSI locking, FireWire) are all tied to specific
> hardware/SANs. This solution would be very generic.
>
> Point taken: most of my problems are DRBD- and OCFS2-specific.
>
> I understand the point that leaving a resource unmanaged is not a good
> thing. I know a fair amount about Heartbeat colocation, order, and place
> constraints. Here is a more technical description of what I was trying to
> do Heartbeat-wise.
>
> Resource 1 VIP IP (used IPaddr2)
> Resource 2 IP route (created an RA for this)
> Resource 3 Web Scraping utility (used init script)
> Resource 4 Process to work with web scraping and usenet data
> Resource 5 Usenet Scraping utility
> Resource 6 OCFS2 (cloned)
> Resource 7 DRBD (cloned)
>
> This was my first design:
> Order1 - Start 7 before 6
> Group1 - Resources 1 and 2, plus processes 3, 4, and 5
>
> This worked well, but since everything was grouped, a failed resource in
> Group1 caused everything to fail and possibly restart/move. Anyone connected
> lost their connection as the VIP left and came back a few seconds later.
> This scenario was deemed unacceptable.
>
> So then I tried writing a bunch of colocation rules.
> Colocate 4 and 5
> Colocate 3 and 4
> Colocate Group1 and 4
> That had the same effect as grouping, though: when an item failed, it caused
> its colocation to fail, which took down all the other colocations.
>
> What I really needed was a way to say: "I need this resource to run wherever
> the VIP is running, and the VIP should only run on a node where the shared
> disk is running."
> PLACE seems only to be able to tell a resource to run on a node.
>
> So I tried that implementation
>
> Resource 1 VIP IP --PLACE node1 100
> Resource 2 IP route --PLACE node1 100
> Resource 3 Web Scraping utility --PLACE node1 100
> Resource 4 Process to work with web scraping and usenet data --PLACE node1 100
> Resource 5 Usenet Scraping utility --PLACE node1 100
>
> This worked well because now everything is loosely coupled and can still
> fail over, but failing over the VIP and route does not fail over resources
> 3, 4, and 5.
>
> So neither place nor colocation can really express "I need this resource to
> run only where the other resource is, but if this resource cannot start,
> don't fail the parent; if the parent does fail, I need the resource to
> notice that and move with it." A one-way dependency.


finally some clue as to what version you're running!

please update, we've been able to do one-way colocation since 2.0.8
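
for anyone unsure what "one-way" means here, a toy model may help. this is
plain Python, not Heartbeat code or CIB syntax, and every name in it
(place_dependents, can_start, the resource names) is made up for
illustration. it only sketches the behaviour described above: dependents
follow the VIP, but a dependent that cannot start never drags the VIP down.

```python
from typing import Callable, Dict, List, Optional


def place_dependents(
    vip_node: Optional[str],
    dependents: List[str],
    can_start: Callable[[str, str], bool],
) -> Dict[str, Optional[str]]:
    """Toy one-way dependency: dependents may only run where the VIP runs.

    The VIP's placement is an input and is never changed by a dependent,
    so a dependent that fails to start is simply left stopped (None).
    """
    placement: Dict[str, Optional[str]] = {"vip": vip_node}
    for rsc in dependents:
        if vip_node is not None and can_start(rsc, vip_node):
            placement[rsc] = vip_node  # follow the VIP
        else:
            placement[rsc] = None      # stay stopped; VIP is unaffected
    return placement


if __name__ == "__main__":
    deps = ["web_scraper", "processor", "usenet_scraper"]
    # web_scraper is broken: the VIP and the other resources stay on node1.
    print(place_dependents("node1", deps, lambda rsc, node: rsc != "web_scraper"))
    # the VIP fails over to node2: every dependent moves with it.
    print(place_dependents("node2", deps, lambda rsc, node: True))
```

the dependency only points one way: the placement of "vip" is read by the
dependents, but nothing a dependent does is ever fed back into it.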

people really do make life hard on themselves when they don't provide
the relevant information to the people they want help from