A shared FireWire disk and a shared storage array are both options, but then you have a single point of failure. A good disk array is theoretically very resilient, but a resilient SPOF is still an SPOF.
The great thing about DRBD active/active with OCFS2 is that you eliminate the SPOF. The other options (SCSI locking, FireWire) are all tied to specific hardware or SANs; this solution would be very generic. Point taken that most of my problems are DRBD- and OCFS2-specific. I understand the point that leaving a resource unmanaged is not a good thing, and I know a fair amount about Heartbeat colocations, orders, and places. Here is a more technical description of what I was trying to do, Heartbeat-wise:

Resource 1 - VIP (used IPaddr2)
Resource 2 - IP route (wrote an RA for this)
Resource 3 - Web scraping utility (used an init script)
Resource 4 - Process that works with the web scraping and Usenet data
Resource 5 - Usenet scraping utility
Resource 6 - OCFS2 (cloned)
Resource 7 - DRBD (cloned)

My first design was:

Order1 - start 7 before 6
Group1 - Resources 1 and 2, processes 3, 4, 5

This worked well, but since everything was grouped, a failed resource in Group1 caused everything to fail and possibly restart or move. Anyone connected lost their connection as the VIP left and came back a few seconds later. That scenario was deemed unacceptable.

So then I tried writing a set of colocation rules instead:

Colocate 4 and 5
Colocate 3 and 4
Colocate Group1 and 4

That had the same effect as grouping, though: when an item failed, its colocation failed, which took down all the other colocations. What I really needed was a way to say: this resource must run wherever the VIP is running, and the VIP should only run on a node where the shared disk is running. PLACE only seems to be able to pin a resource to a node.
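For the archives, here is roughly how I would express those rules as raw CIB XML. This is a sketch based on my reading of the Heartbeat 2.x constraint schema; the resource ids are made up, and the attribute names are from memory, so double-check them against the DTD. The interesting part is the score: INFINITY makes a colocation mandatory, while a finite score is only advisory, which may be closer to the one-way dependency I was after.

```xml
<!-- Sketch only: ids are hypothetical, attribute names approximate -->
<constraints>
  <!-- Start DRBD (resource 7) before OCFS2 (resource 6) -->
  <rsc_order id="ocfs2_after_drbd" from="resource6_ocfs2"
             type="after" to="resource7_drbd"/>

  <!-- The VIP may only run on a node where the cloned OCFS2 mount is up -->
  <rsc_colocation id="vip_with_disk" from="resource1_vip"
                  to="resource6_ocfs2" score="INFINITY"/>

  <!-- The scraper prefers, but is not forced, to follow the VIP:
       a finite score is advisory, so a failed scraper should not
       drag the VIP down with it -->
  <rsc_colocation id="scraper_near_vip" from="resource3_scraper"
                  to="resource1_vip" score="100"/>
</constraints>
```

If the advisory score behaves as documented, this would let resources 3, 4, and 5 follow the VIP without a failure in them propagating back up to it.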
So I tried that implementation:

Resource 1 - VIP --PLACE node1 100
Resource 2 - IP route --PLACE node1 100
Resource 3 - Web scraping utility --PLACE node1 100
Resource 4 - Web/Usenet data process --PLACE node1 100
Resource 5 - Usenet scraping utility --PLACE node1 100

This worked well because everything is now loosely coupled and can still fail over, but failing over the VIP and route does not fail over resources 3, 4, and 5. So neither place nor colocation can really express what I need: this resource must run only where another resource is running, but if this resource cannot start, don't fail the parent; if the parent does fail, re-evaluate and move with it. A one-way dependency.

On 8/8/07, Robert Wipfel <[EMAIL PROTECTED]> wrote:
>
> >>> On Wed, Aug 8, 2007 at 1:09 AM, in message
> <[EMAIL PROTECTED]>, "Andrew Beekhof"
> <[EMAIL PROTECTED]> wrote:
> > On 8/8/07, Eddie C <[EMAIL PROTECTED]> wrote:
> > [...]
> >> My post was more of a rant than anything else. I was looking for 50 people
> >> or so to read my post and say: "You must be doing something wrong. I run
> >> DRBD with OCFS2 multi-node active/active and MySQL, and it's super fast
> >> and never crashes on two 100 MHz laptops."
>
> I know it's a bit off topic, but for this kind of setup (super low cost
> nodes), a shared FireWire disk actually seems to work quite well. You'll
> have to load the firewire module with exclusive_login = 0, and then
> both servers will happily share that $150 FW disk ;-) and it actually works
> well enough to run Xen VMs off the shared disk, using OCFS2 in userspace
> heartbeat mode, with Heartbeat2 managing the VMs as resources. There's
> some setup info here:
> http://www.oracle.com/technology/pub/articles/hunter_rac10gr2.html
>
> > Given how little you seem to be using heartbeat, perhaps the drbd or
> > ocfs2 lists might be a more appropriate forum.
>
> >> The concept is great: an active/active disk partition and heartbeat with
> >> no fancy SANs. It works for stretches of 3 or 4 weeks, but then I run
> >> into a weird locked directory that I can't delete, or a file owned by
> >> '?'. Or the partition unmounts and the system will not reboot.
>
> I haven't checked prices recently, but multi-initiator serial attached
> shared SCSI is also a recent option, with a number of vendors providing
> low cost RAID enclosures that can be shared across more than the
> shared FireWire disk limit of two nodes. E.g. Tom's Hardware did a good
> review: http://www.tomshardware.com/2006/04/07/going_the_sas_storage_way/
>
> Hth,
> Robert
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
