On 2006-01-21T11:05:58, Peter Kruse <[EMAIL PROTECTED]> wrote:

> except that ipfail relies on an external address, but
> I don't understand why the failure of an external address
> should cause a failover.  Even if you use multiple addresses
> to ping.

The idea is that you don't rely only on the interface having a
linkbeat (to your switch/hub), but also check that it can reach the
router; the latter check is a superset of the former.

But yes, monitoring the NIC itself is a desirable property. (FailSafe
monitored the NIC for error counts etc. too.) Those can be combined...
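
Combining the two checks could look roughly like the sketch below. It
is only an illustration, not what ipfail or FailSafe actually do: the
function names, the use of the Linux sysfs "carrier" file, and the
ping options (iputils style) are all assumptions.

```python
import subprocess
from pathlib import Path


def link_is_up(iface, sysfs_root="/sys/class/net"):
    """Return True if the kernel reports carrier on iface (assumes Linux sysfs)."""
    try:
        return (Path(sysfs_root) / iface / "carrier").read_text().strip() == "1"
    except OSError:
        # Unknown interface, or interface administratively down: no link.
        return False


def router_reachable(address, timeout_s=1):
    """Return True if one ICMP echo is answered (assumes an iputils-style ping)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def connectivity_state(iface, routers,
                       link_check=link_is_up, ping_check=router_reachable):
    """Classify connectivity as 'no-link', 'no-router' or 'ok'."""
    if not link_check(iface):
        return "no-link"       # local NIC/cable/switch port problem
    if not any(ping_check(r) for r in routers):
        return "no-router"     # link is fine, but no ping target answers
    return "ok"
```

Keeping the two results separate lets the caller tell a local NIC
problem apart from an unreachable router, instead of treating both as
the same failure.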

> Sure, I would love that.  But it's written in bash,
> and uses our own scripting library and ... you know ...
> "it works for us"... meaning we probably won't have
> the resources to support it. 

Well, the good thing is that as soon as you clean it up and submit it,
you'd be relieved of maintaining it yourself. It'd become a standard
feature of hb2, tested by more users (which would help with your own
future deployments, too), and fixed by us. And the mechanism would be
desirable to have for many things.

Sorry, I can't help this, I have to try and cheer users on to become
valuable contributors ;-)

Keep in mind that the CIB is meant for a _low_ number of updates; it's
not meant to be efficient and fast at replicating many updates per second.
Reads are a bit better, because they'll be served locally, but still,
you'll max out at fairly low throughput.
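
One way to keep the update rate low is to remember the last value
pushed and only write again when it changes. A minimal sketch,
assuming a hypothetical write(key, value) callable that wraps whatever
actually performs the update; none of these names come from heartbeat:

```python
_UNSET = object()  # sentinel: nothing has been pushed for this key yet


class ChangeOnlyWriter:
    """Forward a value to a slow store only when it differs from the last one sent."""

    def __init__(self, write):
        self._write = write   # hypothetical callable doing the real update
        self._last = {}       # last value pushed, per key

    def update(self, key, value):
        """Push value for key if it changed; return True if a write happened."""
        if self._last.get(key, _UNSET) == value:
            return False      # unchanged: skip the expensive replicated write
        self._write(key, value)
        self._last[key] = value
        return True
```

A monitor that runs every few seconds would then generate a write only
on a state transition, not on every poll.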

> If you want to have a look at it however, I can send it to you, there
> are some ideas we took from the Failsafe agents, you will recognize.

Sure, I'd be glad to take a look at the code. I think those are probably
the most complex resource agents anyone has written so far.

> That's what I thought, too.  If you set a resource group
> to unmanaged, the monitor actions are still called
> and failures are still recognized.  But not sure
> if it causes a failover.

It won't. But one could make the point that we also should not monitor
unmanaged resources, but cancel running monitor ops.

> Hm, ... yes, that's an idea, don't know why I thought it
> has to be stored in the cluster database.  That I probably
> will change, thanks.

Great. 

BTW, you're one of the few people running hb 2.x in production so far. I
think you may be interested in BrainShare 2006 in Salt Lake City in
March. Andrew and I are doing at least 4 tutorials on it there and will
be available for meetings during that time too...


Sincerely,
    Lars Marowsky-Brée

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business     -- Charles Darwin
"Ignorance more frequently begets confidence than does knowledge"

_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
