Re: Replication using totem protocol

lichtner Mon, 16 Jan 2006 15:50:22 -0800


On Mon, 16 Jan 2006, Jules Gosnell wrote:


> >2. When an HTTP request arrives, if the cluster which received it does
> >not have R copies then it blocks (it waits until there are). This should
> >work in data centers because partitions are likely to be very short-lived
> >(aka virtual partitions, which are due to congestion, not to any hardware
> >issue).
> >
> >
> Interesting. I was intending to actively repopulate the cluster
> fragment, as soon as the split was detected. I figure that
> - the longer that sessions spend without their full complement of
> backups, the more likely that a further failure may result in data loss.
> - the split is an exceptional circumstance at which you would expect to
> pay an exceptional cost (regenerating missing primaries from backups and
> vice-versa)
>
> by waiting for a request to arrive for a session before ensuring it has
> its correct complement of backups, you extend the time during which it
> is 'at risk'. By doing this 'lazily', you will also have to perform an
> additional check on every request arrival, which you would not have to
> do if you had regenerated missing state at the point that you noticed
> the split.

Actually, I didn't mean to say that you should do it lazily. You should
most definitely do it aggressively, but I would not try to do _all_ the
state transfer ASAP, because that can kill availability.

If I had to do the state transfer using totem I would use priority queues,
so that you know that while the system is doing state transfer it is still
operating at, say, 80% efficiency.
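
To make the idea concrete, here is a minimal sketch of one way to bound the
cost of state transfer. It is not Totem's or WADI's API; the class name
TransferScheduler, its methods, and the 4:1 slot split are all assumptions
made up for illustration. It uses two FIFO queues with a weighted pick
(rather than a single heap-based priority queue), so that request traffic
keeps roughly 80% of the send slots while a transfer is pending, and
transfer can use idle slots when there are no requests.

```python
from collections import deque


class TransferScheduler:
    """Hypothetical sender-side scheduler (not part of Totem or WADI).

    Out of every request_slots + transfer_slots send slots, request
    traffic is preferred for request_slots of them. A slot whose
    preferred queue is empty is given to the other queue, so no
    capacity is wasted.
    """

    def __init__(self, request_slots=4, transfer_slots=1):
        self.requests = deque()    # normal replication traffic
        self.transfers = deque()   # bulk state-transfer chunks
        self.request_slots = request_slots
        self.transfer_slots = transfer_slots
        self._pos = 0              # position inside the current cycle

    def submit_request(self, msg):
        self.requests.append(msg)

    def submit_transfer(self, chunk):
        self.transfers.append(chunk)

    def next_message(self):
        """Return the next message to send, or None if both queues are empty."""
        cycle = self.request_slots + self.transfer_slots
        prefer_requests = self._pos < self.request_slots
        self._pos = (self._pos + 1) % cycle
        if prefer_requests:
            first, second = self.requests, self.transfers
        else:
            first, second = self.transfers, self.requests
        if first:
            return first.popleft()
        if second:
            return second.popleft()
        return None
```

With the default 4:1 split, a backlog of requests and transfer chunks is
drained as four requests, then one chunk, and so on. The ratio is the knob
that trades recovery time against availability.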

It was not about lazy vs. greedy.

I believe that if you put some spare capacity in your cluster you will get
good availability. For example, if your minimum R is 2 and the normal
operating value is 4, when a node fails you will not be frantically doing
state transfer.
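
As a rough illustration of that headroom, the sketch below classifies how
urgently a session needs new copies. The function name and the labels
"urgent", "background" and "none" are assumptions for the example; only the
values 2 (minimum R) and 4 (normal R) come from the paragraph above.

```python
def transfer_urgency(live_copies, min_copies=2, target_copies=4):
    """Classify how urgently a session needs re-replication.

    Hypothetical helper: below the minimum R the session is at risk and
    must be copied right away; between the minimum and the normal
    operating value it can be topped up in the background.
    """
    if live_copies < min_copies:
        return "urgent"
    if live_copies < target_copies:
        return "background"
    return "none"
```

Losing one node out of four leaves three copies, which this classifies as
"background": still above the minimum, so there is no need for a burst of
state transfer.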

> >3. If at any time an HTTP request reaches a server which does not itself
> >have a replica of the session, it sends a client redirect to a node
> >which does.
> >
> >
> WADI can relocate request to session, as you suggest (via redirect or
> proxy), or session to request, by migration. Relocation of request
> should scale better since requests are generally smaller and, in the web
> tier, may run concurrently through the same session, whereas sessions
> are generally larger and may only be migrated serially (since only one
> copy at a time may be 'active').

I would also just send a redirect. I don't think it's worth relocating a
session.
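
The routing decision could be sketched like this. The replica map, the
node names, and the function route are assumptions for illustration; a real
server would turn a "redirect" result into an HTTP redirect response
pointing at the chosen node.

```python
def route(session_id, local_node, replicas):
    """Decide whether to serve a request locally or redirect the client.

    replicas maps a session id to the list of nodes holding a copy
    (a hypothetical structure, not a WADI data type).
    """
    holders = replicas.get(session_id, [])
    if local_node in holders:
        return ("serve", local_node)
    if holders:
        # Any holder will do for this sketch; pick the first one listed.
        return ("redirect", holders[0])
    return ("unknown", None)
```

For example, with replicas = {"s1": ["node-b", "node-c"]}, a request for
"s1" arriving at "node-a" is redirected to "node-b", while the same request
arriving at "node-c" is served in place.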

> > and possibly migration of some session for
> >proper load balancing.
> >
> >
> forcing the balancing of state around the cluster is something that I
> have considered with WADI, but not yet tried to implement. The type of
> load-balancer that is being used has a big impact here. If you cannot
> communicate a change of session location satisfactorily to the HTTP load
> balancer, then you have to just go with wherever it decides a session is
> located.... With SFSBs we should have much more control at the client
> side, so this becomes a real option.

In my opinion load balancing is not something that a cluster api can
address effectively. Half the problem is evaluating how busy the system is
in the first place.

> all in all, though, it sounds like we see pretty much eye to eye :-)

Better than the other way ..

> the lazy partition regeneration is an interesting idea and this is the
> second time it has been suggested to me, so I will give it some serious
> thought.

Again, I wasn't advocating lazy state transfer. But perhaps it has
applications somewhere.

> Thanks for taking the time to share your thoughts,

No problem.
