On Sun, Oct 23, 2011 at 4:11 PM, Steven Troxell
<[email protected]> wrote:
> Hi all,
>
> Phil and I are working on a grad project for a distributed systems class,
> using accumulo.  One part of the project involves attempting to distribute
> the load balancer so it no longer must be run on master.  We wanted to just
> propose our idea to get feedback, and perhaps a few pointers.  What I'm
> thinking of is modifying/moving the balance code from accumulo.server.master
> to accumulo.server.   We would then be treating the load balancing as a
> critical resource, and develop a layer implementing a mutual exclusion
> algorithm to determine which tablet server gets to run the load balancer.
> What I'm thinking right now is using an election algorithm, that essentially
> randomly selects a server at startup.  Other servers would then verify this
> server is up and running, and upon detecting it down, implement the election
> to determine which server the load balancing moves to. Does this sound like
> a good idea? Are there any particular design aspects of accumulo we should
> be aware of in attempting to implement this?  Additionally, we're still
> familiarizing ourselves with the codebase; it would be helpful if someone
> could point us to where some of the current tserver synchronization takes
> place, so we can follow existing conventions.  I imagine something similar
> to the election algorithm/ tserver's resolving mutual exclusion that we
> propose already occurs for some other aspect of accumulo?
>
> thanks in advance for feedback/insight
>
> -Steve and Phil
>

Accumulo currently supports failover masters.  You can start multiple
master processes.  All will try to acquire the master lock in
zookeeper.  The first one to acquire the lock becomes the master.  The
other processes keep waiting to acquire the lock.
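
To illustrate the failover behavior described above, here is a minimal
in-memory sketch.  It does not use ZooKeeper or any Accumulo code:
LockService, acquire, and session_lost are hypothetical names, and the
first-come, first-served promotion of waiters is an assumption of the
sketch rather than a statement about how Accumulo orders them.

```python
# In-memory stand-in for a ZooKeeper-style lock; all names are hypothetical.
class LockService:
    """Tracks a single lock holder and a FIFO queue of waiting processes."""

    def __init__(self):
        self.holder = None
        self.waiters = []

    def acquire(self, name):
        """Return True if `name` holds the lock, otherwise queue it to wait."""
        if self.holder is None:
            self.holder = name
            return True
        if name != self.holder and name not in self.waiters:
            self.waiters.append(name)
        return self.holder == name

    def session_lost(self, name):
        """Simulate a process dying: drop it and promote the next waiter."""
        if name in self.waiters:
            self.waiters.remove(name)
        if self.holder == name:
            self.holder = self.waiters.pop(0) if self.waiters else None


svc = LockService()
# Three master processes start; only the first one gets the lock.
results = [svc.acquire(m) for m in ("master1", "master2", "master3")]
first_master = svc.holder

# The active master dies, so the longest-waiting process takes over.
svc.session_lost("master1")
second_master = svc.holder
```

In a real deployment the lock would live in ZooKeeper and the "session
lost" event would come from the lock holder's session expiring, not from
an explicit call.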

A generalization of what you propose is to let any tserver become the
master, removing the need to start a separate master process.
Chris T has proposed this before.  This could also be done with the
Accumulo GC.  We would need to determine the drawbacks, such as
increased memory usage on the tservers.

I suppose the ultimate generalization is that you start one Accumulo
process and it can become a tserver, logger, master, and/or Accumulo
GC process.  This should be done in such a way that admins can still
control which machines run which processes if needed.
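
A rough sketch of how that admin control might look, assuming a
hypothetical per-host list of allowed roles.  The role names, the config
shape, and the choice to treat master and GC as single-holder roles are
all assumptions for illustration, not existing Accumulo configuration.

```python
# Hypothetical sketch: role names and config shape are illustrative only.
SINGLETON_ROLES = {"master", "gc"}  # assumed: at most one holder cluster-wide


def assign_roles(allowed_roles_by_host):
    """Activate each host's allowed roles.

    Non-singleton roles (tserver, logger) run wherever they are allowed.
    Each singleton role goes to the first host, in iteration order, that
    is allowed to hold it; later hosts would keep waiting for it.
    """
    holders = {}  # singleton role -> host currently holding it
    assignment = {}
    for host, allowed in allowed_roles_by_host.items():
        active = set()
        for role in allowed:
            if role not in SINGLETON_ROLES:
                active.add(role)
            elif holders.setdefault(role, host) == host:
                active.add(role)
        assignment[host] = active
    return assignment


config = {
    "node1": ["tserver", "logger"],
    "node2": ["tserver", "master", "gc"],
    "node3": ["tserver", "master"],
}
assignment = assign_roles(config)
```

Here node3 is allowed to be master but stays a plain tserver until node2
gives the role up, which matches the failover behavior of the master
lock.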

Keith
