Thanks Paul. Sounds like that's the way to go then. We're just starting to
experiment a bit with DRBD so we'll give that a shot and see how it works
out.
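
For anyone following along, here is a rough sketch of the detection half of
option (b) as I understand it. It is not how Heartbeat/Linux-HA works
internally, and the class name FailoverMonitor and the max_missed threshold
are made up for illustration. It only shows the idea of switching to the
standby after several consecutive failed health checks, so that a single
dropped check does not trigger a failover.

```python
from dataclasses import dataclass


@dataclass
class FailoverMonitor:
    """Hypothetical helper: decide which node should currently be active."""

    max_missed: int = 3      # assumed threshold of consecutive failed checks
    missed: int = 0          # failed checks seen since the last success
    active: str = "primary"  # node that should currently hold the service

    def record_check(self, primary_alive: bool) -> str:
        """Feed in one health-check result; return the node that should be active."""
        if self.active == "standby":
            # Already failed over. Failing back is left as a manual step here.
            return self.active
        if primary_alive:
            self.missed = 0  # any successful check resets the counter
        else:
            self.missed += 1
            if self.missed >= self.max_missed:
                self.active = "standby"
        return self.active


if __name__ == "__main__":
    monitor = FailoverMonitor(max_missed=3)
    # One isolated failure is tolerated; three in a row trigger the switch.
    for alive in [True, False, True, False, False, False]:
        print(monitor.record_check(alive))
```

In a real setup the switch to "standby" would presumably be where the DRBD
device gets promoted on the second machine, the virtual IP is moved over, and
the namenode is started there, but those steps are handled by the HA scripts
and are not shown here.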

On Tue, Jul 29, 2008 at 11:56 AM, paul <[EMAIL PROTECTED]> wrote:

> I'm currently running with your option B setup and it seems to be reliable
> for me (so far).  I use a combination of DRBD and various heartbeat/LinuxHA
> scripts that handle the failover process, including a virtual IP for the
> namenode.  I haven't had any real-world unexpected failures to deal with,
> yet, but all manual testing has had consistent and reliable results.
>
>
>
> -paul
>
>
> On Tue, Jul 29, 2008 at 1:54 PM, Ryan Shih <[EMAIL PROTECTED]> wrote:
>
> > Dear Hadoop Community --
> >
> > I am wondering if it is already possible or in the plans to add capability
> > for multiple master nodes. I'm in a situation where I have a master node
> > that may potentially be in a less than ideal execution and networking
> > environment. For this reason, it's possible that the master node could die
> > at any time. On the other hand, the application must always be available. I
> > have other machines accessible to me, but I'm still unclear on the best
> > method to add reliability.
> >
> > Here are a few options that I'm exploring:
> > a) To create a completely secondary Hadoop cluster that we can flip to when
> > we detect that the master node has died. This will double hardware costs, so
> > if we originally have a 5 node cluster, then we would need to pull 5 more
> > machines out of somewhere for this option. This is not the preferred
> > choice.
> > b) Just mirror the master node via other always available software, such as
> > DRBD for real time synchronization. Upon detecting a failure we could swap
> > to the alternate node.
> > c) Or if Hadoop had some functionality already in place, it would be
> > fantastic to be able to take advantage of that. I don't know if anything
> > like this is available, and I have not found anything as of yet. It seems to
> > me, however, that having multiple master nodes would be the direction Hadoop
> > needs to go if it is to be useful in high availability applications. I was
> > told there are some papers on Amazon's Elastic Computing that follow this
> > approach, which I'm about to look for.
> >
> > In any case, could someone with experience in solving this type of problem
> > share how they approached this issue?
> >
> > Thanks!
> >
>
