On Fri, Sep 2, 2011 at 10:27 AM, Joseph Pallas <joseph.pal...@oracle.com> wrote:
> Drifting off topic a bit …
>
> On Sep 1, 2011, at 12:12 PM, Ryan Rawson wrote:
>
>>> First, you have to learn:
>>> 1) Linux HA
>>> 2) DRDB
>>>
>>> Right out of the gate just to have a redundant name node.
>>
>> Eh, no one would do that. If you want a redundant name node your only
>> choice is to use Mapr, which I would def recommend since you get a
>> better nn "fail-over" w/o service interruption and significantly
>> higher performance than hdfs.
>
> Really? People running offline analytics may be fine with an hour of
> downtime
> [<http://hadoopblog.blogspot.com/2010/02/hadoop-namenode-high-availability.html>
> <http://www.hortonworks.com/data-integrity-and-availability-in-apache-hadoop-hdfs/>]
> for their M/R jobs, but people running interactive services do not find that
> acceptable.
>
> Is my only option to avoid significant downtime in the event of a name node
> failure a closed-source offering that has already demonstrated at least one
> serious data-loss issue
> <http://answers.mapr.com/questions/415/hbase-table-disappear-after-failover-attempt-and-fall-back>?

Well, actually... yes. An HA/DRBD flip will take at the very least 10-30
seconds, and possibly 10 minutes or longer if your cluster is really big.
Avatar node presumes a $250k NetApp, and still has a 10-30 second flip time
once you trigger it. The NN-HA work is still WIP. You could always use
Ceph, right?

>
> I don’t really mean to criticize MapR: they were victims of a hidden
> dependency, but that’s what happens when you replace part of an integrated
> stack. And that is why I find your suggestion that I should not expect to
> use the integrated stack a little unnerving, because I'm looking at HBase for
> an online application.
>