Re: HBase and Cassandra on StackOverflow

Ryan Rawson Fri, 02 Sep 2011 10:48:13 -0700

On Fri, Sep 2, 2011 at 10:27 AM, Joseph Pallas <joseph.pal...@oracle.com> wrote:
> Drifting off topic a bit …
>
> On Sep 1, 2011, at 12:12 PM, Ryan Rawson wrote:
>
>>> First, you have to learn:
>>> 1) Linux HA
>>> 2) DRBD
>>>
>>> Right out of the gate just to have a redundant name node.
>>
>> Eh, no one would do that.  If you want a redundant name node your only
>> choice is to use MapR, which I would def recommend since you get a
>> better nn "fail-over" w/o service interruption and significantly
>> higher performance than HDFS.
>
> Really?  People running offline analytics may be fine with an hour of
> downtime
> [<http://hadoopblog.blogspot.com/2010/02/hadoop-namenode-high-availability.html>
> <http://www.hortonworks.com/data-integrity-and-availability-in-apache-hadoop-hdfs/>]
> for their M/R jobs, but people running interactive services do not find
> that acceptable.
>
> Is my only option to avoid significant downtime in the event of a name node 
> failure a closed-source offering that has already demonstrated at least one 
> serious data-loss issue 
> <http://answers.mapr.com/questions/415/hbase-table-disappear-after-failover-attempt-and-fall-back>?


Well, actually... yes.  An HA/DRBD flip will take at the very least 10-30
seconds, and possibly 10 minutes or longer if your cluster is really
big.  AvatarNode presumes a $250k NetApp, and still has a 10-30
second flip time once you trigger it.  The NN-HA work is still WIP.

You could always use Ceph, right?

>
> I don’t really mean to criticize MapR: they were victims of a hidden 
> dependency, but that’s what happens when you replace part of an integrated 
> stack.  And that is why I find your suggestion that I should not expect to 
> use the integrated stack a little unnerving, because I'm looking at HBase for 
> an online application.
>
