> JR> After chatting with some Facebook guys, we realized that one potential
> JR> benefit from using HDFS is that the recovery from losing partial data in a
> JR> node is more efficient. Suppose that one lost a single disk at a node. HDFS
> JR> can quickly rebuild the blocks on the failed disk in parallel.
HDFS replicates eagerly, which means that having a node down for longer than the timeout period will also mean that you do more work than you needed to. Cassandra replicates (very) lazily, and I prefer laziness for the sake of efficiency.

> JR> So, when this happens, the whole node probably has to be taken out
> JR> and bootstrapped. The same problem exists when a single sstable file
> JR> is corrupted.

> I think recovering a single sstable is a useful thing, and it seems like
> a better problem to solve.

This is why we need to get #193 in. Going to the filesystem and deleting/fuzzing an SSTable on a node and then running a repair will cause a new SSTable to be created that overlays and repairs the first based on the data from the other nodes.

Thanks,
Stu

-----Original Message-----
From: "Ted Zlatanov" <[email protected]>
Sent: Tuesday, November 24, 2009 8:40am
To: [email protected]
Subject: Re: cassandra over hbase

On Mon, 23 Nov 2009 11:58:08 -0800 Jun Rao <[email protected]> wrote:

JR> After chatting with some Facebook guys, we realized that one potential
JR> benefit from using HDFS is that the recovery from losing partial data in a
JR> node is more efficient. Suppose that one lost a single disk at a node. HDFS
JR> can quickly rebuild the blocks on the failed disk in parallel. This is a
JR> bit hard to do in cassandra, since we can't easily find the data on the
JR> failed disk from another node.

This is an architectural issue, right? IIUC Cassandra simply doesn't care about disks. I think that's a plus, actually, because it simplifies the code, and filesystems, in my experience, are better left up to the OS. For instance, we're evaluating Lustre and for many specific reasons it's significantly better for our needs than HDFS, so HDFS would be a tough sell.

JR> So, when this happens, the whole node probably has to be taken out
JR> and bootstrapped. The same problem exists when a single sstable file
JR> is corrupted.
I think recovering a single sstable is a useful thing, and it seems like a better problem to solve.

Ted
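
For readers following the thread, here is a minimal sketch of the "overlay" idea Stu describes: a repair writes a new SSTable whose cells cover whatever the damaged node is missing, and reads pick the newest cell across all SSTables. Everything in it is an assumption made for illustration, not Cassandra's actual code: an SSTable is modeled as a plain dict of key -> (timestamp, value), the names `read` and `repair` are hypothetical, and only the "deleted file" case is covered (detecting silently fuzzed data would need the hash comparison that the repair work in #193 is about).

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model: one "sstable" is a dict of key -> (timestamp, value).
SSTable = Dict[str, Tuple[int, str]]


def newest(key: str, sstables: List[SSTable]) -> Optional[Tuple[int, str]]:
    """Return the cell with the highest timestamp for key, or None if absent."""
    cells = [t[key] for t in sstables if key in t]
    return max(cells, key=lambda c: c[0]) if cells else None


def read(key: str, sstables: List[SSTable]) -> Optional[str]:
    """Read a key by letting the newest cell win across all sstables."""
    cell = newest(key, sstables)
    return None if cell is None else cell[1]


def repair(local: List[SSTable], replica: List[SSTable]) -> SSTable:
    """Build a new sstable with every cell the replica has that is
    missing locally or newer than the local copy."""
    patch: SSTable = {}
    for key in {k for t in replica for k in t}:
        remote = newest(key, replica)
        mine = newest(key, local)
        if mine is None or mine[0] < remote[0]:
            patch[key] = remote
    return patch


# A healthy replica, and a local node that lost the file holding key "b".
replica = [{"a": (1, "apple"), "b": (2, "banana")}]
damaged = [{"a": (1, "apple")}]

patch = repair(damaged, replica)   # new sstable produced by the repair
local = damaged + [patch]          # it overlays the existing files
```

The existing files are never rewritten in this sketch; the repaired data simply sits alongside them, which is the sense in which the new SSTable "overlays" the old one.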
