HBase/HDFS maintain block checksums, so presumably a corrupted block
would fail checksum validation. Increasing the number of replicas increases
the odds that you'll still have a valid copy of the block. I'm not an HDFS
expert, but I would be very surprised if HDFS validated a "questionable
block" via byte-wise comparison over the network amongst the replica peers.
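For anyone who wants to see what HDFS actually records, the client-side
FileSystem API exposes both the per-file replication factor and the composite
file checksum. A minimal sketch, assuming a Configuration that can see the
cluster's core-site.xml/hdfs-site.xml; the path argument is just an
illustration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileChecksum;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationAndChecksum {
  public static void main(String[] args) throws Exception {
    // Picks up core-site.xml / hdfs-site.xml from the classpath.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Hypothetical path; point this at an HBase store file under hbase.rootdir.
    Path file = new Path(args[0]);

    // Replication factor recorded for this file (a per-file attribute, set at create time).
    FileStatus status = fs.getFileStatus(file);
    System.out.println("replication = " + status.getReplication());

    // Composite checksum HDFS derives from the per-block CRCs it already stores;
    // no byte-wise comparison between replicas is involved.
    FileChecksum checksum = fs.getFileChecksum(file);
    System.out.println("checksum    = " + checksum);

    fs.close();
  }
}

getFileChecksum() is answered from the stored CRC metadata on one replica per
block, which is consistent with the point above.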

On Mon, Feb 23, 2015 at 12:25 PM, Michael Segel <[email protected]> wrote:

>
> On Feb 23, 2015, at 1:47 AM, Arinto Murdopo <[email protected]> wrote:
>
> We're running HBase (0.94.15-cdh4.6.0) on top of HDFS (Hadoop
> 2.0.0-cdh4.6.0).
> For all of our tables, we set the replication factor to 1 (dfs.replication
> = 1 in hbase-site.xml). We set it to 1 because we want to minimize HDFS
> usage (now we realize we should set this value to at least 2, because
> "failure is the norm" in distributed systems).
>
> Sorry, but you really want a replication factor of at least 3,
> not 2.
>
> Suppose you have corruption but not a lost block. Which of the two copies
> is right?
> With 3, you can compare the three replicas and hopefully 2 of the 3 will match.
>
>
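One practical note on the quoted config: dfs.replication is a per-file
attribute applied when a file is created, so raising the value in
hbase-site.xml (or hdfs-site.xml) only affects files written afterwards.
Existing store files have to be re-replicated explicitly. A minimal sketch,
assuming the default /hbase root directory (adjust to your hbase.rootdir):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class RaiseReplication {
  public static void main(String[] args) throws Exception {
    short target = 3;
    FileSystem fs = FileSystem.get(new Configuration());

    // Walk everything under the HBase root dir (assumed /hbase here) and
    // ask the NameNode to re-replicate files that are below the target.
    RemoteIterator<LocatedFileStatus> it = fs.listFiles(new Path("/hbase"), true);
    while (it.hasNext()) {
      LocatedFileStatus f = it.next();
      if (f.getReplication() < target) {
        // Asynchronous: the NameNode schedules the extra copies in the background.
        fs.setReplication(f.getPath(), target);
      }
    }
    fs.close();
  }
}

From the shell, hdfs dfs -setrep -R 3 /hbase does the same thing.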
