On Dec 9, 2008, at 4:58 PM, Edward Capriolo wrote:
Also, it might be useful to word hadoop-default.conf more strongly, as many people might not know a downside exists to using 2 rather than 3 as the replication factor. Before reading this thread I would have thought 2 to be sufficient.
I think 2 should be sufficient, but running with 2 replicas instead of 3 exposes some namenode bugs that are otherwise harder to trigger.
For example, let's say your system has 100 nodes and 1M blocks, and a namenode bug silently affects the replica of block X on node Y. Block X now has only one good copy left, so when another node goes down there is a 1% chance (1 in 100) that it is the node holding that copy, and the block becomes missing. If this bug is cumulative or affects many blocks (I suspect about 500-1000 blocks out of 1M are problematic), you're almost guaranteed to lose data whenever a single node goes down.
On the other hand, if you have 1000 block replica problems on the same cluster with 3 replicas, then in order to lose a file, two of the replica problems must affect the same block and the node that goes down must hold the third copy. The probability of this happening is (1e-6) * (1e-6) * (1/100) = 1e-14, or 0.000000000001%.
So, even assuming that I did all my probability calculations wrong, a site running with 2 replicas is more than 10 orders of magnitude more likely to discover inconsistencies or other bugs in the namenode than a site with 3 replicas.
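To make the comparison concrete, here is a quick sketch of the arithmetic above using the message's own rough numbers (100 nodes, 1M blocks, ~1000 undetected bad replicas); the exact figures are the author's estimates, not measurements:

```python
# Back-of-envelope check of the 2-vs-3 replica argument, using the rough
# numbers from the message above (assumptions, not measured HDFS data).
nodes = 100
blocks = 1_000_000
bad = 1_000  # block replicas silently affected by a namenode bug

# Replication 2: each bad block has one good copy left, lost with
# probability 1/100 when a random node fails. With ~1000 such blocks,
# the chance a single node failure loses *no* data is tiny:
p_safe_2 = (1 - 1 / nodes) ** bad
print(f"replication 2: P(no loss on one node failure) ~ {p_safe_2:.1e}")

# Replication 3, per the author's estimate: two bad replicas must hit the
# same block AND the failed node must hold the third copy.
p_loss_3 = 1e-6 * 1e-6 * (1 / nodes)
print(f"replication 3: P(loss) = {p_loss_3:.0e} ({p_loss_3 * 100:.0e} %)")
```

Running this gives roughly a 4e-5 chance of surviving a node failure unscathed at replication 2, versus a 1e-14 chance of any loss at replication 3, which is where the "more than 10 orders of magnitude" gap comes from.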
Accordingly, these sites are the "canaries in the coal mine" to discover NameNode bugs.
Brian
