Brian Bockelman wrote:

On Dec 9, 2008, at 4:58 PM, Edward Capriolo wrote:

Also it might be useful to strongly word hadoop-default.conf as many
people might not know a downside exists for using 2 rather than 3 as
the replication factor. Before reading this thread I would have
thought 2 to be sufficient.

I think 2 should be sufficient, but running with 2 replicas instead of 3 exposes some namenode bugs which are harder to trigger with 3.

Whether 2 is sufficient or not, I completely agree with the latter part. We should treat this as what I think it fundamentally is: fixing the Namenode.

I guess lately some of these bugs either got more likely or some similar bugs crept in.

Sticking with 3 is very good advice for maximizing reliability, but from an opportunistic developer's point of view a big cluster running with a replication of 2 is a great test case :-). Overall I think it is a good thing for Hadoop.

Raghu.
