On Dec 9, 2008, at 5:31 PM, Raghu Angadi wrote:

Brian Bockelman wrote:
On Dec 9, 2008, at 4:58 PM, Edward Capriolo wrote:
Also it might be useful to strongly word hadoop-default.conf as many
people might not know a downside exists for using 2 rather than 3 as
the replication factor. Before reading this thread I would have
thought 2 to be sufficient.
I think 2 should be sufficient, but running with 2 replicas instead of 3 exposes some namenode bugs that are otherwise harder to trigger.

Whether 2 is sufficient or not, I completely agree with the latter part. We should treat this as what I think it fundamentally is: fixing the Namenode.

I guess lately some of these bugs either got more likely or some similar bugs crept in.

Sticking with 3 is very good advice for maximizing reliability, but from an opportunistic developer's point of view a big cluster running with a replication of 2 is a great test case :-). Overall I think it is a good thing for Hadoop.


Well, we're most likely here to stay: this is the secondary site for most of these files. As long as we can indeed identify lost files, retransferring them is fairly automated. The number of unique files on this site is around 0.1% or less of the total, and we plan on setting only those to 3 replicas.
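
For anyone wanting to do the same, here is a minimal sketch of how the per-file setting could be scripted. It only builds `hadoop fs -setrep` command lines and does not run them. The file paths are made up for illustration, and the helper name `setrep_commands` is hypothetical, not part of Hadoop.

```python
import shlex


def setrep_commands(paths, replication=3, wait=False):
    """Build one `hadoop fs -setrep` argument list per path.

    Nothing is executed here; the caller decides how to run the commands.
    """
    if replication < 1:
        raise ValueError("replication must be at least 1")
    commands = []
    for path in paths:
        argv = ["hadoop", "fs", "-setrep"]
        if wait:
            # -w asks the client to wait until the target replication is reached
            argv.append("-w")
        argv += [str(replication), path]
        commands.append(argv)
    return commands


# Hypothetical list of files that exist only at this site.
unique_files = ["/store/unique/run1.dat", "/store/unique/run 2.dat"]

for argv in setrep_commands(unique_files):
    # shlex.quote keeps paths with spaces safe if the line is pasted into a shell
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Everything else would stay at the cluster-wide default of 2; only the listed files are raised to 3.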

So, we'll be happy to provide whatever logs or debugging info is needed, as long as someone cares to keep on fixing bugs.

Brian
