On Dec 9, 2008, at 5:31 PM, Raghu Angadi wrote:

> Brian Bockelman wrote:
>> On Dec 9, 2008, at 4:58 PM, Edward Capriolo wrote:
>>> Also, it might be useful to word hadoop-default.conf more strongly,
>>> as many people might not know a downside exists to using 2 rather
>>> than 3 as the replication factor. Before reading this thread I
>>> would have thought 2 to be sufficient.
>>
>> I think 2 should be sufficient, but running with 2 replicas instead
>> of 3 exposes some namenode bugs which are harder to trigger.
>
> Whether or not 2 is sufficient, I completely agree with the latter
> part. We should treat this as what I think it fundamentally is:
> fixing the Namenode. I guess lately some of these bugs either became
> more likely to trigger, or similar bugs crept in.
>
> Sticking with 3 is very good advice for maximizing reliability, but
> from an opportunistic developer's point of view, a big cluster
> running with a replication factor of 2 is a great test case :-).
> Overall, I think this is a good thing for Hadoop.
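For readers following the thread: the replication factor under discussion is the dfs.replication setting, which a site can override in its own configuration file rather than editing the shipped defaults. A hedged sketch of such an override (the description text is illustrative, not from the stock defaults file):

```xml
<!-- Site configuration override: keep the default replication of 3. -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
  <description>Default block replication. The thread above argues
  against lowering this to 2 unless you can tolerate hitting the
  rarer Namenode bugs that 2 replicas expose.</description>
</property>
```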
Well, we're most likely here to stay: this is the secondary site for
most of these files. As long as we can indeed identify lost files,
retransferring them is fairly automated. The unique files at this site
amount to roughly 0.1% of the total or less, and we plan on setting
only those to 3 replicas.
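The workflow Brian describes can be sketched with the standard Hadoop tools of that era: fsck to spot files with missing blocks, and setrep to raise the replication factor on the unique subset. The path below is hypothetical:

```shell
# Identify lost files: fsck walks the namespace and reports files
# with missing or corrupt blocks.
hadoop fsck / -files -blocks | grep -i -e missing -e corrupt

# Raise only the site's unique files to 3 replicas; -R recurses and
# -w waits until the target replication is reached.
# (/path/to/unique-files is an illustrative placeholder.)
hadoop fs -setrep -w -R 3 /path/to/unique-files
```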
So, we'll be happy to provide whatever logs or debugging info is
needed, as long as someone cares to keep on fixing bugs.
Brian