The namenode constantly receives block reports telling it which datanode holds which blocks, and it schedules re-replication whenever a block becomes under-replicated.
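
To make that bookkeeping concrete, here is a rough toy model in Python. It is only a sketch of the idea, not Hadoop's actual code: the function name `plan_replication`, the dict-based block reports, and the "least-loaded datanode first" target choice are all assumptions made for illustration.

```python
from collections import defaultdict


def plan_replication(block_reports, desired_replication):
    """Toy model of namenode bookkeeping (hypothetical, not Hadoop's code).

    block_reports: dict mapping datanode name -> set of block ids it reports.
    desired_replication: dict mapping block id -> desired replica count.
    Returns a list of (block, source_datanode, target_datanode) copy tasks.
    """
    # Invert the reports: which datanodes currently hold each block?
    holders = defaultdict(set)
    for datanode, blocks in block_reports.items():
        for block in blocks:
            holders[block].add(datanode)

    # Track how many blocks each datanode holds so new replicas can be
    # steered toward the emptiest nodes (an assumed placement policy).
    load = {dn: len(blocks) for dn, blocks in block_reports.items()}

    plan = []
    for block in sorted(desired_replication):
        live = holders[block]
        if not live:
            # No surviving replica anywhere, so there is nothing to copy from.
            continue
        needed = desired_replication[block] - len(live)
        if needed <= 0:
            continue  # already at or above the desired replication
        candidates = sorted(
            (dn for dn in block_reports if dn not in live),
            key=lambda dn: (load[dn], dn),
        )
        source = min(live)  # any live holder works; pick deterministically
        for target in candidates[:needed]:
            plan.append((block, source, target))
            load[target] += 1
    return plan


if __name__ == "__main__":
    # dn3 is a freshly replaced, empty drive; b2 lost one of its two replicas.
    reports = {"dn1": {"b1", "b2"}, "dn2": {"b1"}, "dn3": set()}
    print(plan_replication(reports, {"b1": 2, "b2": 2}))
```

In this model the copy of `b2` is scheduled purely from the reports and the desired replication count; nothing in it depends on the balancer having been run.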

On Tue, Jun 23, 2009 at 6:18 PM, Stuart White <stuart.whi...@gmail.com> wrote:

> In my Hadoop cluster, I've had several drives fail lately (and they've
> been replaced). Each time a new empty drive is placed in the cluster,
> I run the balancer.
>
> I understand that the balancer will redistribute the load of file
> blocks across the nodes.
>
> My question is: will balancer also look at the desired replication of
> a file, and if the actual replication of a file is less than the
> desired (because the file had blocks stored on the lost drive), will
> balancer re-replicate those lost blocks?
>
> If not, is there another tool that will ensure the desired replication
> factor of files is satisfied?
>
> If this functionality doesn't exist, I'm concerned that I'm slowly,
> silently losing my files as I replace drives, and I may not even
> realize it.
>
> Thoughts?
>

--
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals