Periodically move blocks from full nodes to those with space
-------------------------------------------------------------

                 Key: HADOOP-429
                 URL: http://issues.apache.org/jira/browse/HADOOP-429
             Project: Hadoop
          Issue Type: Improvement
            Reporter: Bryan Pendleton


Continuation of HADOOP-386. The patch to that issue makes it possible to 
redistribute blocks (raise the replication level, wait for replication to 
succeed, then lower it again). However, this requires a lot more space, is not 
automatic, and doesn't respect a reasonable I/O limit. I have actually had 
MapReduce jobs fail with missing-block exceptions after recently changing the 
replication level (from 3 to 4, with no under-replicated blocks to start with), 
because the datanodes were too slow to respond to requests while performing the 
necessary replications.

A good fix for this problem would be a low-priority thread on the NameNode that 
schedules low-priority replications of blocks from over-full machines, followed 
by the removal of the extra replicas. It might be worth having a specific 
protocol for requesting these low-priority copies from the datanodes, so that 
they continue to service (and remain available to service) normal block 
requests.
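
To make the proposal above more concrete, here is a minimal sketch of what one 
planning round of such a thread might look like. It is not Hadoop code: the 
class RebalanceSketch, the Node and Move types, and the threshold and maxMoves 
parameters are all hypothetical names invented for illustration, and usage is 
counted in whole blocks for simplicity. The maxMoves cap stands in for the I/O 
limit discussed above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class RebalanceSketch {
    /** Hypothetical view of a datanode: capacity and usage, both in blocks. */
    static final class Node {
        final String name;
        final long capacity;
        long used;

        Node(String name, long capacity, long used) {
            this.name = name;
            this.capacity = capacity;
            this.used = used;
        }

        double utilization() {
            return (double) used / capacity;
        }
    }

    /** One planned move: copy a block to target, then drop the replica on source. */
    static final class Move {
        final String source;
        final String target;

        Move(String source, String target) {
            this.source = source;
            this.target = target;
        }
    }

    /**
     * Plans at most maxMoves block moves for one round. A node counts as
     * over-full when its utilization exceeds the cluster mean by more than
     * threshold. The Node objects are updated as moves are planned.
     */
    static List<Move> planRound(List<Node> nodes, double threshold, int maxMoves) {
        List<Move> moves = new ArrayList<>();
        if (nodes.size() < 2) {
            return moves;
        }
        long used = 0;
        long capacity = 0;
        for (Node n : nodes) {
            used += n.used;
            capacity += n.capacity;
        }
        double mean = (double) used / capacity;
        Comparator<Node> byUtilization = Comparator.comparingDouble(Node::utilization);

        while (moves.size() < maxMoves) {
            Node fullest = Collections.max(nodes, byUtilization);
            Node emptiest = Collections.min(nodes, byUtilization);
            // Stop once no node is over-full or no node has room below the mean.
            if (fullest.utilization() - mean <= threshold
                    || emptiest.utilization() >= mean) {
                break;
            }
            fullest.used--;
            emptiest.used++;
            moves.add(new Move(fullest.name, emptiest.name));
        }
        return moves;
    }
}
```

A real implementation would run this in a loop with a sleep between rounds, 
choose specific blocks rather than counts, avoid placing two replicas of a 
block on the same node, and delete the source replica only after the copy is 
confirmed. Capping the moves per round is one simple way to keep the datanodes 
responsive to normal block requests.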

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
