Periodically move blocks from full nodes to those with space
-------------------------------------------------------------
Key: HADOOP-429
URL: http://issues.apache.org/jira/browse/HADOOP-429
Project: Hadoop
Issue Type: Improvement
Reporter: Bryan Pendleton
This continues HADOOP-386. The patch for that issue makes it possible to
redistribute blocks manually: raise the replication level, wait for replication
to succeed, then lower it again. However, this approach requires a lot more
space, is not automatic, and doesn't respect a reasonable I/O limit. I have
actually had MapReduce jobs fail with missing-block exceptions shortly after
changing the replication level (from 3 to 4, with no under-replicated blocks to
start with) because the datanodes were too slow to respond to requests while
performing the necessary replications.
A good fix for this problem would be a low-priority thread on the NameNode that
schedules low-priority replications of blocks from over-full machines, followed
by the removal of the extra replicas. It might be worth having a specific
protocol for requesting these low-priority copies from the datanodes, so that
they continue to service (and remain available to service) normal block
requests.
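
To make the idea concrete, here is a rough sketch of what one scheduling pass
might look like. It is only an illustration under assumptions, not NameNode
code: the Node class, the plan_moves function, the threshold parameter, and
the max_moves cap (standing in for the I/O limit) are all hypothetical names,
and capacity is counted in blocks rather than bytes for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical view of a datanode as the scheduler might see it."""
    name: str
    capacity: int   # how many blocks the node can hold (assumed unit)
    blocks: set     # ids of the blocks currently stored on the node


def plan_moves(nodes, threshold=0.1, max_moves=2):
    """Plan up to max_moves block moves away from over-full nodes.

    A node counts as over-full when its utilization exceeds the cluster
    mean by more than `threshold`. Each move is (block, source, target):
    copy the block to the target first, then drop the source replica.
    """
    mean = sum(len(n.blocks) for n in nodes) / sum(n.capacity for n in nodes)

    # Work on copies so that planning never mutates the caller's data.
    held = {n.name: set(n.blocks) for n in nodes}
    cap = {n.name: n.capacity for n in nodes}

    def util(name):
        return len(held[name]) / cap[name]

    moves = []
    while len(moves) < max_moves:          # cap on copies per pass
        src = max(held, key=util)          # fullest node
        if util(src) <= mean + threshold:
            break                          # nothing is over-full any more

        move = None
        for block in sorted(held[src]):
            # Prefer the emptiest node that lacks this block and is
            # still below the cluster mean.
            for dst in sorted(held, key=util):
                if dst != src and block not in held[dst] and util(dst) < mean:
                    move = (block, src, dst)
                    break
            if move:
                break
        if move is None:
            break                          # no valid target for any block

        block, src, dst = move
        held[dst].add(block)               # low-priority copy happens first
        held[src].remove(block)            # then the extra replica is removed
        moves.append(move)
    return moves
```

Because each pass returns only a handful of moves, the datanodes would spend
most of their time serving normal block requests. A real implementation would
also need to wait for each copy to be confirmed before deleting the source
replica, which this sketch does not model.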