[ http://issues.apache.org/jira/browse/HADOOP-170?page=comments#action_12376576 ]
Benjamin Reed commented on HADOOP-170:
--------------------------------------

It's really the JobTracker, not the fs, that knows how high to set the replication count, since the JobTracker knows the number of mapper tasks. The JobTracker also knows when the job has finished and can set the replication count back to the original number. I guess in most cases the job files will just get deleted at the end of the job anyway.

When you say "submitted job files", you are just talking about the job conf and jar file that correspond to a job, correct?

> setReplication and related bug fixes
> ------------------------------------
>
>          Key: HADOOP-170
>          URL: http://issues.apache.org/jira/browse/HADOOP-170
>      Project: Hadoop
>         Type: Improvement
>   Components: fs, dfs
>     Versions: 0.1.1
>     Reporter: Konstantin Shvachko
>     Assignee: Konstantin Shvachko
>  Attachments: setReplication.patch
>
> Having variable replication (HADOOP-51) it is natural to be able to
> change replication for existing files. This patch introduces the
> functionality.
> Here is a detailed list of issues addressed by the patch.
> 1) setReplication() and getReplication() methods are implemented.
> 2) DFSShell prints file replication for any listed file.
> 3) Bug fix. FSDirectory.delete() logs delete operation even if it is not
>    successful.
> 4) Bug fix. This is a distributed bug.
>    Suppose that file replication is 3, and a client reduces it to 1.
>    Two data nodes will be chosen to remove their copies, and will do that.
>    After a while they will report to the name node that the copies have
>    actually been deleted.
>    Until they report, the name node assumes the copies still exist.
>    Now the client decides to increase replication back to 3 BEFORE the data
>    nodes have reported that the copies are deleted. Then the name node can
>    choose one of the data nodes, which it thinks still has a block copy, to
>    replicate the block to new data nodes.
>    This scenario is quite unusual, but possible even without variable
>    replication.
> 5) Logging for name and data nodes is improved in several cases.
>    E.g. data nodes never logged that they deleted a block.
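
The race described in item 4 of the quoted description can be illustrated with a small toy model. This is only a sketch: the class `ReplicationRaceSketch`, its methods, and the node names `dn1`..`dn3` are hypothetical and are not taken from the patch or from Hadoop's code. The "safe" selection shown here (skipping nodes with an unconfirmed delete) is one possible remedy assumed for illustration; the issue text does not say how the patch actually fixes the bug.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the name node's view of one block (all names hypothetical). */
class ReplicationRaceSketch {
    // Data nodes the name node believes hold a copy of the block.
    private final Set<String> believedHolders = new LinkedHashSet<>();
    // Data nodes told to delete their copy that have not reported back yet.
    private final Set<String> pendingDeletes = new LinkedHashSet<>();

    ReplicationRaceSketch(String... nodes) {
        believedHolders.addAll(Arrays.asList(nodes));
    }

    /** Lower replication to target: excess holders are asked to delete. */
    void reduceReplication(int target) {
        List<String> holders = new ArrayList<>(believedHolders);
        for (int i = target; i < holders.size(); i++) {
            pendingDeletes.add(holders.get(i));
        }
    }

    /** A data node reports that its copy is really gone. */
    void reportDeleted(String node) {
        pendingDeletes.remove(node);
        believedHolders.remove(node);
    }

    /** Naive choice: every believed holder is a candidate replication source. */
    List<String> naiveSources() {
        return new ArrayList<>(believedHolders);
    }

    /** Assumed remedy: ignore holders whose delete is still unconfirmed. */
    List<String> safeSources() {
        List<String> sources = new ArrayList<>(believedHolders);
        sources.removeAll(pendingDeletes);
        return sources;
    }

    public static void main(String[] args) {
        ReplicationRaceSketch block = new ReplicationRaceSketch("dn1", "dn2", "dn3");
        block.reduceReplication(1); // dn2 and dn3 are told to delete
        // Client raises replication again before dn2 and dn3 have reported.
        System.out.println("naive sources: " + block.naiveSources());
        System.out.println("safe sources:  " + block.safeSources());
        block.reportDeleted("dn2");
        block.reportDeleted("dn3");
        System.out.println("after reports: " + block.naiveSources());
    }
}
```

In this model the naive candidate list still contains dn2 and dn3 until their deletion reports arrive, so a replication request issued in that window could name a source that no longer has the block.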
