On Mon, Aug 15, 2011 at 7:10 PM, Christoph Schmitz <christoph.schm...@1und1.de> wrote:
> Still, I have two questions: Shouldn't there be an automatic limit of
> mapred.submit.replication to the number of data nodes? And more generally,
> should I worry about this warning?

1> That would bind MR to HDFS more tightly. Besides, your HDFS won't pick nodes for job jar writes or other forms of replication if they report themselves as busy (loaded) or unavailable. The idea sounds good though; could you please file a JIRA for it?

2> No, there's no need to worry. You can safely lower the number if you'd like and get rid of these under-replication issues.

--
Harsh J
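
P.S. A minimal sketch of lowering the number, assuming a Hadoop release of this era (where mapred.submit.replication defaults to 10) and a cluster with only a few data nodes. The override would go in the mapred-site.xml read by the client that submits the job. The value 3 is only an illustrative assumption; pick something no larger than your data node count.

```xml
<!-- mapred-site.xml: illustrative override; the value 3 is an assumption, -->
<!-- chosen here for a hypothetical cluster with at least 3 data nodes.   -->
<property>
  <name>mapred.submit.replication</name>
  <value>3</value>
</property>
```

The same property can also be set per job in the job configuration if you would rather not change it for every submission.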