Hi,

We're running an 8-node Hadoop cluster on CDH2. Recently, our monitoring 
tools caught warnings like the following when running fsck on HDFS:

/tmp/hadoop-tgp/mapred/system/job_201105191458_1857/job.jar:  Under replicated 
blk_-6996370258385460742_366223. Target Replicas is 10 but found 8 replica(s).
(There are many more like it, for every file in the Distributed Cache.)

Obviously, this means that the default replication factor of 
mapred.submit.replication=10 cannot be reached since we only have 8 datanodes. 
I found the place in the code (JobClient.java) where this property is consumed 
and used for replicating the job jar and the Distributed Cache, so I understand 
(kind of ;-) where the warning comes from.

Still, I have two questions: Shouldn't mapred.submit.replication be capped 
automatically at the number of datanodes? And more generally, should I worry 
about this warning?
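
To make the first question concrete, here is a minimal sketch of the kind of 
cap I have in mind. This is not the actual JobClient code; the class name, the 
method name, and the idea that the client knows the number of live datanodes 
are all assumptions made up for illustration.

```java
// Hypothetical helper, NOT part of Hadoop: illustrates capping the
// requested submit replication at the number of live datanodes.
public final class SubmitReplication {

    private SubmitReplication() {
        // static helper only
    }

    /**
     * Returns the replication factor that can actually be reached.
     *
     * @param requested     value of mapred.submit.replication (default 10)
     * @param liveDataNodes number of datanodes currently in the cluster
     *                      (assumed to be known to the caller)
     */
    public static int effectiveReplication(int requested, int liveDataNodes) {
        if (requested < 1) {
            throw new IllegalArgumentException("requested must be >= 1");
        }
        if (liveDataNodes < 1) {
            throw new IllegalArgumentException("liveDataNodes must be >= 1");
        }
        // A block cannot have more replicas than there are datanodes.
        return Math.min(requested, liveDataNodes);
    }

    public static void main(String[] args) {
        // Our case: default of 10 on an 8-node cluster.
        System.out.println(effectiveReplication(10, 8));
    }
}
```

With our numbers (10 requested, 8 datanodes) this would yield 8, which matches 
the replica count fsck reports.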
 
Thanks and best regards,

Christoph

-- 
Christoph Schmitz

1&1 Internet AG
Ernst-Frey-Straße 10 · DE-76135 Karlsruhe
Phone: +49 721 91374-6733
christoph.schm...@1und1.de

Management Board: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Robert Hoffmann, 
Markus Huhn, Hans-Henning Kettler, Dr. Oliver Mauss, Jan Oetjen
Chairman of the Supervisory Board: Michael Scheeren

