Hi, we're running an 8-node Hadoop cluster with CDH2. Recently, our monitoring tools caught warnings like this one when running fsck on HDFS:

/tmp/hadoop-tgp/mapred/system/job_201105191458_1857/job.jar: Under replicated blk_-6996370258385460742_366223. Target Replicas is 10 but found 8 replica(s).

There are lots more like it, one for every file in the Distributed Cache.

Obviously, this means that the default replication factor of mapred.submit.replication=10 cannot be reached, since we only have 8 datanodes. I found the place in the code (JobClient.java) where this property is consumed and used for replicating the job jar and the Distributed Cache, so I understand (kind of ;-) where the warning comes from.

Still, I have two questions: Shouldn't mapred.submit.replication be capped automatically at the number of datanodes? And more generally, should I worry about this warning?

Thanks and best regards,
Christoph

--
Christoph Schmitz
1&1 Internet AG
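
To make the first question above concrete, here is a minimal sketch of the kind of cap it has in mind. It is purely illustrative: the class SubmitReplicationCap and the method capReplication are hypothetical names, not part of Hadoop, and nothing here is meant to describe what JobClient.java actually does. The only values taken from the post are the requested factor of 10 and the 8 datanodes.

```java
public class SubmitReplicationCap {

    /**
     * Hypothetical helper (not a Hadoop API): clamps a requested
     * replication factor to the number of live datanodes, so that the
     * target can actually be reached. The result is never below 1.
     */
    static int capReplication(int requested, int liveDataNodes) {
        if (requested < 1) {
            throw new IllegalArgumentException("replication must be >= 1");
        }
        // A cluster reporting zero datanodes still gets a cap of 1,
        // since a replication factor of 0 is not meaningful.
        int cap = Math.max(1, liveDataNodes);
        return Math.min(requested, cap);
    }

    public static void main(String[] args) {
        int requested = 10; // mapred.submit.replication as quoted in the warning
        int dataNodes = 8;  // size of the cluster described in the post
        System.out.println("effective replication: "
                + capReplication(requested, dataNodes));
    }
}
```

With such a cap, the target for the job jar on this cluster would be 8 rather than 10, and fsck would have no shortfall to report for these files.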