A compressed input file does not get partitioned, so the number of mappers is equal to the number of input files.
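For reference, this is roughly the check the text input format performs: if a compression codec matches the file name (e.g. GzipCodec for *.gz), the file is not split, so each file gets exactly one mapper. A minimal sketch, assuming the classic org.apache.hadoop.mapred API and CompressionCodecFactory; the subclass name GzipAwareTextInputFormat is just illustrative, not an existing Hadoop class:

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodecFactory;
import org.apache.hadoop.mapred.TextInputFormat;

public class GzipAwareTextInputFormat extends TextInputFormat {
    @Override
    protected boolean isSplitable(FileSystem fs, Path file) {
        // If a registered codec matches the file name, the file cannot be
        // split: one split per file, hence one mapper per file.
        CompressionCodecFactory codecs = new CompressionCodecFactory(fs.getConf());
        return codecs.getCodec(file) == null;
    }
}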
Hairong

-----Original Message-----
From: Dennis Kubes [mailto:[EMAIL PROTECTED]
Sent: Wednesday, August 01, 2007 11:48 PM
To: [email protected]
Subject: Re: compressed input files

I don't know about your record count, but the link error means that you don't
have the right version of glibc that was used to compile the Hadoop native
libraries. It shouldn't matter, though, as Hadoop will fall back to the Java
versions if the native library can't be used.

Dennis Kubes

Sandhya E wrote:
> Hi
>
> I'm trying to pass .gz files as input to Hadoop. At the end of the
> map-reduce job, the number of input records read from the input files is
> around 480, but when I uncompress the files, the number of input records
> read is around 3000. Why is there such a difference? There is also a
> warning message at the start of execution:
>
> 07/08/01 23:18:55 DEBUG util.NativeCodeLoader: Trying to load the
> custom-built native-hadoop library...
> 07/08/01 23:18:55 DEBUG util.NativeCodeLoader: Failed to load
> native-hadoop with error: java.lang.UnsatisfiedLinkError:
> /local/offline2/hadoop-0.13.0/lib/native/Linux-i386-32/libhadoop.so:
> /lib/tls/libc.so.6: version `GLIBC_2.4' not found (required by
> /local/offline2/hadoop-0.13.0/lib/native/Linux-i386-32/libhadoop.so)
>
> Can this be the reason?
>
> Many thanks
> Sandhya
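Aside: to confirm whether the native library loaded, or whether the pure-Java codecs are being used instead (the fallback Dennis describes), a quick check along these lines should work. This is only a sketch; NativeCodeLoader is the class shown in the DEBUG output above, and ZlibFactory.isNativeZlibLoaded is my assumption for checking the zlib side, so verify both against your Hadoop version:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.zlib.ZlibFactory;
import org.apache.hadoop.util.NativeCodeLoader;

public class NativeCheck {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // True only if libhadoop.so was found and linked successfully.
        System.out.println("native-hadoop loaded: " + NativeCodeLoader.isNativeCodeLoaded());
        // False means the built-in Java zlib/gzip implementation is in use.
        System.out.println("native zlib in use:   " + ZlibFactory.isNativeZlibLoaded(conf));
    }
}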
