Hello Hadoopers,

I am trying to run the same MapReduce job on HDFS and on the local file system. That is, one time I run the job with its input on HDFS, and another time I run the same job with the same input data on a local ext3 file system, without using HDFS. I found that the number of map tasks generated on the local file system is always much larger than in the HDFS case.

This seems strange to me, because the number of maps is determined by the number of splits of the job's input. Since I run the same job with the same input data, the input should be split into the same number of pieces, and the same number of maps should be generated in both cases. The major difference, as far as I can tell, is that the input data must first be copied into HDFS, but that copy should not affect the number of splits of the file.

Why does this happen? Has anyone encountered this before? Any insight into this phenomenon would be appreciated.
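To make the reasoning concrete, here is a minimal sketch (not Hadoop's actual source; the 1 GB file size and the block sizes are illustrative assumptions) of how a FileInputFormat-style split count is derived from file length and block size:

```java
// Sketch of FileInputFormat-style split counting:
// splitSize = max(minSize, min(goalSize, blockSize)),
// so a smaller effective block size yields more splits, hence more maps.
public class SplitCountSketch {
    static long splitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    static long numSplits(long fileLength, long splitSize) {
        return (fileLength + splitSize - 1) / splitSize; // ceiling division
    }

    public static void main(String[] args) {
        long fileLength = 1024L * 1024 * 1024; // hypothetical 1 GB input file
        long minSize    = 1;                   // minimum split size
        long goalSize   = fileLength;          // assume a single input file

        long hdfsBlock  = 64L * 1024 * 1024;   // HDFS default block size (64 MB)
        long localBlock = 32L * 1024 * 1024;   // assumed smaller local block size (32 MB)

        // 1 GB / 64 MB = 16 splits vs. 1 GB / 32 MB = 32 splits
        System.out.println("HDFS maps:  "
                + numSplits(fileLength, splitSize(goalSize, minSize, hdfsBlock)));
        System.out.println("Local maps: "
                + numSplits(fileLength, splitSize(goalSize, minSize, localBlock)));
    }
}
```

If the local file system reports a smaller default block size to the job than HDFS does (e.g. via fs.local.block.size), this formula would yield more splits, which might be one explanation for the discrepancy, though I have not confirmed this.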
Thanks,
Richard
