Hi everyone,

  Using Hadoop 0.20.2, I'm trying to use MultiFileInputFormat, which I expected
to put each file from the input directory in a SEPARATE split, so that the number
of map tasks equals the number of input files. Instead, each split I get contains
the paths of multiple input files, so the number of map tasks is smaller than the
number of input files. Is this because MultiFileInputFormat is deprecated?

  My subclass, myMultiFileInputFormat, overrides only the following method:

public RecordReader<LongWritable, Text> getRecordReader(InputSplit split,
        JobConf job, Reporter reporter) {
    return new myRecordReader((MultiFileSplit) split);
}

Yet in myRecordReader, a single split contains, for example, the following:
  
  " /tmp/input/file1:0+300
    /tmp/input/file2:0+199  "

  instead of each file being in its own split.

    Why? Any clues?

          Thank you,
              Maha
  
