Hi Tom,

I have filed a JIRA ticket (MAPREDUCE-1743) for this issue. In the meantime, can you suggest an alternative approach to achieve what I want (supporting different input formats and getting the input file name in each mapper)?

Yuanyuan

From: Tom White <[email protected]>
To: [email protected]
Date: 04/29/2010 09:42 AM
Subject: Re: conf.get("map.input.file") returns null when using MultipleInputs in Hadoop 0.20

Hi Yuanyuan,

I think you've found a bug - could you file a JIRA issue for this please?

Thanks,
Tom

On Wed, Apr 28, 2010 at 11:04 PM, Yuanyuan Tian <[email protected]> wrote:
>
> I have a problem getting the input file name in the mapper when using
> MultipleInputs. I need to use MultipleInputs to support different formats
> for the inputs to my MapReduce job. And inside each mapper, I also need
> to know the exact input file that the mapper is processing.
> However, conf.get("map.input.file") returns null. Can anybody help me solve
> this problem? Thanks in advance.
>
> public class Test extends Configured implements Tool {
>
>     static class InnerMapper extends MapReduceBase implements
>             Mapper<Writable, Writable, NullWritable, Text>
>     {
>         ................
>         ................
>
>         public void configure(JobConf conf)
>         {
>             String inputName = conf.get("map.input.file");
>             .......................................
>         }
>
>     }
>
>     public int run(String[] arg0) throws Exception {
>         JobConf job;
>         job = new JobConf(Test.class);
>         ...........................................
>
>         MultipleInputs.addInputPath(job, new Path("A"),
>                 TextInputFormat.class);
>         MultipleInputs.addInputPath(job, new Path("B"),
>                 SequenceFileInputFormat.class);
>         ...........................................
>     }
> }
>
> Yuanyuan
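
One possible workaround for the question above, offered as a sketch rather than a confirmed fix: with MultipleInputs, each split appears to be wrapped in a non-public TaggedInputSplit, which would explain why map.input.file is never set. In the old API the mapper can ask for its split through reporter.getInputSplit() inside map() and unwrap it by reflection. The Java below shows only the unwrapping step. FakeTaggedSplit, FakeFileSplit, and inputFileOf are hypothetical stand-ins so that it runs without Hadoop on the classpath; in a real job the object would come from the Reporter, and the behaviour on 0.20 is an assumption that should be checked.

```java
import java.lang.reflect.Method;

// Sketch only: unwrap a split held inside a non-public wrapper, then read its path.
class SplitUnwrapper {

    // Hypothetical stand-in for a file split; Hadoop's FileSplit exposes getPath().
    static class FakeFileSplit {
        private final String path;
        FakeFileSplit(String path) { this.path = path; }
        public String getPath() { return path; }
    }

    // Hypothetical stand-in for the wrapper that MultipleInputs is assumed to use.
    static class FakeTaggedSplit {
        private final Object inner;
        FakeTaggedSplit(Object inner) { this.inner = inner; }
        public Object getInputSplit() { return inner; }
    }

    // Returns the path of the split as a string, unwrapping one level if needed.
    static String inputFileOf(Object split) {
        try {
            Object current = split;
            try {
                // A wrapper declares getInputSplit(); call it reflectively because
                // the wrapper class itself may not be visible to the caller.
                Method unwrap = current.getClass().getDeclaredMethod("getInputSplit");
                unwrap.setAccessible(true);
                current = unwrap.invoke(current);
            } catch (NoSuchMethodException e) {
                // Not a wrapper: treat the object as the file split itself.
            }
            Method getPath = current.getClass().getMethod("getPath");
            getPath.setAccessible(true);
            return String.valueOf(getPath.invoke(current));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not read the path from the split", e);
        }
    }

    public static void main(String[] args) {
        Object wrapped = new FakeTaggedSplit(new FakeFileSplit("A/part-00000"));
        System.out.println(inputFileOf(wrapped));
    }
}
```

In a mapper this would be called from map() rather than configure(), since the Reporter is only available there. Reflection is used because the wrapper class cannot be named directly from user code.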
