Hi Tom,

I have filed a JIRA ticket (MAPREDUCE-1743) for this issue. In the
meantime, can you suggest an alternative approach to achieve what I want
(supporting different input formats and getting the input file name in each
mapper)?

Yuanyuan


From:    Tom White <[email protected]>
To:      [email protected]
Date:    04/29/2010 09:42 AM
Subject: Re: conf.get("map.input.file") returns null when using MultipleInputs in Hadoop 0.20

Hi Yuanyuan,

I think you've found a bug - could you file a JIRA issue for this please?

Thanks,
Tom

On Wed, Apr 28, 2010 at 11:04 PM, Yuanyuan Tian <[email protected]> wrote:
>
>
> I have a problem getting the input file name in the mapper when using
> MultipleInputs. I need to use MultipleInputs to support different formats
> for the inputs to my MapReduce job. Inside each mapper, I also need
> to know the exact input file that the mapper is processing. However,
> conf.get("map.input.file") returns null. Can anybody help me solve this
> problem? Thanks in advance.
>
> public class Test extends Configured implements Tool{
>
>        static class InnerMapper extends MapReduceBase implements
> Mapper<Writable, Writable, NullWritable, Text>
>        {
>                ................
>                ................
>
>                public void configure(JobConf conf)
>                {
>                        String inputName=conf.get("map.input.file");
>                        .......................................
>                }
>
>        }
>
>        public int run(String[] arg0) throws Exception {
>                JobConf job;
>                job = new JobConf(Test.class);
>                ...........................................
>
>                MultipleInputs.addInputPath(job, new Path("A"),
> TextInputFormat.class);
>                MultipleInputs.addInputPath(job, new Path("B"),
> SequenceFileInputFormat.class);
>                ...........................................
>        }
> }
>
> Yuanyuan
