Hey Mike,

There is a much easier way to do this. We've answered a very similar
question in detail before at: http://search-hadoop.com/m/ZOmmJ1PZJqt1
(the question shows the way for the stable/old API, and my response
shows the way for the new API). Does this help?
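
For quick reference, here's a minimal sketch of the new-API approach
(the class name and key/value types below are placeholders for your own
mapper, and it assumes your job uses FileInputFormat, whose splits are
FileSplits): cast the split returned by context.getInputSplit() and
read its path.

  import java.io.IOException;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Mapper;
  import org.apache.hadoop.mapreduce.lib.input.FileSplit;

  public class LogFileMapper extends Mapper<LongWritable, Text, Text, Text> {
    private String fileName;

    @Override
    protected void setup(Context context) {
      // The new API exposes the input split directly on the context;
      // with FileInputFormat the split is a FileSplit, so the cast is safe.
      FileSplit split = (FileSplit) context.getInputSplit();
      Path path = split.getPath();
      // The date-encoded filename you're after:
      fileName = path.getName();
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      // One possible use: emit the filename alongside each log line.
      context.write(new Text(fileName), value);
    }
  }

As far as I recall, the new API doesn't set the old per-file
configuration property, which is why your configuration dump doesn't
show it.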

On Thu, Jun 14, 2012 at 8:24 AM, Michael Parker
<michael.g.par...@gmail.com> wrote:
> Hi all,
>
> I'm new to Hadoop MR and decided to make a go at using only the new
> API. I have a series of log files (who doesn't?), where a different
> date is encoded in each filename. The log files are so few that I'm
> not using HDFS. In my main method, I accept the input directory
> containing all the log files as the first command line argument:
>
>  Configuration conf = new Configuration();
>  String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
>  Path inputDir = new Path(otherArgs[0]);
>  ...
>  Job job1 = new Job(conf, "job1");
>  FileInputFormat.addInputPath(job1, inputDir);
>
> I actually have two jobs chained using a JobControl, but I think
> that's irrelevant. The problem is that the Mapper of this job cannot
> get the filename by reading the "mapred.input.file" key from the
> Configuration of the Context object, whether that Context is the one
> passed to the mapper's setup method or the one available in the call
> to map. Dumping the configuration like so:
>
>  StringWriter writer = new StringWriter();
>  Configuration.dumpConfiguration(context.getConfiguration(), writer);
>  System.out.println("configuration=" + writer.toString());
>
> reveals that there is a "mapred.input.dir" key containing the path
> passed as a command-line argument and assigned to inputDir in my main
> method, but the filename currently being processed within that path
> is still inaccessible. Any ideas on how to get this?
>
> Thanks,
> Mike



-- 
Harsh J
