Thanks for the prompt reply, this worked like a charm! - Mike
On Wed, Jun 13, 2012 at 10:51 PM, Harsh J <ha...@cloudera.com> wrote:
> Hey Mike,
>
> There is a much easier way to do this. We've answered a very similar
> question in detail before at: http://search-hadoop.com/m/ZOmmJ1PZJqt1
> (Question has a way for the stable/old API, and my response has the
> way for new API). Does this help?
>
> On Thu, Jun 14, 2012 at 8:24 AM, Michael Parker
> <michael.g.par...@gmail.com> wrote:
>> Hi all,
>>
>> I'm new to Hadoop MR and decided to make a go at using only the new
>> API. I have a series of log files (who doesn't?), where a different
>> date is encoded in each filename. The log files are so few that I'm
>> not using HDFS. In my main method, I accept the input directory
>> containing all the log files as the first command line argument:
>>
>> Configuration conf = new Configuration();
>> String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
>> Path inputDir = new Path(otherArgs[0]);
>> ...
>> Job job1 = new Job(conf, "job1");
>> FileInputFormat.addInputPath(job1, inputDir);
>>
>> I actually have two jobs chained using a JobControl, but I think
>> that's irrelevant. The problem is that the Mapper of this job cannot
>> get the filename by accessing key "mapred.input.file" of the Context
>> object that is either passed to the setup method of the mapper, or
>> available through the Context object in the call to map. Dumping the
>> configuration like so:
>>
>> StringWriter writer = new StringWriter();
>> Configuration.dumpConfiguration(context.getConfiguration(), writer);
>> System.out.println("configuration=" + writer.toString());
>>
>> Reveals that there is a "mapred.input.dir" key that contains the path
>> passed as a command line argument and assigned to inputDir in my main
>> method, but the processed filename within that path is still
>> inaccessible. Any ideas how to get this?
>>
>> Thanks,
>> Mike
>
>
>
> --
> Harsh J
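
For reference, a minimal sketch of the usual new-API (org.apache.hadoop.mapreduce) approach to the question above: the per-file path is not exposed through a configuration key such as "mapred.input.file", but through the mapper's InputSplit, which can be cast to FileSplit when a FileInputFormat-based format (e.g. TextInputFormat) is used. The class name, key/value types, and example filename below are illustrative assumptions, not taken from the thread.

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

// Illustrative mapper; the name LogFileMapper and the output types are assumptions.
public class LogFileMapper extends Mapper<LongWritable, Text, Text, Text> {

  private String fileName;

  @Override
  protected void setup(Context context) throws IOException, InterruptedException {
    // With the new API the file being processed comes from the InputSplit,
    // not from a configuration key.
    FileSplit split = (FileSplit) context.getInputSplit();
    fileName = split.getPath().getName(); // e.g. "access-2012-06-13.log" (hypothetical name)
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // The filename (and the date encoded in it) is now available for every record.
    context.write(new Text(fileName), value);
  }
}

Note that the cast assumes one split per file via a plain FileInputFormat subclass; formats such as CombineFileInputFormat wrap splits differently and would need their own handling.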