On May 4, 2009, at 6:07 PM, Todd Lipcon wrote:

The issue here is that your mapper and reducer classes are being
instantiated in a different JVM from your main() function. In order to pass
data to them, you need to use the Configuration object.

Since you have a simple String here, this should be pretty simple. Something
like:

conf.set("com.example.tool.pattern", otherArgs[2]);

then in the configure() function of your Mapper/Reducer, simply retrieve it
using conf.get("com.example.tool.pattern");


Thanks for the pointer. I'm using Hadoop 0.20.0, and my mapper, which extends Mapper<Object, Text, Text, IntWritable>, doesn't seem to have a configure() method.

Looking at the API, I see the superclass has a setup() method, so in my class I do:

public static class MoleculeMapper extends Mapper<Object, Text, Text, IntWritable> {

    private Text matches = new Text();
    private String pattern;

    // Called once at the start of the task, before any map() calls.
    @Override
    public void setup(Context context) {
        pattern = context.getConfiguration().get("net.rguha.dc.data.pattern");
        System.out.println("pattern = " + pattern);
    }
    ....
}

In my main method I have:

Configuration conf = new Configuration();
String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
conf.set("net.rguha.dc.data.pattern", otherArgs[2]);

However, even with this, pattern turns out to be null when printed in setup().
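
A possible explanation, assuming the Job is constructed from conf before the conf.set() call (the driver code shown above does not include that step): in 0.20 the Job constructor copies the Configuration it is given, so values set on the original object afterwards would not reach the tasks. The sketch below illustrates that copy-on-construction behaviour in plain Java. ConfCopyDemo, JobLike, and the sample value are hypothetical stand-ins, not Hadoop classes, and java.util.Properties plays the role of Configuration.

```java
import java.util.Properties;

public class ConfCopyDemo {

    static final String KEY = "net.rguha.dc.data.pattern";

    // Hypothetical stand-in for a job object that takes a private copy
    // of the settings it is constructed with.
    static class JobLike {
        private final Properties snapshot = new Properties();

        JobLike(Properties conf) {
            // Copy the entries; later changes to conf are not seen here.
            snapshot.putAll(conf);
        }

        String get(String key) {
            return snapshot.getProperty(key);
        }
    }

    // Sets the value AFTER the job-like object copied the settings.
    public static String setAfterCreate(String value) {
        Properties conf = new Properties();
        JobLike job = new JobLike(conf);
        conf.setProperty(KEY, value);
        return job.get(KEY); // null: the copy was taken too early
    }

    // Sets the value BEFORE the job-like object copies the settings.
    public static String setBeforeCreate(String value) {
        Properties conf = new Properties();
        conf.setProperty(KEY, value);
        JobLike job = new JobLike(conf);
        return job.get(KEY); // the value is part of the copy
    }

    public static void main(String[] args) {
        // "some-pattern" is a made-up sample value.
        System.out.println("set after:  " + setAfterCreate("some-pattern"));
        System.out.println("set before: " + setBeforeCreate("some-pattern"));
    }
}
```

If this is what is happening, moving conf.set(...) above the line that creates the Job, or setting the value through job.getConfiguration(), should make it visible in setup().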

I just started on Hadoop a day or two ago, and my understanding is that 0.20.0 had some pretty major refactoring. As a result, a lot of the examples I come across on the Net don't seem to work. Could the lack of the configure() method be due to the refactoring?

-------------------------------------------------------------------
Rajarshi Guha  <rg...@indiana.edu>
GPG Fingerprint: D070 5427 CC5B 7938 929C  DD13 66A1 922C 51E7 9E84
-------------------------------------------------------------------
Q:  What's polite and works for the phone company?
A:  A deferential operator.

