On May 4, 2009, at 6:07 PM, Todd Lipcon wrote:
> The issue here is that your mapper and reducer classes are being
> instantiated in a different JVM from your main() function. In order to
> pass data to them, you need to use the Configuration object.
>
> Since you have a simple String here, this should be pretty simple.
> Something like:
>
>     conf.set("com.example.tool.pattern", otherArgs[2]);
>
> then in the configure() function of your Mapper/Reducer, simply retrieve
> it using conf.get("com.example.tool.pattern");
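
To make the hand-off described above concrete, here is a minimal plain-Java sketch. It is only a stand-in and does not use Hadoop's API: java.util.Properties plays the role of the Configuration object, and a byte array plays the role of whatever is shipped to the task JVM. The class name, the method names submit() and receive(), the value "CCO", and the key "com.example.tool.late" are all hypothetical. In this model, a task sees only the values that were in the configuration when the snapshot was taken.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Properties;

// Stand-in model only: Properties is NOT Hadoop's Configuration class.
public class ConfigHandoffSketch {

    // Driver side: snapshot the settings into bytes, as if shipping them
    // to a task running in another JVM.
    public static byte[] submit(Properties conf) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            conf.store(out, null);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Task side: rebuild the settings from the shipped bytes. Fields of the
    // driver's objects are not visible here, only what was serialized.
    public static Properties receive(byte[] shipped) {
        try {
            Properties conf = new Properties();
            conf.load(new ByteArrayInputStream(shipped));
            return conf;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("com.example.tool.pattern", "CCO"); // set before the snapshot
        byte[] shipped = submit(conf);
        conf.setProperty("com.example.tool.late", "x");      // set after the snapshot

        Properties taskConf = receive(shipped);
        System.out.println(taskConf.getProperty("com.example.tool.pattern")); // prints CCO
        System.out.println(taskConf.getProperty("com.example.tool.late"));    // prints null
    }
}
```

The second lookup returns null because that key was added after the snapshot, so in this model the order of "set" and "ship" matters.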

Thanks for the pointer. I'm using Hadoop 0.20.0, and my mapper, which
extends Mapper<Object, Text, Text, IntWritable>, doesn't seem to have a
configure() method.

Looking at the API, I see that the superclass has a setup() method, so in
my class I do:

    public static class MoleculeMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        private Text matches = new Text();
        private String pattern;

        public void setup(Context context) {
            pattern = context.getConfiguration().get("net.rguha.dc.data.pattern");
            System.out.println("pattern = " + pattern);
        }
        ....
    }

In my main method I have:

    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    conf.set("net.rguha.dc.data.pattern", otherArgs[2]);

However, even with this, pattern turns out to be null when printed in
setup().

I just started on Hadoop a day or two ago, and my understanding is
that 0.20.0 had some pretty major refactoring. As a result, a lot of
examples I come across on the Net don't seem to work. Could the lack
of the configure() method be due to the refactoring?

-------------------------------------------------------------------
Rajarshi Guha <rg...@indiana.edu>
GPG Fingerprint: D070 5427 CC5B 7938 929C DD13 66A1 922C 51E7 9E84
-------------------------------------------------------------------
Q: What's polite and works for the phone company?
A: A deferential operator.