Thank you very much, Paco and Jason. It works! For anyone curious what this looks like in code, here is a small snippet of mine (a complete, compilable sketch follows at the end of this message):
file: myLittleMRProgram.java

package org.apache.hadoop.examples;

public static class Reduce extends MapReduceBase
    implements Reducer<Text, LongWritable, Text, LongWritable> {

    private int nTax = 0;

    public void configure(JobConf job) {
        super.configure(job);
        String tax = job.get("nTax");
        nTax = Integer.parseInt(tax);
    }

    public void reduce(Text key, Iterator<LongWritable> values,
                       OutputCollector<Text, LongWritable> output,
                       Reporter reporter) throws IOException {
        ....
        System.out.println("nTax is: " + nTax);
    }
}
....
main() {
    ....
    conf.set("nTax", other_args.get(2));
    JobClient.runJob(conf);
    ....
    return 0;
}
--------
-SM

On Tue, Aug 19, 2008 at 5:02 PM, Jason Venner <[EMAIL PROTECTED]> wrote:

> Since the map & reduce tasks generally run in separate Java virtual
> machines, and on distinct machines from your main task's Java virtual
> machine, there is no sharing of variables between the main task and the
> map or reduce tasks.
>
> The standard way is to store the variable in the Configuration (or
> JobConf) object in your main task. Then, in the configure method of your
> map and reduce task classes, extract the variable value from the JobConf
> object.
>
> You will need to override the configure method in your map and reduce
> classes to do this.
>
> This also requires that the variable value be serializable. For lots of
> large variables this can be expensive.
>
> Sandy wrote:
>
>> Hello,
>>
>> My M/R program is going smoothly, except for one small problem. I have a
>> "global" variable that is set by the user (and thus in the main
>> function), which I want one of my reduce functions to access. This is a
>> read-only variable. After some reading in the forums, I tried something
>> like this:
>>
>> file: MyGlobalVars.java
>> package org.apache.hadoop.examples;
>> public class MyGlobalVars {
>>     static public int nTax;
>> }
>> ------
>>
>> file: myLittleMRProgram.java
>> package org.apache.hadoop.examples;
>> map function() {
>>     System.out.println("in map function, nTax is: " + MyGlobalVars.nTax);
>> }
>> ....
>> main() {
>>     MyGlobalVars.nTax = other_args.get(2);
>>     System.out.println("in main function, nTax is: " + MyGlobalVars.nTax);
>>     ....
>>     JobClient.runJob(conf);
>>     ....
>>     return 0;
>> }
>> --------
>>
>> When I run it, I get:
>> in main function, nTax is 20 (which is what I want)
>> in map function, nTax is 0 (<--- this is not right)
>>
>> I am a little confused about how to resolve this. I apologize in advance
>> if this is a blatant Java error; I only began programming in the
>> language a few weeks ago.
>>
>> Since MapReduce tries to avoid the whole shared-memory scene, I am more
>> than willing to have each reduce function receive a local copy of this
>> user-defined value. However, I am a little confused about the best way
>> to do this. As I see it, my options are:
>>
>> 1.) Write the user-defined value to HDFS in the main function, and have
>> it read from HDFS in the reduce function. I can't quite figure out the
>> code for this, though. I know how to specify -an- input file for the
>> map reduce task, but if I did it this way, won't I need to specify two
>> separate input files?
>>
>> 2.) Put it in the construction of the reduce object (I saw this
>> mentioned in the archives). How would I accomplish this exactly when the
>> value is user-defined? Parameter passing? If so, won't this require me
>> to change the underlying MapReduceBase (which makes me a touch nervous,
>> since I'm still very new to Hadoop)?
>>
>> What would be the easiest way to do this?
>>
>> Thanks in advance for the help. I appreciate your time.
>>
>> -SM
>>
> --
> Jason Venner
> Attributor - Program the Web <http://www.attributor.com/>
> Attributor is hiring Hadoop Wranglers and coding wizards, contact if
> interested
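
P.S. For anyone who wants a version of Jason's suggestion that compiles end
to end, here is an untested sketch against the old org.apache.hadoop.mapred
API (the one MapReduceBase, JobConf, and JobClient belong to). The class
name, job name, and the trivial line-counting map/reduce logic are my own
placeholders, not something from this thread:

file: PassJobConfVariable.java

package org.apache.hadoop.examples;

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class PassJobConfVariable {

    public static class Map extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, LongWritable> {

        private static final LongWritable ONE = new LongWritable(1);

        // Emit (line, 1) so the reducer counts duplicate lines.
        public void map(LongWritable offset, Text line,
                        OutputCollector<Text, LongWritable> output,
                        Reporter reporter) throws IOException {
            output.collect(line, ONE);
        }
    }

    public static class Reduce extends MapReduceBase
        implements Reducer<Text, LongWritable, Text, LongWritable> {

        private int nTax = 0;

        // configure() runs once per task, in the task's own JVM, before
        // any reduce() calls -- this is where the value set by main()
        // is pulled back out of the job configuration.
        public void configure(JobConf job) {
            super.configure(job);
            nTax = Integer.parseInt(job.get("nTax", "0")); // "0" = default
        }

        public void reduce(Text key, Iterator<LongWritable> values,
                           OutputCollector<Text, LongWritable> output,
                           Reporter reporter) throws IOException {
            long sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            // nTax is now an ordinary field, usable like any other.
            System.out.println("nTax is: " + nTax);
            output.collect(key, new LongWritable(sum));
        }
    }

    public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(PassJobConfVariable.class);
        conf.setJobName("pass-jobconf-variable");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(LongWritable.class);
        conf.setMapperClass(Map.class);
        conf.setReducerClass(Reduce.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        // Serialize the user-supplied value into the job configuration;
        // every map and reduce JVM receives a copy with the job.
        conf.set("nTax", args[2]);

        JobClient.runJob(conf);
    }
}

Run it as something like "hadoop jar myjob.jar
org.apache.hadoop.examples.PassJobConfVariable in out 20" and each reduce
task's userlog should show "nTax is: 20". Note the value travels as a
string and is re-parsed once per task, so this suits small read-only
settings; for large side data, DistributedCache is the usual alternative.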