Thank you for your kind reply. I am running Hadoop in distributed mode.
This might be off-topic, but many algorithms (especially data mining and machine learning algorithms) need many iterations, and Hadoop has to write all of the intermediate output of one iteration to disk as the input of the next. I doubt that is efficient. Is Hadoop the right choice for data mining and machine learning algorithms? Can Hadoop be combined with GPU computing?

Wei

-----Original Message-----
From: Arindam Khaled [mailto:[email protected]]
Sent: Friday, December 17, 2010 1:14 PM
To: [email protected]
Subject: Re: Please help with hadoop configuration parameter set and get

There's no guarantee that it will work in a distributed environment, as a couple of developers suggested. I didn't have time to play with it in distributed mode; I only ran it in a standalone environment.

Arindam

On Dec 17, 2010, at 12:07 PM, Arindam Khaled wrote:

> Wei,
>
> I implemented one of the algorithms, BFIDA*, described in the paper
> "Out-of-Core Parallel Frontier Search with MapReduce" by Alexander
> Reinefeld and Thorsten Schütt. FYI, they implemented BFS for the
> 15-puzzle using MapReduce and MPI, not Hadoop.
>
> For brevity, I am not including the whole source code.
> Here is my pseudo-code:
>
> public class standAloneIDA {
>
>     static boolean solved = false;
>
>     public static class BFIDAMapClass
>             extends Mapper<Object, Text, Text, IntWritable> {
>
>         public void map(Object key, Text value, Context context)
>                 throws IOException, InterruptedException {
>             String line = value.toString();
>             while (/* there can be more new moves */) {
>                 // emit("the new board", "move");  // key, value pair
>             }
>         }
>     }
>
>     public static class BFIDAReducer
>             extends Reducer<Text, IntWritable, Text, IntWritable> {
>
>         private IntWritable result = new IntWritable();
>
>         public void reduce(Text key, Iterable<IntWritable> values,
>                            Context context)
>                 throws IOException, InterruptedException {
>             String line = "";
>             ArrayList<Integer> previousMoves = new ArrayList<Integer>();
>
>             for (IntWritable val : values) {
>                 String valInt = val.toString();
>                 int[] moves = stringToArrayInt(valInt);
>
>                 for (int bit : moves) {
>                     if (!previousMoves.contains(bit))
>                         previousMoves.add(bit);
>                 }
>             }
>
>             for (int val : previousMoves) {
>                 line = line + Integer.toString(val);
>             }
>
>             if (isSolved(key.toString(), size)) {
>                 solved = true;
>             }
>
>             result.set(Integer.parseInt(line));
>             context.write(key, result);
>         }
>     }
>
>     public static void main(String[] args) throws Exception {
>         Configuration conf = new Configuration();
>         String[] otherArgs =
>                 new GenericOptionsParser(conf, args).getRemainingArgs();
>         if (otherArgs.length != 2) {
>             System.err.println("Usage: testUnit2 <in> <out>");
>             System.exit(2);
>         }
>
>         while (!solved) {
>             // set up the file system
>             // include the input and output filenames
>             Job job = new Job(conf, "some name");
>             job.setJarByClass(standAloneIDA.class);
>             job.setMapperClass(BFIDAMapClass.class);
>             job.setReducerClass(BFIDAReducer.class);
>             job.setOutputKeyClass(Text.class);
>             job.setOutputValueClass(IntWritable.class);
>             job.setMapOutputKeyClass(Text.class);
>             job.setMapOutputValueClass(IntWritable.class);
>
>             job.waitForCompletion(true);
>         }
>     }
> }
>
> Please excuse me if there are missing braces. There might be more
> efficient ways to set up the jobs and the file system. I didn't have
> much time -- so I ended up with something that worked for me at the
> time. Let me know if you have more questions.
>
> Kind regards,
>
> Arindam Khaled
>
>
> On Dec 17, 2010, at 10:31 AM, Peng, Wei wrote:
>
>> Arindam, how do you set this global static boolean variable?
>> I tried something similar yesterday:
>>
>> public class BFSearch
>> {
>>     private static boolean expansion;
>>     public static class MapperClass { if no nodes, expansion = false; }
>>     public static class ReducerClass
>>     public static void main { expansion = true; run job; print(expansion); }
>> }
>>
>> In this case, expansion is still true.
>> I will look at Hadoop counters and report back here later.
>>
>> Thank you for all your help
>> Wei
>>
>> -----Original Message-----
>> From: Arindam Khaled [mailto:[email protected]]
>> Sent: Friday, December 17, 2010 10:35 AM
>> To: [email protected]
>> Subject: Re: Please help with hadoop configuration parameter set and get
>>
>> I did something like this using a global static boolean variable
>> (flag) while I was implementing breadth-first IDA*. In my case, I set
>> the flag to something else if a solution was found, which was
>> examined in the reducer.
>>
>> I guess in your case, since the mappers don't produce anything, the
>> reducers won't have anything as input, if I am not wrong.
>>
>> And I had chained map-reduce jobs
>> (http://developer.yahoo.com/hadoop/tutorial/module4.html)
>> running until a solution was found.
>>
>> Kind regards,
>>
>> Arindam Khaled
>>
>>
>> On Dec 17, 2010, at 12:58 AM, Peng, Wei wrote:
>>
>>> Hi,
>>>
>>> I am a newbie to Hadoop.
>>>
>>> Today I struggled with a Hadoop problem for several hours.
>>>
>>> I initialize a parameter by setting the job configuration in main.
>>> E.g.:
>>>
>>> Configuration con = new Configuration();
>>> con.set("test", "1");
>>> Job job = new Job(con);
>>>
>>> Then in the mapper class, I want to set "test" to "2". I did it with
>>>
>>> context.getConfiguration().set("test", "2");
>>>
>>> Finally in the main method, after the job is finished, I check
>>> "test" again with
>>>
>>> job.getConfiguration().get("test");
>>>
>>> However, the value of "test" is still "1".
>>>
>>> The reason I want to change the parameter inside the Mapper class is
>>> that I want to determine when to stop an iteration in the main
>>> method. For example, when doing breadth-first search, the search
>>> iteration should stop when no new nodes are added for further
>>> expansion.
>>>
>>> Your help will be deeply appreciated. Thank you
>>>
>>> Wei
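[Editor's note] The behavior Wei describes follows from how Hadoop ships the job Configuration to tasks: each mapper JVM receives a serialized copy, so mutating it never changes the driver's copy. A minimal self-contained sketch of that copy semantics (no Hadoop dependency; `shipToTask` and the map-based "configuration" are illustrative stand-ins, not Hadoop API):

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigCopyDemo {

    // Stand-in for shipping the driver's Configuration to a task JVM:
    // the task gets a deserialized copy, not a shared reference.
    static Map<String, String> shipToTask(Map<String, String> driverConf) {
        return new HashMap<>(driverConf);
    }

    static String demo() {
        Map<String, String> driverConf = new HashMap<>();
        driverConf.put("test", "1");        // like con.set("test", "1") in main

        Map<String, String> taskConf = shipToTask(driverConf);
        taskConf.put("test", "2");          // like context.getConfiguration().set(...) in the mapper

        // What job.getConfiguration().get("test") sees back in main:
        return driverConf.get("test");
    }

    public static void main(String[] args) {
        System.out.println(demo());         // prints "1": the driver's copy never changed
    }
}
```

The same reasoning explains why the static `solved` / `expansion` flags in the thread only appear to work in standalone mode, where everything runs in one JVM.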
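[Editor's note] Hadoop counters, which Wei mentions looking into, are the supported channel for signaling from tasks back to the driver: the framework aggregates them across tasks and makes them readable after the job finishes. A hedged, non-runnable sketch against the same 0.20-era `org.apache.hadoop.mapreduce` API used above (imports omitted; the enum and class names are illustrative, not from the thread):

```java
// Sketch only -- not a complete program.
public class BFSearchDriver {

    // Counters are aggregated by the framework and returned to the driver,
    // unlike static fields, which live in each task's own JVM.
    public enum SearchCounters { NEW_NODES }

    public static class BFSMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // ... expand nodes; for every newly generated node:
            context.getCounter(SearchCounters.NEW_NODES).increment(1);
        }
    }

    public static void main(String[] args) throws Exception {
        long newNodes = 1;
        while (newNodes > 0) {
            Job job = new Job(new Configuration(), "bfs iteration");
            // ... set mapper/reducer/input/output as in the pseudo-code above ...
            job.waitForCompletion(true);

            // Read the aggregated counter on the driver side:
            newNodes = job.getCounters()
                          .findCounter(SearchCounters.NEW_NODES)
                          .getValue();
        }
    }
}
```

When the last iteration generates no new nodes, the counter reads zero and the driver's loop terminates, replacing the static-flag approach.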
