Thank you for your kind reply. I am running Hadoop in distributed mode.
This might be off-topic, but many algorithms (especially data mining and machine learning algorithms) need many iterations, and Hadoop has to write all of the intermediate output of one iteration to disk as the input of the next. I doubt that is efficient. Is Hadoop the right choice for data mining and machine learning algorithms? Can Hadoop be combined with GPU computing?

Wei

-----Original Message-----
From: Arindam Khaled [mailto:[email protected]]
Sent: Friday, December 17, 2010 1:14 PM
To: [email protected]
Subject: Re: Please help with hadoop configuration parameter set and get

There's no guarantee that it will work in a distributed environment, as a couple of developers suggested. I didn't have time to play with it in distributed mode; I only ran it in a standalone environment.

Arindam

On Dec 17, 2010, at 12:07 PM, Arindam Khaled wrote:

> Wei,
>
> I implemented one of the algorithms, BFIDA*, described in the paper
> "Out-of-Core Parallel Frontier Search with MapReduce" by Alexander
> Reinefeld and Thorsten Schütt. FYI, they implemented BFS for the
> 15-puzzle using MapReduce and MPI, not Hadoop.
>
> For brevity, I am not including the whole source code.
> Here is my pseudo-code:
>
> public class standAloneIDA {
>
>     static boolean solved = false;
>
>     public static class BFIDAMapClass
>             extends Mapper<Object, Text, Text, IntWritable> {
>
>         public void map(Object key, Text value, Context context)
>                 throws IOException, InterruptedException {
>             String line = value.toString();
>             while (/* there can be more new moves */) {
>                 // emit("the new board", "move");  // key, value pair
>             }
>         }
>     }
>
>     public static class BFIDAReducer
>             extends Reducer<Text, IntWritable, Text, IntWritable> {
>
>         private IntWritable result = new IntWritable();
>
>         public void reduce(Text key, Iterable<IntWritable> values,
>                            Context context)
>                 throws IOException, InterruptedException {
>             String line = "";
>             ArrayList<Integer> previousMoves = new ArrayList<Integer>();
>
>             for (IntWritable val : values) {
>                 String valInt = val.toString();
>                 int[] moves = stringToArrayInt(valInt);
>
>                 for (int bit : moves) {
>                     if (!previousMoves.contains(bit))
>                         previousMoves.add(bit);
>                 }
>             }
>
>             for (int val : previousMoves) {
>                 line = line + Integer.toString(val);
>             }
>
>             if (isSolved(key.toString(), size)) {
>                 solved = true;
>             }
>
>             result.set(Integer.parseInt(line));
>             context.write(key, result);
>         }
>     }
>
>     public static void main(String[] args) throws Exception {
>         Configuration conf = new Configuration();
>         String[] otherArgs =
>                 new GenericOptionsParser(conf, args).getRemainingArgs();
>         if (otherArgs.length != 2) {
>             System.err.println("Usage: testUnit2 <in> <out>");
>             System.exit(2);
>         }
>
>         while (!solved) {
>             // set up the file system
>             // include the input and output filenames
>             Job job = new Job(conf, "some name");
>             job.setJarByClass(standAloneIDA.class);
>             job.setMapperClass(BFIDAMapClass.class);
>             job.setReducerClass(BFIDAReducer.class);
>             job.setOutputKeyClass(Text.class);
>             job.setOutputValueClass(IntWritable.class);
>             job.setMapOutputKeyClass(Text.class);
>             job.setMapOutputValueClass(IntWritable.class);
>
>             job.waitForCompletion(true);
>         }
>     }
> }
>
> Please excuse me if there are missing braces. There might be more
> efficient ways to set up the jobs and the file system. I didn't have
> much time -- so I ended up with something that worked for me at the
> time. Let me know if you have more questions.
>
> Kind regards,
>
> Arindam Khaled
>
>
> On Dec 17, 2010, at 10:31 AM, Peng, Wei wrote:
>
>> Arindam, how do you set this global static boolean variable?
>> I tried something similar yesterday:
>>
>> public class BFSearch
>> {
>>     private static boolean expansion;
>>     public static class MapperClass { if no nodes, expansion = false; }
>>     public static class ReducerClass
>>     public static void main { expansion = true; run job; print(expansion); }
>> }
>>
>> In this case, expansion is still true.
>> I will look at Hadoop counters and report back here later.
>>
>> Thank you for all your help
>> Wei
>>
>> -----Original Message-----
>> From: Arindam Khaled [mailto:[email protected]]
>> Sent: Friday, December 17, 2010 10:35 AM
>> To: [email protected]
>> Subject: Re: Please help with hadoop configuration parameter set and get
>>
>> I did something like this using a global static boolean variable
>> (flag) while I was implementing breadth-first IDA*. In my case, I set
>> the flag to something else if a solution was found, which was
>> examined in the reducer.
>>
>> I guess in your case, since the mappers don't produce anything, the
>> reducers won't have anything as input, if I am not wrong.
>>
>> And I had chained map-reduce jobs
>> (http://developer.yahoo.com/hadoop/tutorial/module4.html)
>> running until a solution was found.
>>
>> Kind regards,
>>
>> Arindam Khaled
>>
>>
>> On Dec 17, 2010, at 12:58 AM, Peng, Wei wrote:
>>
>>> Hi,
>>>
>>> I am a newbie to Hadoop.
>>>
>>> Today I struggled with a Hadoop problem for several hours.
>>>
>>> I initialize a parameter by setting the job configuration in main.
>>> E.g.:
>>>
>>> Configuration con = new Configuration();
>>> con.set("test", "1");
>>> Job job = new Job(con);
>>>
>>> Then in the mapper class, I want to set "test" to "2". I did it with
>>>
>>> context.getConfiguration().set("test", "2");
>>>
>>> Finally in the main method, after the job is finished, I check
>>> "test" again with
>>>
>>> job.getConfiguration().get("test");
>>>
>>> However, the value of "test" is still "1".
>>>
>>> The reason I want to change the parameter inside the Mapper class is
>>> that I want to determine when to stop an iteration in the main
>>> method. For example, when doing breadth-first search, the search
>>> iteration should stop when no new nodes are added for further
>>> expansion.
>>>
>>> Your help will be deeply appreciated. Thank you
>>>
>>> Wei
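[Editor's note] The behavior Wei describes follows from how Hadoop ships the job Configuration to tasks: each mapper JVM receives a serialized copy, so mutating it never changes the driver's copy. A minimal self-contained sketch of that copy semantics (no Hadoop dependency; `shipToTask` and the map-based "configuration" are illustrative stand-ins, not Hadoop API):

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigCopyDemo {

    // Stand-in for shipping the driver's Configuration to a task JVM:
    // the task gets a deserialized copy, not a shared reference.
    static Map<String, String> shipToTask(Map<String, String> driverConf) {
        return new HashMap<>(driverConf);
    }

    static String demo() {
        Map<String, String> driverConf = new HashMap<>();
        driverConf.put("test", "1");        // like con.set("test", "1") in main

        Map<String, String> taskConf = shipToTask(driverConf);
        taskConf.put("test", "2");          // like context.getConfiguration().set(...) in the mapper

        // What job.getConfiguration().get("test") sees back in main:
        return driverConf.get("test");
    }

    public static void main(String[] args) {
        System.out.println(demo());         // prints "1": the driver's copy never changed
    }
}
```

The same reasoning explains why the static `solved` / `expansion` flags in the thread only appear to work in standalone mode, where everything runs in one JVM.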
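[Editor's note] Hadoop counters, which Wei mentions looking into, are the supported channel for signaling from tasks back to the driver: the framework aggregates them across tasks and makes them readable after the job finishes. A hedged, non-runnable sketch against the same 0.20-era `org.apache.hadoop.mapreduce` API used above (imports omitted; the enum and class names are illustrative, not from the thread):

```java
// Sketch only -- not a complete program.
public class BFSearchDriver {

    // Counters are aggregated by the framework and returned to the driver,
    // unlike static fields, which live in each task's own JVM.
    public enum SearchCounters { NEW_NODES }

    public static class BFSMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // ... expand nodes; for every newly generated node:
            context.getCounter(SearchCounters.NEW_NODES).increment(1);
        }
    }

    public static void main(String[] args) throws Exception {
        long newNodes = 1;
        while (newNodes > 0) {
            Job job = new Job(new Configuration(), "bfs iteration");
            // ... set mapper/reducer/input/output as in the pseudo-code above ...
            job.waitForCompletion(true);

            // Read the aggregated counter on the driver side:
            newNodes = job.getCounters()
                          .findCounter(SearchCounters.NEW_NODES)
                          .getValue();
        }
    }
}
```

When the last iteration generates no new nodes, the counter reads zero and the driver's loop terminates, replacing the static-flag approach.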
