Hello Team,

I have started learning Hadoop MapReduce and was able to set up a
single-node cluster as an execution environment.

I now want to extend this to a multi-node environment.
I have the following questions, and it would be very helpful if somebody could help:
1. For multiple nodes, I understand I should add the hostnames of the secondary
(slave) nodes to the conf/slaves file. Am I correct? (A sketch of what I mean
follows this list.)
2. What needs to be installed on the secondary nodes in order to execute a job/task?
3. I understand I can attach my map/reduce classes to the Job as a jar - through
the JobConf - so does this mean I do not really need to install/copy my map/reduce
code onto all the secondary nodes? (My current job-submission code follows this list.)
4. How do I route the data to these nodes? Is it required for the map/reduce tasks to
execute on the machines where the data is stored (in HDFS)?
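For question 1, this is the kind of thing I have in mind for the conf/slaves file
(the hostnames below are made up). My understanding is that the master then starts
a DataNode and a TaskTracker on each listed machine when the cluster is brought up:

    slave1.example.com
    slave2.example.com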
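And for question 3, here is roughly how I am submitting the job today, using the old
org.apache.hadoop.mapred API. MyJob is a placeholder name, and the mapper/reducer are
just a simple word count I used for testing. My understanding is that passing the class
to new JobConf(...) makes Hadoop ship the containing jar to the worker nodes:

    import java.io.IOException;
    import java.util.Iterator;
    import java.util.StringTokenizer;

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reducer;
    import org.apache.hadoop.mapred.Reporter;

    public class MyJob {

      // Test mapper: emits (word, 1) for every token in the input line.
      public static class MyMapper extends MapReduceBase
          implements Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        public void map(LongWritable key, Text value,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            output.collect(word, ONE);
          }
        }
      }

      // Test reducer: sums the counts for each word.
      public static class MyReducer extends MapReduceBase
          implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
          int sum = 0;
          while (values.hasNext()) {
            sum += values.next().get();
          }
          output.collect(key, new IntWritable(sum));
        }
      }

      public static void main(String[] args) throws Exception {
        // Passing the class should make Hadoop locate the jar containing it
        // and distribute that jar to the worker nodes, so the map/reduce
        // code would not have to be installed on them beforehand.
        JobConf conf = new JobConf(MyJob.class);
        conf.setJobName("myjob");

        conf.setMapperClass(MyMapper.class);
        conf.setReducerClass(MyReducer.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
      }
    }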

Any samples for doing this would help. I would appreciate any suggestions.

Regards
Girish
Ph: +91-9916212114
