Doug Cutting wrote:
> So you want to use different sets of machines for dfs than for 
> MapReduce?  An easy way to achieve this is to install Hadoop separately 
> and start dfs only there ('bin/hadoop-daemon.sh start namenode; 
> bin/hadoop-daemons.sh start datanode', or use the new bin/start-dfs.sh 
> script).  Then, in your Nutch installation, start only the MapReduce 
> daemons, using a different conf/slaves file ('bin/hadoop-daemon.sh start 
> jobtracker; bin/hadoop-daemons.sh start tasktracker', or use the new 
> bin/start-mapred.sh script).  Just make sure that your Nutch 
> installation is configured to talk to the same namenode as your Hadoop 
> installation, and make sure that you don't run bin/start-all.sh from 
> either installation.  Does that make sense?
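
Concretely, the two-installation startup described above might look
like this (the installation paths are hypothetical):

    # On the Hadoop installation, whose conf/slaves lists the DFS hosts:
    cd /path/to/hadoop
    bin/hadoop-daemon.sh start namenode      # namenode on this machine
    bin/hadoop-daemons.sh start datanode     # datanodes on the slaves
    # or equivalently: bin/start-dfs.sh

    # On the Nutch installation, whose conf/slaves lists only the
    # MapReduce hosts:
    cd /path/to/nutch
    bin/hadoop-daemon.sh start jobtracker
    bin/hadoop-daemons.sh start tasktracker
    # or equivalently: bin/start-mapred.sh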

That makes complete sense. Conceptually, there is one daemon for DFS
(the namenode) that knows which machines DFS lives on, and all the
MapReduce machines always point to that server. To run MapReduce on a
subset of the machines, I just use a different conf/slaves file. I
like the fact that the scripts are now more disjoint. Thanks.
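
For the "same namenode" part, the Nutch installation's config would
carry something like the following entries (the hostnames and ports
here are made up for illustration):

    <!-- conf/hadoop-site.xml in the Nutch installation -->
    <property>
      <name>fs.default.name</name>
      <!-- the same namenode the Hadoop installation started -->
      <value>namenode-host:9000</value>
    </property>
    <property>
      <name>mapred.job.tracker</name>
      <!-- the jobtracker started from the Nutch side -->
      <value>jobtracker-host:9001</value>
    </property>

With fs.default.name pointing at the shared namenode, jobs submitted
through Nutch read and write the same DFS that the Hadoop installation
serves.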
