Doug Cutting wrote:
> Scott Simpson wrote:
>> Suppose I want to run Nutch 0.8 searches on machines separate from the
>> ones I crawl on. Is there a way to split this up so my crawling operation
>> (MapReduce) doesn't happen on my search machines?
> You could have two different configuration directories and set
> HADOOP_CONF_DIR (or use cd).

Excuse my ignorance on this issue. Say I have 5 machines in my Hadoop cluster
and I list only two of them in the configuration file when I do a "fetch" or a
"generate". Won't that store the data on just those two nodes, since they are
all I've listed as crawling machines? I'm trying to crawl on two machines but
store the data across all five.
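To make the two-configuration-directory suggestion concrete, here is a minimal sketch. The directory names and hostnames are hypothetical; the only real pieces are the HADOOP_CONF_DIR environment variable and the slaves file, which lists the machines a cluster's daemons run on.

```shell
# Sketch: one conf directory per role (paths and hostnames are hypothetical).
mkdir -p conf.crawl conf.search

# Slaves file for the crawl side: only the two fetch machines.
cat > conf.crawl/slaves <<'EOF'
crawler1.example.com
crawler2.example.com
EOF

# Slaves file for the search side: the search machines.
cat > conf.search/slaves <<'EOF'
search1.example.com
search2.example.com
search3.example.com
EOF

# Point Hadoop at the crawl configuration before launching crawl jobs:
export HADOOP_CONF_DIR="$PWD/conf.crawl"
# ...then run the generate/fetch commands with that configuration in effect
# (commands elided here).
```

Note this only controls which machines the daemons started from each configuration run on; it does not by itself split "where jobs run" from "where data is stored" within a single cluster.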
