In the slaves file, 'localhost' should only be used alone, not with other hosts, since 'localhost' is not a name that other hosts can use to refer to a host. It's equivalent to 127.0.0.1, the loopback address. So, if you're specifying more than one host, it's best to use real hostnames or IP addresses.
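
A minimal sketch of that rule as a check you could run over a slaves file before starting the cluster. The helper name `find_loopback_conflicts` is hypothetical and not part of Nutch or Hadoop, and the sketch assumes the file holds one host per line, as in the example quoted below. It also assumes that 'thanatos' is a name the other hosts can resolve.

```python
# Hypothetical helper for illustration only; not part of Nutch or Hadoop.
# Assumes the slaves file lists one host per line.
LOOPBACK_NAMES = {"localhost", "127.0.0.1"}


def find_loopback_conflicts(slaves_text):
    """Return loopback entries that appear alongside other hosts."""
    # Drop blank lines and surrounding whitespace.
    hosts = [line.strip() for line in slaves_text.splitlines() if line.strip()]
    loopback = [h for h in hosts if h.lower() in LOOPBACK_NAMES]
    others = [h for h in hosts if h.lower() not in LOOPBACK_NAMES]
    # 'localhost' alone is fine; it is only flagged when mixed with other hosts.
    return loopback if others else []


print(find_loopback_conflicts("localhost\nmorpheus\n"))  # flags 'localhost'
print(find_loopback_conflicts("localhost\n"))            # nothing flagged
print(find_loopback_conflicts("thanatos\nmorpheus\n"))   # nothing flagged
```

An empty result means the file either uses 'localhost' alone or uses only real hostnames or IP addresses.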

Doug

Ilya Vishnevsky wrote:
Hello! I'm trying to run Nutch on two computers. Here is the content of my
"slaves" file:

localhost
morpheus

When I type bin/start-all.sh, I get the following output:

starting namenode, logging to
/nutch/search/logs/hadoop-nutch-namenode-Thanatos.out
localhost
: ssh: localhost
: no address associated with name
morpheus: starting datanode, logging to
/nutch/search/logs/hadoop-nutch-datanode-morpheus.out
starting jobtracker, logging to
/nutch/search/logs/hadoop-nutch-jobtracker-Thanatos.out
localhost
: ssh: localhost
: no address associated with name
morpheus: starting tasktracker, logging to
/nutch/search/logs/hadoop-nutch-tasktracker-morpheus.out


It seems that, for some reason, Nutch can't run the datanode and
tasktracker on localhost (thanatos).

At the same time, if I have only localhost in my "slaves" file, I get the
following output:

starting namenode, logging to
/nutch/search/logs/hadoop-nutch-namenode-Thanatos.out
localhost: starting datanode, logging to
/nutch/search/logs/hadoop-nutch-datanode-Thanatos.out
starting jobtracker, logging to
/nutch/search/logs/hadoop-nutch-jobtracker-Thanatos.out
localhost: starting tasktracker, logging to
/nutch/search/logs/hadoop-nutch-tasktracker-Thanatos.out

I.e., all nodes and trackers start properly.
Why could this be?

Here is the content of my hadoop-site.xml:


<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>
  <name>fs.default.name</name>
  <value>localhost:50010</value>
  <description>
The name of the default file system. Either the literal string "local" or a host:port for NDFS.
  </description>
</property>

<property>
  <name>mapred.job.tracker</name>
  <value>local</value>
  <description>
The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.
  </description>
</property>

<property>
  <name>mapred.map.tasks</name>
  <value>2</value>
  <description>
    define mapred.map tasks to be number of slave hosts
  </description>
</property>

<property>
  <name>mapred.reduce.tasks</name>
  <value>2</value>
  <description>
    define mapred.reduce tasks to be number of slave hosts
  </description>
</property>

<property>
  <name>dfs.name.dir</name>
  <value>c:/cygwin/nutch/filesystem/name</value>
</property>

<property>
  <name>dfs.data.dir</name>
  <value>c:/cygwin/nutch/filesystem/data</value>
</property>

<property>
  <name>mapred.system.dir</name>
  <value>/nutch/filesystem/mapreduce/system</value>
</property>

<property>
  <name>mapred.local.dir</name>
  <value>/nutch/filesystem/mapreduce/local</value>
</property>

<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>

</configuration>
