In the slaves file, 'localhost' should only be used alone, not alongside
other hosts, since 'localhost' is not a name that other hosts can use to
refer to a machine: it's the loopback address, equivalent to 127.0.0.1.
So, if you're specifying more than one host, it's best to use real
hostnames or IP addresses.
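For example, a slaves file for the two machines in this thread might look
like the following, assuming both names resolve from the master (use
whatever hostnames or IP addresses your machines actually know each other
by):

    thanatos
    morpheus

Each line names one slave that the start scripts ssh to, so every entry
must be reachable by that name from the machine running start-all.sh.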
Doug
Ilya Vishnevsky wrote:
Hello! I'm trying to run Nutch on two computers. Here is the content of my
"slaves" file:
localhost
morpheus
When I type bin/start-all.sh, I get the following output:
starting namenode, logging to /nutch/search/logs/hadoop-nutch-namenode-Thanatos.out
localhost: ssh: localhost: no address associated with name
morpheus: starting datanode, logging to /nutch/search/logs/hadoop-nutch-datanode-morpheus.out
starting jobtracker, logging to /nutch/search/logs/hadoop-nutch-jobtracker-Thanatos.out
localhost: ssh: localhost: no address associated with name
morpheus: starting tasktracker, logging to /nutch/search/logs/hadoop-nutch-tasktracker-morpheus.out
It seems that for some reason Nutch can't run the datanode and
tasktracker on localhost (Thanatos).
At the same time, if I have only localhost in my "slaves" file, I get the
following output:
starting namenode, logging to /nutch/search/logs/hadoop-nutch-namenode-Thanatos.out
localhost: starting datanode, logging to /nutch/search/logs/hadoop-nutch-datanode-Thanatos.out
starting jobtracker, logging to /nutch/search/logs/hadoop-nutch-jobtracker-Thanatos.out
localhost: starting tasktracker, logging to /nutch/search/logs/hadoop-nutch-tasktracker-Thanatos.out
That is, all nodes and trackers start properly.
Why might that be?
Here is the content of my hadoop-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

  <property>
    <name>fs.default.name</name>
    <value>localhost:50010</value>
    <description>
      The name of the default file system. Either the literal string
      "local" or a host:port for NDFS.
    </description>
  </property>
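  <!-- Note: "localhost" here is the loopback address, so a datanode on
       morpheus would look for the namenode on morpheus itself rather
       than on Thanatos. A hedged sketch with a resolvable master name
       (the port is kept from the value above; "thanatos" is assumed to
       resolve from the slaves):
  <property>
    <name>fs.default.name</name>
    <value>thanatos:50010</value>
  </property>
  -->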
  <property>
    <name>mapred.job.tracker</name>
    <value>local</value>
    <description>
      The host and port that the MapReduce job tracker runs at. If
      "local", then jobs are run in-process as a single map and
      reduce task.
    </description>
  </property>
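  <!-- Note: with "local" here, jobs run in-process on the submitting
       machine only, per the description above, so morpheus would never
       receive map or reduce tasks. A hedged sketch of a distributed
       setting (the port 50020 is an assumption, not a value from this
       thread):
  <property>
    <name>mapred.job.tracker</name>
    <value>thanatos:50020</value>
  </property>
  -->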
  <property>
    <name>mapred.map.tasks</name>
    <value>2</value>
    <description>
      define mapred.map tasks to be number of slave hosts
    </description>
  </property>

  <property>
    <name>mapred.reduce.tasks</name>
    <value>2</value>
    <description>
      define mapred.reduce tasks to be number of slave hosts
    </description>
  </property>

  <property>
    <name>dfs.name.dir</name>
    <value>c:/cygwin/nutch/filesystem/name</value>
  </property>

  <property>
    <name>dfs.data.dir</name>
    <value>c:/cygwin/nutch/filesystem/data</value>
  </property>

  <property>
    <name>mapred.system.dir</name>
    <value>/nutch/filesystem/mapreduce/system</value>
  </property>

  <property>
    <name>mapred.local.dir</name>
    <value>/nutch/filesystem/mapreduce/local</value>
  </property>

  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>

</configuration>