Hi,

I was able to run a two-node cluster and crawl a 60 MB index inside two virtual machines. However, when I switched to real machines, where I don't have root access, I wasn't able to run the Hadoop cluster.

I am using 4 nodes for the new cluster. The nodes do not have local disks, so I am using a shared disk where I recreated the directory layout exactly as I needed it, except that local, filesystem, and home are symbolic links.

Here is what the nutch directory looks like:

drwxr-xr-x  2 volos parsa 4096 Jan 28 17:58 dis_search
lrwxrwxrwx  1 volos parsa   21 Jan 28 18:01 filesystem -> /tmp/volos/filesystem
lrwxrwxrwx  1 volos parsa   15 Jan 28 18:05 home -> /tmp/volos/home
lrwxrwxrwx  1 volos parsa   16 Jan 28 18:04 local -> /tmp/volos/local
drwxr-xr-x  5 volos parsa 4096 Jan 28 17:58 parsanode042
drwxr-xr-x  5 volos parsa 4096 Jan 28 17:58 parsanode043
drwxr-xr-x  5 volos parsa 4096 Jan 28 17:58 parsanode044
drwxr-xr-x  5 volos parsa 4096 Jan 28 17:58 parsanode045
drwxr-xr-x 11 volos parsa 4096 Jan 28 19:48 search
lrwxrwxrwx  1 volos parsa   34 Jan 25 18:12 tomcat -> ../apache-tomcat-7.0.6-src/output/


The parsanode042-045 directories contain the actual filesystem, local, and home folders for each node, respectively. On each node, /tmp/volos/XXX is a link to the corresponding parsanode0YY/XXX.

E.g., on parsanode042:

lrwxrwxrwx 1 volos parsa 57 Jan 28 18:02 filesystem -> /home/parsacom/users/volos/nutch/parsanode042/filesystem/
lrwxrwxrwx 1 volos parsa 51 Jan 28 18:05 home -> /home/parsacom/users/volos/nutch/parsanode042/home/
lrwxrwxrwx 1 volos parsa 52 Jan 28 18:04 local -> /home/parsacom/users/volos/nutch/parsanode042/local/

Note that search is the same directory for every node. dis_search is the distribution that I am going to use for each distributed search server.
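To double-check that each symlink chain resolves to the right per-node directory, here is a minimal sketch I can run on each node (the paths are from my layout above; the helper function name is my own):

```python
import os

def resolves_to(link, expected_prefix):
    """Return True if `link` ultimately resolves to a path under expected_prefix."""
    real = os.path.realpath(link)        # follow the whole symlink chain
    prefix = os.path.realpath(expected_prefix)
    return real == prefix or real.startswith(prefix + os.sep)

# Hypothetical check on parsanode042: each /tmp/volos/XXX link should land
# inside that node's own directory on the shared disk.
# for name in ("filesystem", "home", "local"):
#     print(name, resolves_to("/tmp/volos/" + name,
#                             "/home/parsacom/users/volos/nutch/parsanode042"))
```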

When I run bin/start-all.sh from parsanode042 (the master node), everything appears to start properly, since the .out files show nothing unusual:

starting namenode, logging to /home/parsacom/users/volos/nutch/search/logs/hadoop-volos-namenode-parsanode042.out
192.168.10.44: starting datanode, logging to /home/parsacom/users/volos/nutch/search/logs/hadoop-volos-datanode-parsanode044.out
192.168.10.43: starting datanode, logging to /home/parsacom/users/volos/nutch/search/logs/hadoop-volos-datanode-parsanode043.out
192.168.10.42: starting datanode, logging to /home/parsacom/users/volos/nutch/search/logs/hadoop-volos-datanode-parsanode042.out
192.168.10.45: starting datanode, logging to /home/parsacom/users/volos/nutch/search/logs/hadoop-volos-datanode-parsanode045.out
192.168.10.42: starting secondarynamenode, logging to /home/parsacom/users/volos/nutch/search/logs/hadoop-volos-secondarynamenode-parsanode042.out
starting jobtracker, logging to /home/parsacom/users/volos/nutch/search/logs/hadoop-volos-jobtracker-parsanode042.out
192.168.10.42: starting tasktracker, logging to /home/parsacom/users/volos/nutch/search/logs/hadoop-volos-tasktracker-parsanode042.out
192.168.10.43: starting tasktracker, logging to /home/parsacom/users/volos/nutch/search/logs/hadoop-volos-tasktracker-parsanode043.out
192.168.10.45: starting tasktracker, logging to /home/parsacom/users/volos/nutch/search/logs/hadoop-volos-tasktracker-parsanode045.out
192.168.10.44: starting tasktracker, logging to /home/parsacom/users/volos/nutch/search/logs/hadoop-volos-tasktracker-parsanode044.out

However, I cannot access parsanode042:50070, and when I execute bin/stop-all.sh I get the following output:

stopping jobtracker
192.168.10.42: stopping tasktracker
192.168.10.43: stopping tasktracker
192.168.10.45: stopping tasktracker
192.168.10.44: stopping tasktracker
stopping namenode
192.168.10.44: stopping datanode
192.168.10.43: stopping datanode
192.168.10.45: stopping datanode
192.168.10.42: stopping datanode
192.168.10.42: no secondarynamenode to stop

The secondarynamenode log contains:

Exception in thread "main" java.io.IOException: Call to /192.168.10.42:9000 failed on local exception: java.io.IOException: Connection reset by peer
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
        at org.apache.hadoop.ipc.Client.call(Client.java:743)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy4.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:346)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:383)
        at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:314)
        at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:291)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:134)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:115)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:469)
Caused by: java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcher.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:202)
        at sun.nio.ch.IOUtil.read(IOUtil.java:175)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
        at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
        at java.io.FilterInputStream.read(FilterInputStream.java:116)
        at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:276)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
        at java.io.DataInputStream.readInt(DataInputStream.java:370)
        at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)

What could be the reason that the secondarynamenode does not run?

Why does the namenode not work properly? (http://parsanode042:50070 does not respond, even though the port is open.)
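For reference, here is the small check I can run to see whether the NameNode ports actually accept TCP connections (the host name and the ports 9000 and 50070 are taken from my setup above; the helper function name is my own):

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:          # refused, timed out, unreachable, reset, ...
        return False

# Hypothetical check against my master node:
# for port in (9000, 50070):     # NameNode RPC port / web UI port
#     print(port, port_open("parsanode042", port))
```

Note that "port is open" here only means a connection is accepted; the daemon behind it may still have died right after binding, which would match the "Connection reset by peer" in the log.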

Thanks,
-Stavros
