Thanks Dennis, I was stuck on the same problem.
On 10/1/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
Change your config as below. localhost only works on the machine the
namenode itself is running on; the other machines must point to the
server where the namenode and jobtracker are running.
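For example, both addresses can be switched from localhost to the master host with a single sed pass over hadoop-site.xml. This is a sketch on a scratch copy: master.example.com and the /tmp path are placeholders for your actual namenode hostname and $NUTCH_HOME/conf directory.

```shell
# Sketch: rewrite localhost addresses to the master host in hadoop-site.xml.
# master.example.com and /tmp/hadoop-site.xml are placeholders -- substitute
# your real namenode hostname and $NUTCH_HOME/conf/hadoop-site.xml.
cat > /tmp/hadoop-site.xml <<'EOF'
<configuration>
<property>
<name>fs.default.name</name>
<value>localhost:9000</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
EOF

# Point both the namenode and the jobtracker entries at the master.
sed -i 's/localhost:/master.example.com:/' /tmp/hadoop-site.xml

# The updated file must then be copied to every slave, e.g.:
#   scp /tmp/hadoop-site.xml node1:$NUTCH_HOME/conf/hadoop-site.xml
grep '<value>' /tmp/hadoop-site.xml
```

The same file has to end up on MASTER, NODE1, and NODE2 -- the DFS commands on the slaves only work once their fs.default.name points at the master.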
<configuration>
<property>
<name>fs.default.name</name>
<value>machinename.domainname.com:9000</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>machinename.domainname.com:9001</value>
</property>
</configuration>

Dennis
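Once every node has the updated file, a quick sanity check before restarting the cluster (bin/stop-all.sh, then bin/start-all.sh) is to make sure no value still says localhost. A sketch, with /tmp/hadoop-site.xml standing in for the real $NUTCH_HOME/conf/hadoop-site.xml:

```shell
# Sanity-check sketch: complain if any <value> still points at localhost.
# /tmp/hadoop-site.xml is a stand-in for $NUTCH_HOME/conf/hadoop-site.xml.
CONF=/tmp/hadoop-site.xml
printf '%s\n' \
  '<configuration>' \
  '<property>' \
  '<name>fs.default.name</name>' \
  '<value>master.example.com:9000</value>' \
  '</property>' \
  '</configuration>' > "$CONF"

if grep -q '<value>localhost' "$CONF"; then
  echo "BAD: config still points at localhost"
else
  echo "OK: no localhost entries"
fi
```

After the restart, running bin/hadoop dfs -ls on NODE1 or NODE2 should show the same listing as on MASTER.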
>
> Hi all,
> I'm using Nutch 0.8.1 and have set up distributed crawling across three
> machines: MASTER, NODE1, and NODE2.
>
> On MASTER I can list the file system directories with the command
> bin/hadoop dfs -ls
> but on NODE1 and NODE2 I cannot see the file system directories.
> Can anyone tell me how to view the DFS contents from the NODE machines?
>
> My config:
>
> slaves
> LOCALHOST
> NODE1
> NODE2
>
> hadoop-site.xml contains:
>
> <configuration>
> <property>
> <name>fs.default.name</name>
> <value>localhost:9000</value>
> </property>
>
> <property>
> <name>dfs.name.dir</name>
> <value>/tmp/hadoop/dfs/name</value>
> </property>
>
> <property>
> <name>dfs.data.dir</name>
> <value>/tmp/hadoop/dfs/data</value>
> </property>
>
> <property>
> <name>dfs.replication</name>
> <value>2</value>
> </property>
>
> <property>
> <name>dfs.datanode.port</name>
> <value>50010</value>
> <description>The port number that the dfs datanode server uses as a
> starting point to look for a free port to listen on.
> </description>
> </property>
>
> <property>
> <name>dfs.info.port</name>
> <value>50070</value>
> <description>The base port number for the dfs namenode web
> ui.</description>
> </property>
>
> <property>
> <name>dfs.datanode.dns.nameserver</name>
> <value>192.168.0.1</value>
> <description>The host name or IP address of the name server (DNS)
> which a DataNode should use to determine the host name used by the
> NameNode for communication and display purposes.
> </description>
> </property>
>
> <!-- map/reduce properties -->
>
> <property>
> <name>mapred.job.tracker</name>
> <value>localhost:9001</value>
> </property>
>
> <property>
> <name>mapred.job.tracker.info.port</name>
> <value>50030</value>
> <description>The port that the MapReduce job tracker info webserver
> runs at.
> </description>
> </property>
>
> <property>
> <name>mapred.task.tracker.output.port</name>
> <value>50040</value>
> <description>The port number that the MapReduce task tracker output
> server uses as a starting
> point to look for a free port to listen on.
> </description>
> </property>
>
> <property>
> <name>mapred.task.tracker.report.port</name>
> <value>50050</value>
> <description>The port number that the MapReduce task tracker report
> server uses as a starting
> point to look for a free port to listen on.
> </description>
> </property>
>
> <property>
> <name>tasktracker.http.port</name>
> <value>50060</value>
> <description>The default port for task trackers to use as their http
> server.
> </description>
> </property>
>
> <property>
> <name>mapred.local.dir</name>
> <value>/tmp/hadoop/mapred/local</value>
> </property>
>
> <property>
> <name>mapred.temp.dir</name>
> <value>/tmp/hadoop/mapred/temp</value>
> <description>A shared directory for temporary files.
> </description>
> </property>
>
> <property>
> <name>mapred.system.dir</name>
> <value>/tmp/hadoop/mapred/system</value>
> <description>The shared directory where MapReduce stores control files.
> </description>
> </property>
>
> <property>
> <name>mapred.tasktracker.dns.nameserver</name>
> <value>192.168.0.1</value>
> <description>The host name or IP address of the name server (DNS)
> which a TaskTracker should use to determine the host name used by
> the JobTracker for communication and display purposes.
> </description>
> </property>
>
> <property>
> <name>tasktracker.http.threads</name>
> <value>10</value>
> <description>The number of worker threads for the http server. This is
> used for map output fetching.
> </description>
> </property>
>
> <property>
> <name>mapred.map.tasks</name>
> <value>10</value>
> <description>The default number of map tasks per job. Typically set
> to a prime several times greater than number of available hosts.
> Ignored when mapred.job.tracker is "local".
> </description>
> </property>
>
> <property>
> <name>mapred.reduce.tasks</name>
> <value>2</value>
> <description>The default number of reduce tasks per job. Typically set
> to a prime close to the number of available hosts. Ignored when
> mapred.job.tracker is "local".
> </description>
> </property>
>
> <property>
> <name>mapred.reduce.parallel.copies</name>
> <value>5</value>
> <description>The default number of parallel transfers run by reduce
> during the copy(shuffle) phase.
> </description>
> </property>
>
> </configuration>
>
> Also, my log file hadoop-root-namenode-mohanlal.qburst.local.log
> contains WARN entries like:
>
> 2006-09-30 14:08:26,919 WARN fs.FSNamesystem - Zero targets found, forbidden1.size=1 forbidden2.size()=0
>
> Please help.
>
> Regards
> Mohan Lal
> --
> View this message in context:
>
http://www.nabble.com/Problem-in-Distributed-file-system-tf2360944.html#a6577395
> Sent from the Nutch - User mailing list archive at Nabble.com.
>
>
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general