Hi all,

I'm using Nutch 0.8.1 and have set up distributed crawling with three
machines: MASTER, NODE1, and NODE2.

On MASTER I can see the file system directories using the command
bin/hadoop dfs -ls
but on NODE1 and NODE2 the same command does not show the file system
directories. Can anyone tell me how to view the DFS contents from the
NODE machines?
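
To rule out a client-side problem, one thing I can try from NODE1 is to
point the DFS shell at the master's namenode explicitly. This assumes the
generic -fs option is available in Hadoop 0.8.1 and that the namenode on
MASTER listens on port 9000:

bin/hadoop dfs -fs MASTER:9000 -ls

If this works from the nodes while the plain command does not, then the
nodes must be connecting to the wrong namenode by default.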

My configuration:

The slaves file contains:
LOCALHOST
NODE1
NODE2

hadoop-site.xml contains:

<configuration>  
 <property>  
    <name>fs.default.name</name>  
    <value>localhost:9000</value>  
 </property>  

 <property>  
    <name>dfs.name.dir</name>  
    <value>/tmp/hadoop/dfs/name</value>  
 </property> 

 <property>  
    <name>dfs.data.dir</name>  
    <value>/tmp/hadoop/dfs/data</value>  
 </property> 

 <property>  
    <name>dfs.replication</name>  
    <value>2</value>  
 </property> 

 <property>
    <name>dfs.datanode.port</name>
    <value>50010</value>
    <description>The port number that the dfs datanode server uses as a
    starting point to look for a free port to listen on.
    </description>
 </property>

 <property>
   <name>dfs.info.port</name>
   <value>50070</value>
   <description>The base port number for the dfs namenode web ui.
   </description>
 </property>

 <property>
   <name>dfs.datanode.dns.nameserver</name>
   <value>192.168.0.1</value>
   <description>The host name or IP address of the name server (DNS)
      which a DataNode should use to determine the host name used by the
      NameNode for communication and display purposes.
    </description>
 </property>

<!-- map/reduce properties -->

 <property>  
    <name>mapred.job.tracker</name>  
    <value>localhost:9001</value>  
 </property>  

 <property>
   <name>mapred.job.tracker.info.port</name>
   <value>50030</value>
   <description>The port that the MapReduce job tracker info webserver
   runs at.
   </description>
 </property>

 <property>
   <name>mapred.task.tracker.output.port</name>
   <value>50040</value>
   <description>The port number that the MapReduce task tracker output
   server uses as a starting point to look for a free port to listen on.
   </description>
 </property>

 <property>
   <name>mapred.task.tracker.report.port</name>
   <value>50050</value>
   <description>The port number that the MapReduce task tracker report
   server uses as a starting point to look for a free port to listen on.
   </description>
 </property>

 <property>
   <name>tasktracker.http.port</name>
   <value>50060</value>
   <description>The default port for task trackers to use as their http
   server.
   </description>
 </property>

 <property>  
    <name>mapred.local.dir</name>  
    <value>/tmp/hadoop/mapred/local</value>  
 </property>

 <property>
   <name>mapred.temp.dir</name>
   <value>/tmp/hadoop/mapred/temp</value>
   <description>A shared directory for temporary files.
   </description>
 </property>

 <property>
   <name>mapred.system.dir</name>
   <value>/tmp/hadoop/mapred/system</value>
   <description>The shared directory where MapReduce stores control files.
   </description>
 </property>

 <property>
    <name>mapred.tasktracker.dns.nameserver</name>
    <value>192.168.0.1</value>
    <description>The host name or IP address of the name server (DNS)
        which a TaskTracker should use to determine the host name used by
        the JobTracker for communication and display purposes.
    </description>
 </property>

 <property>
   <name>tasktracker.http.threads</name>
   <value>10</value>
   <description>The number of worker threads for the http server. This
   is used for map output fetching.
   </description>
 </property>

 <property>
   <name>mapred.map.tasks</name>
   <value>10</value>
   <description>The default number of map tasks per job.  Typically set
   to a prime several times greater than number of available hosts.
   Ignored when mapred.job.tracker is "local".  
   </description>
 </property>

 <property>
   <name>mapred.reduce.tasks</name>
   <value>2</value>
   <description>The default number of reduce tasks per job.  Typically set
   to a prime close to the number of available hosts.  Ignored when
   mapred.job.tracker is "local".
   </description>
 </property>

 <property>
   <name>mapred.reduce.parallel.copies</name>
   <value>5</value>
   <description>The default number of parallel transfers run by reduce
   during the copy(shuffle) phase.
   </description>
 </property>

</configuration>
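
One thing I am not sure about: fs.default.name and mapred.job.tracker
both point at localhost, so when NODE1 and NODE2 read this file they may
be looking for a namenode and jobtracker on themselves rather than on
MASTER. Below is a sketch of what I think the cluster-wide values should
look like; the hostname MASTER is my assumption and would need to resolve
from every node:

 <property>
    <name>fs.default.name</name>
    <!-- assumption: MASTER is a hostname every node can resolve -->
    <value>MASTER:9000</value>
 </property>

 <property>
    <name>mapred.job.tracker</name>
    <!-- same assumption as above -->
    <value>MASTER:9001</value>
 </property>

Is that the right way to configure it?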

Also, my log files contain warnings like the following one from
hadoop-root-namenode-mohanlal.qburst.local.log:

2006-09-30 14:08:26,919 WARN  fs.FSNamesystem - Zero targets found,
forbidden1.size=1 forbidden2.size()=0
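
That warning seems to mean the namenode could not find any datanode to
place a block replica on, which would fit with the datanodes on NODE1 and
NODE2 never registering with the namenode. Assuming the report command
exists in this version, it should show how many datanodes the namenode
actually knows about:

bin/hadoop dfs -report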

Please help me.

Regards
Mohan Lal