Hi, I am configuring Nutch to crawl the web across several machines (for now I want to test with only one). Building Nutch with ant succeeded. Then I ran:
    bin/hadoop namenode -format
    bin/start-all.sh

Both show correct logs.

    bin/hadoop dfs -put urls urls
    bin/hadoop dfs -ls

This lists the urls directory correctly.

But when I launch the crawl, the fetcher starts but does not show any parsing messages, and the crawl stops at the second depth. crawl-urlfilter and nutch-default are configured correctly, because they work fine with the local filesystem (instead of HDFS). I guess the problem is that nutch-site.xml is empty. What should its content be?

core-site.xml:

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000/</value>
    <description>
      The name of the default file system. Either the literal string
      "local" or a host:port for NDFS.
    </description>
  </property>
</configuration>

---------------------------------------

hdfs-site.xml:

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/root/filesystem/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/root/filesystem/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

---------------------------------------

mapred-site.xml:

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hdfs://localhost:9001/</value>
    <description>
      The host and port that the MapReduce job tracker runs at. If
      "local", then jobs are run in-process as a single map and
      reduce task.
    </description>
  </property>
  <property>
    <name>mapred.map.tasks</name>
    <value>2</value>
    <description>
      define mapred.map tasks to be number of slave hosts
    </description>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>2</value>
    <description>
      define mapred.reduce tasks to be number of slave hosts
    </description>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/root/filesystem/mapreduce/system</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>/root/filesystem/mapreduce/local</value>
  </property>
</configuration>
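For reference, here is a minimal sketch of what a non-empty nutch-site.xml could look like. It is an illustration under assumptions, not a confirmed fix for the problem above. It assumes the only override needed is http.agent.name, a property defined in nutch-default.xml that the fetcher expects to be non-empty. The value MyTestCrawler is a hypothetical placeholder.

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>http.agent.name</name>
    <!-- Placeholder value (assumption): replace with your crawler's name. -->
    <value>MyTestCrawler</value>
    <description>
      Name the crawler reports in the HTTP User-Agent header.
    </description>
  </property>
</configuration>
```

Values set in nutch-site.xml take precedence over those in nutch-default.xml, so site-specific settings can live here rather than in edits to the defaults.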