Why do you launch HDFS if you don't want to use it? What are the logs saying? All fetched URLs are usually logged there, yet you say nothing is displayed?
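
In case you don't know where to look: in a distributed (non-local) run the fetch output ends up in the task logs, not on your console. A rough sketch of where to check, assuming a stock 0.20-style Hadoop install with the default log directory and web UI port (adjust if you changed them):

  # TaskTracker daemon log (default location under the Hadoop install)
  tail -f $HADOOP_HOME/logs/hadoop-*-tasktracker-*.log

  # per-task stdout/stderr/syslog, where the "fetching <url>" lines land
  grep -r fetching $HADOOP_HOME/logs/userlogs/

  # or browse the JobTracker web UI and drill into each task's logs:
  #   http://localhost:50030/

If no "fetching" lines show up at all, the fetcher never received any URLs past your filters.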
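
As for the empty nutch-site.xml: I can't tell from here whether that is your actual problem, but Nutch will not fetch anything unless http.agent.name is set, so that property at least belongs in there. A minimal sketch (the value "testCrawler" is just a placeholder, pick your own):

  <?xml version="1.0"?>
  <configuration>

    <property>
      <name>http.agent.name</name>
      <!-- placeholder value; use a name identifying your crawler -->
      <value>testCrawler</value>
      <description>
        Sent as the User-Agent header when fetching. The fetcher
        refuses to run when this is empty.
      </description>
    </property>

  </configuration>

Everything you don't override here is inherited from nutch-default.xml.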
2010/1/20, Santiago Pérez <elara...@gmail.com>:
>
> Hi,
>
> I am configuring Nutch just for crawling the web on several machines
> (currently I want to test with only one). Building Nutch with ant was
> successful.
>
>   bin/hadoop namenode -format
>   bin/start-all.sh
>
> They show correct logs.
>
>   bin/hadoop dfs -put urls urls
>   bin/hadoop dfs -ls
>
> They show the urls directory correctly.
>
> But when I launch the crawl, the fetcher starts, does not show any
> parsing messages, and stops at the second depth. The crawl-urlfilter
> and nutch-default files are configured correctly, because they work
> great using the local filesystem (instead of HDFS). I guess the
> problem is that nutch-site is empty.
>
> What should its content be?
>
> core-site.xml:
>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <!-- Put site-specific property overrides in this file. -->
>
> <configuration>
>
>   <property>
>     <name>fs.default.name</name>
>     <value>hdfs://localhost:9000/</value>
>     <description>
>       The name of the default file system. Either the literal string
>       "local" or a host:port for NDFS.
>     </description>
>   </property>
>
> </configuration>
>
> ---------------------------------------
>
> hdfs-site.xml:
>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <!-- Put site-specific property overrides in this file. -->
>
> <configuration>
>
>   <property>
>     <name>dfs.name.dir</name>
>     <value>/root/filesystem/name</value>
>   </property>
>
>   <property>
>     <name>dfs.data.dir</name>
>     <value>/root/filesystem/data</value>
>   </property>
>
>   <property>
>     <name>dfs.replication</name>
>     <value>1</value>
>   </property>
>
> </configuration>
>
> ---------------------------------------
>
> mapred-site.xml:
>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <!-- Put site-specific property overrides in this file. -->
>
> <configuration>
>
>   <property>
>     <name>mapred.job.tracker</name>
>     <value>hdfs://localhost:9001/</value>
>     <description>
>       The host and port that the MapReduce job tracker runs at. If
>       "local", then jobs are run in-process as a single map and
>       reduce task.
>     </description>
>   </property>
>
>   <property>
>     <name>mapred.map.tasks</name>
>     <value>2</value>
>     <description>
>       Define mapred.map.tasks to be the number of slave hosts.
>     </description>
>   </property>
>
>   <property>
>     <name>mapred.reduce.tasks</name>
>     <value>2</value>
>     <description>
>       Define mapred.reduce.tasks to be the number of slave hosts.
>     </description>
>   </property>
>
>   <property>
>     <name>mapred.system.dir</name>
>     <value>/root/filesystem/mapreduce/system</value>
>   </property>
>
>   <property>
>     <name>mapred.local.dir</name>
>     <value>/root/filesystem/mapreduce/local</value>
>   </property>
>
> </configuration>
>
> --
> View this message in context:
> http://old.nabble.com/Configurin-nutch-site.xml-tp27245750p27245750.html
> Sent from the Nutch - User mailing list archive at Nabble.com.

--
-MilleBii-