Why do you launch HDFS if you don't want to use it? What are the logs saying? All fetched URLs are usually logged there, yet you say nothing is displayed?
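
In case you don't know where to look: in a distributed (non-local) run the fetch output ends up in the task logs, not on your console. A rough sketch of where to check, assuming a stock 0.20-style Hadoop install with the default log directory and web UI port (adjust if you changed them):

  # TaskTracker daemon log (default location under the Hadoop install)
  tail -f $HADOOP_HOME/logs/hadoop-*-tasktracker-*.log

  # per-task stdout/stderr/syslog, where the "fetching <url>" lines land
  grep -r fetching $HADOOP_HOME/logs/userlogs/

  # or browse the JobTracker web UI and drill into each task's logs:
  #   http://localhost:50030/

If no "fetching" lines show up at all, the fetcher never received any URLs past your filters.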
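
As for the empty nutch-site.xml: I can't tell from here whether that is your actual problem, but Nutch will not fetch anything unless http.agent.name is set, so that property at least belongs in there. A minimal sketch (the value "testCrawler" is just a placeholder, pick your own):

  <?xml version="1.0"?>
  <configuration>

    <property>
      <name>http.agent.name</name>
      <!-- placeholder value; use a name identifying your crawler -->
      <value>testCrawler</value>
      <description>
        Sent as the User-Agent header when fetching. The fetcher
        refuses to run when this is empty.
      </description>
    </property>

  </configuration>

Everything you don't override here is inherited from nutch-default.xml.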
2010/1/20, Santiago Pérez <elara...@gmail.com>:
>
> Hi,
>
> I am configuring Nutch just for crawling the web on several machines
> (currently I want to test with only one). Building Nutch with ant was
> successful.
>
>   bin/hadoop namenode -format
>   bin/start-all.sh
>
> They show correct logs.
>
>   bin/hadoop dfs -put urls urls
>   bin/hadoop dfs -ls
>
> They show the urls directory correctly.
>
> But when I launch the crawl, the fetcher starts, does not show any
> parsing messages, and stops at the second depth. The crawl-urlfilter
> and nutch-default files are configured correctly, because they work
> great using the local filesystem (instead of HDFS). I guess the
> problem is that nutch-site is empty.
>
> What should its content be?
>
> core-site.xml:
>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <!-- Put site-specific property overrides in this file. -->
>
> <configuration>
>
>   <property>
>     <name>fs.default.name</name>
>     <value>hdfs://localhost:9000/</value>
>     <description>
>       The name of the default file system. Either the literal string
>       "local" or a host:port for NDFS.
>     </description>
>   </property>
>
> </configuration>
>
> ---------------------------------------
>
> hdfs-site.xml:
>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <!-- Put site-specific property overrides in this file. -->
>
> <configuration>
>
>   <property>
>     <name>dfs.name.dir</name>
>     <value>/root/filesystem/name</value>
>   </property>
>
>   <property>
>     <name>dfs.data.dir</name>
>     <value>/root/filesystem/data</value>
>   </property>
>
>   <property>
>     <name>dfs.replication</name>
>     <value>1</value>
>   </property>
>
> </configuration>
>
> ---------------------------------------
>
> mapred-site.xml:
>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <!-- Put site-specific property overrides in this file. -->
>
> <configuration>
>
>   <property>
>     <name>mapred.job.tracker</name>
>     <value>hdfs://localhost:9001/</value>
>     <description>
>       The host and port that the MapReduce job tracker runs at. If
>       "local", then jobs are run in-process as a single map and
>       reduce task.
>     </description>
>   </property>
>
>   <property>
>     <name>mapred.map.tasks</name>
>     <value>2</value>
>     <description>
>       Define mapred.map.tasks to be the number of slave hosts.
>     </description>
>   </property>
>
>   <property>
>     <name>mapred.reduce.tasks</name>
>     <value>2</value>
>     <description>
>       Define mapred.reduce.tasks to be the number of slave hosts.
>     </description>
>   </property>
>
>   <property>
>     <name>mapred.system.dir</name>
>     <value>/root/filesystem/mapreduce/system</value>
>   </property>
>
>   <property>
>     <name>mapred.local.dir</name>
>     <value>/root/filesystem/mapreduce/local</value>
>   </property>
>
> </configuration>
>
> --
> View this message in context:
> http://old.nabble.com/Configurin-nutch-site.xml-tp27245750p27245750.html
> Sent from the Nutch - User mailing list archive at Nabble.com.

--
-MilleBii-