I launch HDFS because I want to get it working on one computer first, and once it works, launch it on several machines as a distributed version.
Which logs do you need to check? (See the sketches after this message.)

MilleBii wrote:
>
> Why do you launch hdfs if you don't want to use it ?
>
> What are the logs saying? All fetched urls are usually logged. But
> nothing is displayed.
>
> 2010/1/20, Santiago Pérez <elara...@gmail.com>:
>>
>> Hi,
>>
>> I am configuring Nutch just to crawl websites on several machines
>> (currently I want to test with only one).
>> Building Nutch with ant was successful.
>>
>> bin/hadoop namenode -format
>> bin/start-all.sh
>>
>> They show correct logs.
>>
>> bin/hadoop dfs -put urls urls
>> bin/hadoop dfs -ls
>>
>> They show the urls directory correctly.
>>
>> But when I launch the crawl, the fetcher starts but does not show any
>> parsing messages, and it stops at the second depth. The crawl-urlfilter
>> and nutch-default files are well configured, because they work fine
>> using the local filesystem (instead of HDFS). I guess it is because
>> nutch-site is empty.
>>
>> What should its content be?
>>
>> core-site.xml:
>>
>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>
>> <!-- Put site-specific property overrides in this file. -->
>>
>> <configuration>
>>
>> <property>
>>   <name>fs.default.name</name>
>>   <value>hdfs://localhost:9000/</value>
>>   <description>
>>     The name of the default file system. Either the literal string
>>     "local" or a host:port for NDFS.
>>   </description>
>> </property>
>>
>> </configuration>
>>
>> ---------------------------------------
>>
>> hdfs-site.xml:
>>
>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>
>> <!-- Put site-specific property overrides in this file. -->
>>
>> <configuration>
>>
>> <property>
>>   <name>dfs.name.dir</name>
>>   <value>/root/filesystem/name</value>
>> </property>
>>
>> <property>
>>   <name>dfs.data.dir</name>
>>   <value>/root/filesystem/data</value>
>> </property>
>>
>> <property>
>>   <name>dfs.replication</name>
>>   <value>1</value>
>> </property>
>>
>> </configuration>
>>
>> ---------------------------------------
>>
>> mapred-site.xml:
>>
>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>
>> <!-- Put site-specific property overrides in this file. -->
>>
>> <configuration>
>>
>> <property>
>>   <name>mapred.job.tracker</name>
>>   <value>hdfs://localhost:9001/</value>
>>   <description>
>>     The host and port that the MapReduce job tracker runs at. If
>>     "local", then jobs are run in-process as a single map and
>>     reduce task.
>>   </description>
>> </property>
>>
>> <property>
>>   <name>mapred.map.tasks</name>
>>   <value>2</value>
>>   <description>
>>     define mapred.map tasks to be number of slave hosts
>>   </description>
>> </property>
>>
>> <property>
>>   <name>mapred.reduce.tasks</name>
>>   <value>2</value>
>>   <description>
>>     define mapred.reduce tasks to be number of slave hosts
>>   </description>
>> </property>
>>
>> <property>
>>   <name>mapred.system.dir</name>
>>   <value>/root/filesystem/mapreduce/system</value>
>> </property>
>>
>> <property>
>>   <name>mapred.local.dir</name>
>>   <value>/root/filesystem/mapreduce/local</value>
>> </property>
>>
>> </configuration>
>>
>> --
>> View this message in context:
>> http://old.nabble.com/Configurin-nutch-site.xml-tp27245750p27245750.html
>> Sent from the Nutch - User mailing list archive at Nabble.com.
>
>
> --
> -MilleBii-

--
View this message in context:
http://old.nabble.com/Configurin-nutch-site.xml-tp27245750p27248860.html
Sent from the Nutch - User mailing list archive at Nabble.com.
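
On the log question above: once Nutch runs on Hadoop instead of the local
filesystem, the fetch and parse messages no longer appear on the console;
they end up in the Hadoop logs. A minimal way to inspect them (a sketch,
assuming the default log directory of the Hadoop that ships with Nutch of
this era; paths may differ in other versions):

  ls logs/                         # one .log file per daemon: namenode,
                                   # datanode, jobtracker, tasktracker
  ls logs/userlogs/                # one directory per map/reduce task attempt
  tail -50 logs/userlogs/*/syslog  # per-task fetcher/parser output lands here

With default ports, the JobTracker web UI at http://localhost:50030/ shows
the same per-task logs, plus any failed or killed tasks.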
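
Since the thread hinges on whether the single-node HDFS setup is healthy
before scaling out, a quick sanity check (a sketch, assuming the bin/hadoop
wrapper and the default ports used in the quoted configs):

  bin/hadoop dfsadmin -report   # should report exactly 1 live datanode,
                                # matching dfs.replication=1 above
  bin/hadoop dfs -ls urls       # the seed list uploaded with 'dfs -put'

The NameNode web UI at http://localhost:50070/ (default port) gives the same
information in a browser.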
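
As for the quoted question "What should its content be?": nutch-site.xml
does not need to repeat the Hadoop settings, only Nutch's own property
overrides. A minimal sketch that is often enough for a test crawl (the agent
name is required or the fetcher refuses to run; "testCrawler" is a
hypothetical value):

  <?xml version="1.0"?>
  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

  <!-- Put site-specific property overrides in this file. -->

  <configuration>

  <property>
    <name>http.agent.name</name>
    <!-- "testCrawler" is a placeholder; use your own identifier -->
    <value>testCrawler</value>
    <description>
      HTTP 'User-Agent' request header sent by the fetcher. Fetching
      fails with an error if this is left empty.
    </description>
  </property>

  </configuration>

The Hadoop *-site.xml files shown above keep handling fs.default.name and
the mapred settings, so nutch-site.xml stays small.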