Thanks, S.P, for your quick response.

> 1. Check logs/hadoop.log file. Do you see any lines containing the
> string "fetching". Such lines should clearly show what URLs have been
> fetched.

There are many "fetching" lines in there, so I don't think this is the cause.

> 2. One reason may be that all URLs are blocked in
> conf/crawl-urlfilter.txt. Did you edit this file as per the tutorial?
> If not, this is most certainly the problem. An easy way to allow all
> URLs would be to replace the .- in the end with .+

Yes, I edited it like this:
# accept hosts in MY.DOMAIN.NAME
+^http://([a-z0-9]*\.)*hustoo.net/
# skip everything else
+.
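
To double-check my reading of these rules, here is a rough Python sketch of how I understand the filter to work: rules are tried in file order, the first pattern found in the URL decides, and a URL that matches nothing is dropped. The function name `accepts` and the `RULES` list are my own illustration, not Nutch code, and the evaluation order is my assumption about the filter.

```python
import re

# The two rules from my crawl-urlfilter.txt, in file order, as (sign, regex) pairs.
RULES = [
    ("+", r"^http://([a-z0-9]*\.)*hustoo.net/"),
    ("+", r"."),
]

def accepts(url, rules=RULES):
    """Return True if the first rule whose pattern is found in url is a '+' rule."""
    for sign, pattern in rules:
        if re.search(pattern, url):
            return sign == "+"
    # Assumption: a URL that matches no rule at all is rejected.
    return False

if __name__ == "__main__":
    for url in ("http://www.hustoo.net/index.html", "http://example.com/"):
        print(url, accepts(url))
```

If that reading is right, the final `+.` line lets every non-empty URL through, so the filter should not be what blocks my pages.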

What do you think about my
Tomcat 6.0\webapps\nutch-0.9\WEB-INF\classes\nutch-site.xml:
<configuration>
  <property>
    <name>searcher.dir</name>
    <value>C:\nutch-0.9\crawl\</value>
  </property>
</configuration>

At first I thought it might be a problem with the "\" characters, or with the path in general?
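
To rule out the path itself, I could run a small check like the one below. It only tests that the directory exists and lists which of the usual crawl subdirectories are missing. The helper name `missing_crawl_parts` is made up for this example, and the list of expected subdirectories is my assumption about what a Nutch 0.9 crawl produces.

```python
from pathlib import Path

# Subdirectories I would expect inside a finished crawl directory (an assumption).
EXPECTED = ("crawldb", "index", "linkdb", "segments")

def missing_crawl_parts(searcher_dir, expected=EXPECTED):
    """Return the names of expected subdirectories that are not present."""
    root = Path(searcher_dir)
    if not root.is_dir():
        # The configured directory itself cannot be found.
        return ["<searcher.dir itself>"]
    return [name for name in expected if not (root / name).is_dir()]

if __name__ == "__main__":
    # Raw string so the backslashes in the Windows path are kept literally.
    print(missing_crawl_parts(r"C:\nutch-0.9\crawl"))
```

An empty list would suggest the path is fine and the problem is somewhere else.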

Thanks for any suggestions.
