Hi all,

I have a problem crawling/fetching one website (www.lequipe.fr), although 
it works fine for another (www.lemonde.fr).

Here are the errors:
ERROR [MAT] 2006-11-22 00:36:20,860 - Http.invoke0(?) | java.lang.IllegalArgumentException: null metadata
ERROR [MAT] 2006-11-22 00:36:20,870 - Http.invoke0(?) | at org.apache.nutch.protocol.Content.<init>(Content.java:60)
ERROR [MAT] 2006-11-22 00:36:20,870 - Http.invoke0(?) | at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:196)
ERROR [MAT] 2006-11-22 00:36:20,870 - Http.invoke0(?) | at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:162)

I don't understand why the metadata is null when the pages do contain 
metadata.

I also get this message just before:
INFO [MAT] 2006-11-22 00:36:32,477 - HttpBase.getProtocolOutput(194) | Skipping: http://www.lequipe.fr/ exceeds fetcher.max.crawl.delay, max=30, Crawl-Delay=120

and I can't find this property in nutch-site.xml.
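
In case it helps the discussion, here is a minimal sketch of the override I 
assume would go in nutch-site.xml. The property name is taken from the log 
message above; the value 120 is only my guess, chosen to match the 
Crawl-Delay reported for the site, and I have not verified that this also 
resolves the "null metadata" error.

```xml
<!-- Sketch only: assumes the property named in the log message can be
     overridden in nutch-site.xml like any other Nutch property. -->
<property>
  <name>fetcher.max.crawl.delay</name>
  <!-- Assumed value in seconds, set to the Crawl-Delay=120 shown in the log. -->
  <value>120</value>
</property>
```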

Any help would be very welcome, as I am now lost.

Thanks a lot in advance,

Mat