Hello, Thanks. I am running nutch 0.8.1. What is this property for? Should I set it to 120, as the error message suggests? Another problem I have is that on some websites not all pages are fetched and, even stranger, some of the pages that are fetched don't actually exist...
Thanking you in advance,
Mat

----- Original Message ----
From: Sami Siren <[EMAIL PROTECTED]>
To: [email protected]
Sent: Wednesday, 22 November 2006, 22:40:10
Subject: Re: Fetch fails

frgrfg gfsdgffsd wrote:
> Hi all,
>
> I have a problem with the crawl/fetch of one website (www.lequipe.fr),
> although it works fine for another (www.lemonde.fr).
>
> Here are the errors:
> ERROR [MAT] 2006-11-22 00:36:20,860 - Http.invoke0(?) |
> java.lang.IllegalArgumentException: null metadata
> ERROR [MAT] 2006-11-22 00:36:20,870 - Http.invoke0(?) | at
> org.apache.nutch.protocol.Content.<init>(Content.java:60)
> ERROR [MAT] 2006-11-22 00:36:20,870 - Http.invoke0(?) | at
> org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:196)
> ERROR [MAT] 2006-11-22 00:36:20,870 - Http.invoke0(?) | at
> org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:162)
>
> I don't understand why the metadata is null when there is some metadata on the
> pages...
>

What version of nutch are you running?

> I also have this message just before:
> INFO [MAT] 2006-11-22 00:36:32,477 - HttpBase.getProtocolOutput(194) |
> Skipping: http://www.lequipe.fr/ exceeds fetcher.max.crawl.delay, max=30,
> Crawl-Delay=120
>
> and I can't find this property in nutch-site.xml

You need to add it there.

<property>
  <name>fetcher.max.crawl.delay</name>
  <value> your value here </value>
</property>

--
Sami Siren
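
For reference, the log line above suggests what the property does: the site asks for a Crawl-Delay of 120, which is more than the fetcher's current maximum of 30, so the page is skipped. The fragment below is a minimal sketch of the override Sami describes, not a confirmed fix. The value 120 is an assumption taken from the Crawl-Delay in the log message, and the unit is assumed to be seconds. Raising the limit will also make the fetcher wait that long between requests to the site, if it honors the delay.

```xml
<!-- Sketch only: place this inside the root element of conf/nutch-site.xml. -->
<!-- Assumption: 120 matches the Crawl-Delay reported in the log message;    -->
<!-- any value at or above the site's Crawl-Delay should stop the skipping.  -->
<property>
  <name>fetcher.max.crawl.delay</name>
  <value>120</value>
</property>
```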
