After testing: selecting the URLs to fetch from the unfetched URLs takes 15 hours (ouch!), while fetching 1000 URLs takes only a few minutes (same for parsing).
I'm guessing one of these phases is taking a very long time:

2013-01-31 13:46:19,387 INFO crawl.Generator - Generator: Selecting best-scoring urls due for fetch.
2013-01-31 13:46:19,387 INFO crawl.Generator - Generator: filtering: true
2013-01-31 13:46:19,387 INFO crawl.Generator - Generator: normalizing: true

Does anyone know how to log each of these steps? Or have any clue about what happened?

--
View this message in context: http://lucene.472066.n3.nabble.com/Very-long-time-just-before-fetching-and-just-after-parsing-tp4037673p4037881.html
Sent from the Nutch - User mailing list archive at Nabble.com.
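
One rough way to see which step is eating the time, without changing any Nutch configuration, is to measure the gap between consecutive timestamped lines in the log. Below is a minimal sketch in Python, assuming the log lines start with a timestamp in the same `yyyy-MM-dd HH:mm:ss,SSS` form as the Generator lines quoted in this thread. The function names and the log path passed on the command line are my own assumptions, not anything Nutch provides.

```python
import sys
from datetime import datetime

# Timestamp layout assumed from the Generator lines quoted in this thread.
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TS_LENGTH = 23  # len("2013-01-31 13:46:19,387")


def parse_timestamp(line):
    """Return the datetime at the start of a log line, or None if there is none."""
    try:
        return datetime.strptime(line[:TS_LENGTH], LOG_TS_FORMAT)
    except ValueError:
        return None


def gaps_between_lines(lines):
    """Return (seconds, line) pairs: time elapsed after each timestamped line
    until the next timestamped line appears."""
    gaps = []
    prev_ts, prev_line = None, None
    for line in lines:
        ts = parse_timestamp(line)
        if ts is None:
            continue  # skip stack traces and other lines without a timestamp
        if prev_ts is not None:
            gaps.append(((ts - prev_ts).total_seconds(), prev_line.rstrip()))
        prev_ts, prev_line = ts, line
    return gaps


if __name__ == "__main__":
    # The log file path is supplied by the user (hypothetical example: hadoop.log).
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        slowest = sorted(gaps_between_lines(log_file), reverse=True)[:10]
    for seconds, line in slowest:
        print(f"{seconds:10.1f}s after: {line}")
```

Run against the log of one generate cycle, this prints the ten longest silences and the line that preceded each one. It only shows where the time goes between messages that are already logged; it does not add any finer-grained logging inside the Generator itself.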

