Hello:
I am trying to do re-crawling with Nutch-0.9. I have done some Google
searching, but I haven't found an answer that works.
I had hopes for the script located at:
http://www.mail-archive.com/[email protected]/msg09096.html
I tried this script for re-crawling and it has the same problem after a
couple of re-crawls:
----- Merge Indexes (Step 7 of 8) -----
merging indexes to: crawl/index
Adding crawl/NEWindexes/part-00000
IndexMerger: java.io.IOException: Target crawl/index/merge-output already exists
(Also, this script has an unrelated bug: it references the variable
$rank, but $rank is never defined. I guess this is supposed to be topN.)
Has anybody found a solution for successfully re-crawling with 0.9?
thanks,
-Lee