Running updatedb on the crawldb with all segments should work. Don't forget the -filter option, so that the URL filters are also applied to the entries already in the db.
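
As a rough sketch only: assuming a Nutch 1.x install with the crawl data under crawl/ (that path and the domain example.com are placeholders, not taken from your setup), the exclusion rule in conf/regex-urlfilter.txt could look something like this. It has to sit above the final catch-all accept rule, since the first matching pattern wins.

```
# conf/regex-urlfilter.txt
# Skip everything on the unwanted domain and its subdomains.
# example.com is a placeholder; substitute the domain to remove.
-^https?://([a-z0-9-]+\.)*example\.com/

# accept anything else (the catch-all rule from the default file)
+.
```

With a rule like that in place, something along the lines of "bin/nutch updatedb crawl/crawldb -dir crawl/segments -filter" should drop the matching URLs from the crawldb. Check the usage output of bin/nutch updatedb on your version before relying on it.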

On Monday 20 June 2011 16:54:12 Dietrich wrote:
> How can one remove documents from a specific domain from an existing Nutch
> db? Adding a filter to regex-urlfilter.txt seems to prevent them from
> being added to the linkDb, but documents already in there are not
> affected at all, and I could not see how else to do this.
> It can't possibly be that I have to completely recreate the crawl folder,
> can it?

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350

