If the URL is already in the WebDB, it will not be added again (WebDBInjector calls WebDBWriter.addPageIfNotPresent(page)).
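To illustrate the add-if-not-present behaviour described above, here is a minimal sketch in Python. It is not Nutch's code: the function `add_page_if_not_present` and the in-memory `web_db` set are hypothetical stand-ins, and the sketch assumes duplicates are matched on the exact URL string.

```python
def add_page_if_not_present(db, url):
    """Add url to db only if it is absent; return True if it was added."""
    if url in db:
        return False
    db.add(url)
    return True


# Hypothetical in-memory stand-in for the WebDB (assumption, not Nutch's storage).
web_db = set()

# Example URL list with one repeated entry, as an RSS-derived file might contain.
urls = [
    "http://example.com/feed/a",
    "http://example.com/feed/b",
    "http://example.com/feed/a",  # duplicate of the first entry
]

# Only the first occurrence of each URL is actually added.
added = [u for u in urls if add_page_if_not_present(web_db, u)]
```

With semantics like these, repeated lines in the urls file are harmless, so the script that builds the file would not need its own duplicate check.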
Rgrds,
Thomas

On 2/13/06, Hasan Diwan <[EMAIL PROTECTED]> wrote:
>
> I've written a perl script to build up a urls file to crawl from RSS
> feeds. Will nutch handle duplicate URLs in the crawl file or would
> that logic need to be in my perl script?
> --
> Cheers,
> Hasan Diwan <[EMAIL PROTECTED]>
>
