If the URL is already in the WebDB, it will not be added again
(WebDBInjector calls WebDBWriter.addPageIfNotPresent(page)), so Nutch
handles duplicates in the urls file for you.
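
If you would still like to drop repeats before injecting, for example to
keep the urls file small, here is a minimal sketch in Python. It is not
Nutch code: dedupe_urls is a hypothetical helper name, and it only catches
exact string matches after trimming whitespace, so two different spellings
of the same address would both survive.

```python
def dedupe_urls(urls):
    """Return URLs in first-seen order, dropping blank lines and exact repeats.

    Hypothetical helper for pre-filtering a urls file; not part of Nutch.
    """
    seen = set()
    unique = []
    for raw in urls:
        url = raw.strip()  # ignore surrounding whitespace and newlines
        if not url or url in seen:
            continue  # skip blank lines and URLs already kept
        seen.add(url)
        unique.append(url)
    return unique
```

Passing it the lines of the urls file (an open file object works, since it
iterates line by line) gives back the list to write out again.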

Regards, Thomas

On 2/13/06, Hasan Diwan <[EMAIL PROTECTED]> wrote:
>
> I've written a Perl script to build up a urls file to crawl from RSS
> feeds. Will Nutch handle duplicate URLs in the crawl file, or would
> that logic need to be in my Perl script?
> --
> Cheers,
> Hasan Diwan <[EMAIL PROTECTED]>
>
