Re: Way to fetch only new sites

A Laxmi Thu, 01 Aug 2013 06:18:21 -0700

Jayadeep - I have the same problem. When I run a fresh crawl, only the
URLs already in the webpage table are crawled over and over, and the new
URLs in seed.txt are ignored.



On Thu, Aug 1, 2013 at 9:03 AM, Jayadeep Reddy
<[email protected]> wrote:

> I am using Nutch 2.1. Every time I run a crawl from the dmoz directory, the
> existing crawled pages in the database are fetched again (taking a long
> time). Is there a way to crawl only new sites?
>
> Thank you
>
> --
> Jayadeep Reddy.S,
> M.D & C.E.O
> e Health Access Pvt.Ltd
> www.ehealthaccess.com
> Hyderabad-Chennai-Banglore
> http://www.youtube.com/watch?v=0k5LX8mw6Sk
>
