Laxmi, someone in the group should have a solution for skipping the database table while crawling new sites. I searched online but can't find one.

On Thu, Aug 1, 2013 at 6:47 PM, A Laxmi <[email protected]> wrote:

> Jaydeep - I have the same problem as well. When I run a fresh crawl, only
> the URLs in the webpage table are crawled over and over; it ignores the
> new URLs in seed.txt.
>
> On Thu, Aug 1, 2013 at 9:03 AM, Jayadeep Reddy <[email protected]> wrote:
>
> > I am using Nutch 2.1. Every time I run a crawl from the dmoz directory,
> > my existing crawled pages in the database are fetched again (taking a
> > long time). Is there a way to crawl only new sites?
> >
> > Thank you
> >
> > --
> > Jayadeep Reddy.S,
> > M.D & C.E.O
> > e Health Access Pvt.Ltd
> > www.ehealthaccess.com
> > Hyderabad-Chennai-Banglore
> > http://www.youtube.com/watch?v=0k5LX8mw6Sk

--
Jayadeep Reddy.S,
M.D & C.E.O
e Health Access Pvt.Ltd
www.ehealthaccess.com
Hyderabad-Chennai-Banglore
http://www.youtube.com/watch?v=0k5LX8mw6Sk

