Hi Andrez

Great job on the adaptive fetch patch.

I have a few suggestions to make. Here are some things that I felt were
needed.

Currently I do an intranet crawl.

The next time around, I need all the URLs in the webdb to be considered for
fetching (this should not be decided on the basis of the db default fetch
interval).

So at the end of a crawl we should set all the pages in the db to a state
where they will be considered for a refetch, and only then look at the
adaptive fetch schedule.

Maybe we can add a function that does this, so that people using crawl can
make use of it (a new function with a minor modification to update database,
so that it replaces the db.default.fetch.interval in the webdb with zero).

During the next crawl, the adaptive fetch logic will then decide whether to
fetch each page or not.
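
To make the idea a bit more concrete, here is a rough sketch in Java. It is
only an illustration of the reset step: PageEntry, markAllDue and dueUrls are
names I made up, not the actual webdb or update database code, and I am
assuming each page simply carries a "next fetch time".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a page record in the webdb (not the real class).
class PageEntry {
    final String url;
    long nextFetchTime; // epoch millis at which the page becomes due

    PageEntry(String url, long nextFetchTime) {
        this.url = url;
        this.nextFetchTime = nextFetchTime;
    }
}

class RefetchSketch {
    // Called at the end of a crawl: make every page due immediately,
    // regardless of the fetch interval that was applied to it.
    static void markAllDue(List<PageEntry> pages, long now) {
        for (PageEntry page : pages) {
            page.nextFetchTime = now;
        }
    }

    // Pages that the next crawl would hand to the adaptive fetch logic.
    static List<String> dueUrls(List<PageEntry> pages, long now) {
        List<String> due = new ArrayList<>();
        for (PageEntry page : pages) {
            if (page.nextFetchTime <= now) {
                due.add(page.url);
            }
        }
        return due;
    }
}
```

After markAllDue runs, dueUrls returns every page in the list, so the decision
to fetch or skip rests entirely with the adaptive fetch logic.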


This will be useful when testing the product on intranet crawls, and it gives
the end user more control over the fetch (instead of the fetch time changing
unpredictably, so that we don't know whether a page is going to be fetched at
all and checked against the adaptive fetch logic).


Rgds
Prabhu
