Håvard W. Kongsgård wrote:

I searched the mail archive and found this: http://www.mail-archive.com/[email protected]/msg01308.html
- Is there a way in the current version of Nutch to update the crawl without fetching every doc again?
- Is the Nutch team planning an updating function?


The "crawl" command is just for those who are too lazy to run all 4 steps by hand... ;-)

There is nothing magical about this. Just follow the standard workflow:

generate, fetch, updatedb, invertlinks, generate, fetch ...

dedup

index

search
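
To make the ordering above concrete, here is a minimal Python sketch that only spells out the sequence of steps for a given number of update rounds. The step names come from the list above; the function name workflow and the constants CYCLE and FINISH are made up for illustration, and the actual command arguments (crawl db, segment directories and so on) are deliberately left out because they differ between Nutch versions.

```python
from typing import List

# Step names are taken from the workflow listed above. Arguments such as
# the crawl db or segment paths are intentionally omitted (they vary by
# Nutch version); this sketch only models the order of the steps.
CYCLE = ["generate", "fetch", "updatedb", "invertlinks"]  # repeated per round
FINISH = ["dedup", "index", "search"]                     # run once at the end


def workflow(rounds: int) -> List[str]:
    """Return the ordered step names for `rounds` generate/fetch rounds."""
    if rounds < 1:
        raise ValueError("need at least one generate/fetch round")
    return CYCLE * rounds + FINISH


if __name__ == "__main__":
    # Print a numbered checklist for two update rounds.
    for number, step in enumerate(workflow(2), start=1):
        print(f"{number:2d}. {step}")
```

Each extra round just repeats the first four steps before the final dedup, index and search, which is probably all that an "update" amounts to here.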

--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

