Håvard W. Kongsgård wrote:
> I searched the mail archive and found this:
> http://www.mail-archive.com/[email protected]/msg01308.html
> - Is there a way in the current version of Nutch to update the crawl
> without fetching every document again?
> - Is the Nutch team planning an update function?
The "crawl" command is just for those who are too lazy to run all 4
steps by hand... ;-)
There is nothing magical about this. Just follow the standard workflow:
generate, fetch, updatedb, invertlinks, generate, fetch ...
dedup
index
search
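
For illustration only, here is a rough sketch of that loop as a dry run. It only builds and prints the command lines; it does not run Nutch. The directory layout (crawl/crawldb, crawl/segments, crawl/linkdb), the argument order, and the `<segment-N>` placeholder are assumptions based on the usual tutorial-style invocation, so check them against the usage output of bin/nutch for your version. The function name recrawl_plan is hypothetical. The dedup and index steps are listed by name only, because their arguments differ between versions.

```python
from typing import List


def recrawl_plan(crawl_dir: str = "crawl", rounds: int = 2,
                 nutch: str = "bin/nutch") -> List[str]:
    """Return the command lines for `rounds` update cycles (dry run only)."""
    # Assumed layout; adjust to wherever your crawl data actually lives.
    crawldb = f"{crawl_dir}/crawldb"
    segments = f"{crawl_dir}/segments"
    linkdb = f"{crawl_dir}/linkdb"

    plan: List[str] = []
    for i in range(1, rounds + 1):
        # generate creates a new segment directory; its real name is not
        # known in advance, so a placeholder stands in for it here.
        segment = f"{segments}/<segment-{i}>"
        plan.append(f"{nutch} generate {crawldb} {segments}")
        plan.append(f"{nutch} fetch {segment}")
        plan.append(f"{nutch} updatedb {crawldb} {segment}")
        plan.append(f"{nutch} invertlinks {linkdb} -dir {segments}")

    # Final steps, named only: arguments depend on the Nutch version.
    plan.append("# dedup  (arguments depend on the Nutch version)")
    plan.append("# index  (arguments depend on the Nutch version)")
    return plan


if __name__ == "__main__":
    for line in recrawl_plan():
        print(line)
```

Each pass through the loop is one generate/fetch/updatedb/invertlinks cycle against the same crawl directory; dedup and index come once at the end, as in the list above.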
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com