Håvard W. Kongsgård wrote:

I searched the mail archive and found this: http://www.mail-archive.com/[email protected]/msg01308.html

- Is there a way in the current version of Nutch to update the crawl without fetching every doc again?
- Is the Nutch team planning an update function?


The "crawl" command is just for those who are too lazy to run all 4 steps by hand... ;-)

There is nothing magical about this. Just follow the standard workflow:

generate, fetch, updatedb, invertlinks, generate, fetch ...

dedup

index

search
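
To make the ordering concrete, here is a rough sketch in Python that only lays out the sequence of steps for a given number of update rounds. It does not invoke Nutch at all: the names plan_steps, ROUND_STEPS and FINAL_STEPS are made up for this illustration, the grouping of invertlinks into every round simply follows the order listed above, and command arguments are left out because they differ between versions.

```python
# Step names are taken from the workflow listed above. Arguments are
# deliberately omitted; this only plans the order, it runs nothing.
ROUND_STEPS = ["generate", "fetch", "updatedb", "invertlinks"]  # repeated per round
FINAL_STEPS = ["dedup", "index"]  # run once, after the last round


def plan_steps(rounds):
    """Return the ordered step names for `rounds` update rounds (hypothetical helper)."""
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    return ROUND_STEPS * rounds + FINAL_STEPS


if __name__ == "__main__":
    # Print a numbered plan for two rounds of updating.
    for number, step in enumerate(plan_steps(2), start=1):
        print(f"{number}. {step}")
```

Each extra round only adds another generate/fetch/updatedb/invertlinks group; dedup and index stay at the end, and searching happens once the index is built.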

--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com




_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
