I'm new to Nutch. Several days ago, I finished building a simple intranet search engine
based on Nutch 0.6,
and I've spent two weeks reading the source code of Nutch 0.6.

Now I want to build a bigger one. I want to crawl the pages of several
websites that I specify.
My server is a modest machine with 1 CPU, 1 GB of memory, and a 320 GB hard disk; the bandwidth is
10 Mbps.
I want to provide a search service for a specific domain, so I have chosen some
big websites to crawl.
So my questions are:
Must I update all the sites (crawl all of them) in one crawl procedure, or may I crawl
one site per day
and then run a program to index them together? If the crawl procedure lasts
too long, how can I provide my service? Is there a good system for me to
study?
Any advice would be greatly appreciated.



-- 
Best regards,
 Heart                            mailto:[EMAIL PROTECTED]
