I am not an expert on this, but I am doing something similar.
So you have 100k pages; that is very few by Nutch's standards.
I think crawling will be the slow part, not because of hardware, but because
if you crawl faster than 1 page/second per site, you may get blocked by
some sites.
If you really want to update it every day, this may be a problem.
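
To put rough numbers on that, here is a back-of-the-envelope sketch in Python.
It is my own estimate, not Nutch code: the function name, the parallel_hosts
parameter, and the 1-second delay are assumptions for illustration, and it
counts only the politeness wait, not download or parsing time.

```python
import math

def crawl_time_seconds(sites, pages_per_site, delay_per_page=1.0, parallel_hosts=None):
    """Estimate the politeness wait for one full crawl.

    Assumes each host is fetched at most once every `delay_per_page` seconds
    and that up to `parallel_hosts` hosts are fetched at the same time
    (default: all of them). Download and parse time are ignored.
    """
    if parallel_hosts is None:
        parallel_hosts = sites
    # Clamp to a sensible range: at least 1 host, at most all hosts.
    parallel_hosts = max(1, min(parallel_hosts, sites))
    per_site = pages_per_site * delay_per_page
    # Hosts are processed in batches of `parallel_hosts`.
    batches = math.ceil(sites / parallel_hosts)
    return batches * per_site

if __name__ == "__main__":
    all_parallel = crawl_time_seconds(100, 1000)
    one_at_a_time = crawl_time_seconds(100, 1000, parallel_hosts=1)
    print("all hosts in parallel: %.1f minutes" % (all_parallel / 60))
    print("one host at a time:    %.1f hours" % (one_at_a_time / 3600))
```

Under these assumptions, 100 sites with 1000 pages each need about 17 minutes
of waiting if all hosts are fetched in parallel, but about 28 hours if they
are fetched one host at a time, which would not fit into a daily re-crawl.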

The searching side is really fast. I was worried about it too, but once I
saw that my AMD 1800+ PC (1 GB of memory) could run a search in less than
0.1 second, I stopped looking into it. I saw someone on this list
doing crawling/searching on a PIII at reasonable speed.

Regards
Pan

Tomislav Poljak wrote:
> 
> I need help determining hardware specs for crawling 100 sites with 1000
> pages each. Regular re-crawl is needed probably every day (maybe even
> more often). So will one server meet these crawling requirements (only
> crawling, searching will be handled by other machine)? If so, what
> hardware specification would be recommended (how much Ram, CPU's, hard
> disk space)?
> 
> Thanks,
>        Tomislav
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/help-with-hardware-requirements-tf4333859.html#a12381466
Sent from the Nutch - User mailing list archive at Nabble.com.