Hey folks,

We're looking at launching a search engine at the beginning of the new year that will eventually grow into a multi-billion-page index. Three questions:

First, and most important for now, does anyone have any useful numbers on the hardware requirements for running such an engine? I have numbers for how fast I can get the crawlers working, but not for how many pages can be served from each search node, how much processing power is required for indexing, etc.

Second, what still needs to be done to Nutch for it to handle billions of pages? Is there a general list of requirements?

Third, if Nutch isn't capable of doing what we need, what is its expected upper limit, using the map/reduce version?

Thanks,

--
Ken van Mulder
Wavefire Technologies Corporation

http://www.wavefire.com
250.717.0200 (ext 113)

