> >1. "Souped-up" DB server - Dual CPU, 4 GB RAM (min), RAID 5 or 10, 1-2
> >NICs
>
> This is the 'fetcher' server?

This is your fetcher/crawler/indexer. Create the final segments here, then move them to the search servers. That way, if a search server goes down, you simply move its segments to another server.

> >2. Basic Search Servers - Single/Dual CPU, Maximum RAM, Single IDE/SATA
> >drive (or 2 for redundancy)
>
> These are the 'fetched' segment backup and search servers? If I have 10 million pages per server, is this right: 2 KB * 10 million = 20 GB of RAM? Or is 10 GB enough to start with, adding more later if needed?

Actually, you'll want 20GB of RAM if you're trying to displace MSN as the fastest search engine. Believe it or not, Lucene is EXTREMELY fast even when reading from disk (who's the genius who wrote that software?). I would keep about 4-8 million pages per server and give it about 1GB of RAM per million pages. Let the Linux file cache do its magic. After the first 20-30 searches, things should be pretty fast.

Take a look at filangy.com: search is pretty fast and we're hitting the disk. The only drawback is that, reading from disk, we see things start to slow down if more than 5-6 searches happen simultaneously. That's 5-6 per second, and we usually fix it by adding another server. Given that 1GB sticks are much cheaper than 2GB sticks, you'll find that adding another cheap server costs less than adding more RAM. Also, 2GB sticks are only supported in higher-end servers, so cheap hardware can no longer be used.

> >3. Basic Web Servers - Single/Dual CPU, Medium RAM
>
> In these boxes I will put 1-2 GB of RAM. I would like to put Apache2 and mod_jk2 in front. Is this a bottleneck, or can I tune things this way: caching static images, web pages, etc.? Or is it better to expose Tomcat directly to the web?

Go with Tomcat directly for now. You don't want the search pages to take the Apache/mod_jk2 hit every time. Later you can split the static pages off into a separate site that can run on Apache. For images, set up a separate hostname such as image.domain.com and load them from there.
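
For anyone who wants to play with the search-server numbers, here is a rough sketch of the arithmetic from this thread: the "2 KB per page, everything in RAM" estimate versus the rule of thumb of 4-8 million pages per server at about 1GB per million. The function names and the 6 million default (a midpoint of the 4-8 range) are just my assumptions for illustration, not anything from Nutch or Lucene.

```python
import math

# Assumption: midpoint of the 4-8 million pages/server rule of thumb.
PAGES_PER_SERVER = 6_000_000
# Rule of thumb from this thread: about 1 GB of RAM per million pages.
GB_PER_MILLION_PAGES = 1.0


def full_in_memory_gb(pages, kb_per_page=2):
    """RAM needed to hold everything in memory at kb_per_page per page (decimal GB)."""
    return pages * kb_per_page / 1_000_000


def plan_search_servers(total_pages,
                        pages_per_server=PAGES_PER_SERVER,
                        gb_per_million=GB_PER_MILLION_PAGES):
    """Return (servers, pages per server, GB of RAM per server).

    Pages are spread evenly; RAM is rounded up to whole gigabytes.
    """
    servers = max(1, math.ceil(total_pages / pages_per_server))
    pages_each = math.ceil(total_pages / servers)
    ram_gb_each = math.ceil(pages_each / 1_000_000 * gb_per_million)
    return servers, pages_each, ram_gb_each


if __name__ == "__main__":
    print(full_in_memory_gb(10_000_000))    # 20.0
    print(plan_search_servers(10_000_000))  # (2, 5000000, 5)
```

So for 10 million pages this comes out as two cheap boxes with about 5GB each rather than one box with 20GB. It only covers index size; it says nothing about query load, which is where the 5-6 searches per second limit comes in.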
