Hi Sean Dean-3: "I have one index just above 20 million that takes up about 29GB in space." That's great news! The difficulty for me is the size of the indexes: they're too large. But if a 20-million-page index takes only ~30GB, that difficulty can be solved. If your idea becomes reality, searching 100 million pages will become a reality too. That's my goal as well. Keep it up! Looking forward to your test results! ................................................................. I'm going to have to disagree and also explain my reasoning.
A 32GB SLC SSD (or even MLC, if we're only talking about capacity) is capable of holding a 20-million-page index; I have one index just above 20 million pages that takes up about 29GB of space. Hadoop is designed to run on commodity PCs, but when I talk about a "1U server" I really mean a 1U server chassis with commodity-type parts inside. The reason for this is co-location: I can't place multiple desktops or towers at the datacenter. At that density you would only need 5 machines to search 100 million pages, although this isn't taking speed into consideration.
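As a rough sanity check on those numbers, here is a back-of-envelope sizing sketch in Python. The per-page figure is derived from the 29GB / 20M index mentioned above, and the 32GB-per-node capacity is just the drive size under discussion, not a measured limit:

    import math

    PAGES_PER_INDEX = 20_000_000   # pages in the reference index
    INDEX_SIZE_GB = 29.0           # observed on-disk size of that index
    SSD_CAPACITY_GB = 32.0         # assumed SLC/MLC drive per 1U node
    TARGET_PAGES = 100_000_000     # the goal discussed in this thread

    gb_per_page = INDEX_SIZE_GB / PAGES_PER_INDEX          # ~1.5KB of index per page
    pages_per_drive = int(SSD_CAPACITY_GB / gb_per_page)   # ~22M pages fit on one drive
    nodes = math.ceil(TARGET_PAGES / PAGES_PER_INDEX)      # ceil(100M / 20M) = 5 nodes

    print(f"index per page: ~{gb_per_page * 1024**2:.2f} KB")
    print(f"pages per {SSD_CAPACITY_GB:.0f}GB drive: ~{pages_per_drive / 1e6:.1f}M")
    print(f"machines for {TARGET_PAGES / 1e6:.0f}M pages: {nodes}")

So the arithmetic holds: at ~1.5KB of index per page, one 32GB drive comfortably fits a 20M-page shard, and five such shards cover 100M pages (again, ignoring query latency).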