On Sun, Jan 3, 2010 at 10:42 AM, Karl Wettin <karl.wet...@gmail.com> wrote:
>
> 3 jan 2010 kl. 16.32 skrev Yonik Seeley:
>
>> Perhaps this is just a huge index, and not enough of it can be cached
>> in RAM. Adding additional clauses to a boolean query incrementally
>> destroys locality.
>>
>> 104GB of index and 4GB of RAM means you're going to be hitting the
>> disk constantly. You need more hardware - if your requirements are
>> low (low query volume, query latency of a few seconds OK) then
>> you can probably get away with a single box... just either get an SSD
>> or get more RAM (like 32G or more).
>>
>> If you want higher query volumes or consistent sub-second search,
>> you're going to have to go distributed.
>> Roll your own or look at Solr.
>
> I'm not sure I agree.
>
> A 104GB index says nothing about the date field. And it says nothing
> about the range of the query.
Given that there are 500M docs, one can make an educated guess that much
of this 104GB is inverted index and not just stored fields.

IMO, it's simply too many docs and too big a ratio between index size and
RAM for "good" query performance. But I don't think we've heard what the
requirements for this index are.

A quick "ls -l" of the index directory would be revealing, though.

-Yonik
http://www.lucidimagination.com
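The back-of-envelope math behind the argument above can be sketched as follows (figures taken from this thread; the assumption that hot query performance needs most of the index resident in the OS page cache is an illustration, not a hard Lucene rule):

```python
# Illustrative arithmetic using the numbers from this thread.
index_bytes = 104 * 1024**3   # 104GB index on disk
ram_bytes = 4 * 1024**3       # 4GB RAM on the box
num_docs = 500_000_000        # 500M documents

# Average index footprint per document.
bytes_per_doc = index_bytes / num_docs

# Fraction of the index that could even theoretically be cached in RAM
# (ignoring JVM heap, OS overhead, etc., which make it worse in practice).
cache_coverage = ram_bytes / index_bytes

print(f"~{bytes_per_doc:.0f} bytes of index per doc")   # ~223 bytes
print(f"RAM can cache ~{cache_coverage:.1%} of the index")  # ~3.8%
```

With under 4% of the index cacheable, most postings reads for a multi-clause boolean query will go to disk, which is the locality problem described above.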