Some articles say we can implement Google-like search using Lucene. Searching Google for a few words returns millions of hits, with only the first page displayed. Does Google process millions of hits, with scoring/rating, before paginating (caching)? Loading an array of millions of hits, e.g. id => db-id key/value pairs, into a variable would bring the server down. So I guess Lucene is missing something like a SELECT COUNT(*) method that returns just the number of hits (an integer) instead of the actual hits array? I can find no concrete way to limit the number of hits so that the server does not go down while processing a large result set. On the other hand, I can find no method that returns just the hit count instead of the actual hits.
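To make concrete what I am after, here is a minimal sketch in plain Python. It is not Lucene code: `search_top_n`, `matches`, and `score` are hypothetical names I made up for illustration. The idea is to walk the matching documents once, keep a running count, and hold only the best `n` hits in memory, so the full hits array is never built.

```python
import heapq
from typing import Callable, Iterable, List, Tuple


def search_top_n(
    doc_ids: Iterable[int],
    matches: Callable[[int], bool],
    score: Callable[[int], float],
    n: int = 10,
) -> Tuple[int, List[Tuple[int, float]]]:
    """Return (total hit count, best n hits as (doc_id, score) pairs)."""
    total = 0
    heap: List[Tuple[float, int]] = []  # min-heap, never larger than n
    for doc_id in doc_ids:
        if not matches(doc_id):
            continue
        total += 1  # count every hit, but do not store it
        item = (score(doc_id), doc_id)
        if len(heap) < n:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            # Better than the weakest hit kept so far: swap it in.
            heapq.heapreplace(heap, item)
    best = sorted(heap, reverse=True)  # highest score first
    return total, [(doc_id, s) for s, doc_id in best]


if __name__ == "__main__":
    # Hypothetical data: even ids "match", score is just the id.
    count, first_page = search_top_n(
        range(1_000_000), lambda d: d % 2 == 0, lambda d: float(d), n=10
    )
    print(count, first_page)
```

Memory stays proportional to `n` (one page of results) rather than to the number of hits, which is the behaviour I would expect from a count-only or limited search method.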
How do we go about rebuilding indexes on a high-activity site if each rebuild takes a few minutes? Leaving the indexes unoptimized would not be ideal either :(
