hey chris, i was using the Hits.doc method while iterating... you've given me some hope! i will look into the FieldCache.
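In case it helps anyone else following the thread, here is roughly what I understand the FieldCache approach to look like -- a minimal sketch against the Lucene 1.4-era API, not a tested implementation; the "category" field name, the reader, and the query are made-up placeholders:

    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.search.FieldCache;
    import org.apache.lucene.search.HitCollector;

    /**
     * Tallies hits per group value using the FieldCache, so each
     * matching doc costs one array lookup instead of a stored-field
     * read (no Hits.doc / IndexReader.document calls at all).
     */
    public class GroupCountCollector extends HitCollector {
      private final String[] groupByDoc;        // docId -> field value
      private final Map counts = new HashMap(); // value -> Integer

      public GroupCountCollector(IndexReader reader, String field)
          throws IOException {
        // Assumes the field is indexed and single-valued per doc.
        groupByDoc = FieldCache.DEFAULT.getStrings(reader, field);
      }

      public void collect(int doc, float score) {
        // group is null for docs with no value in this field
        String group = groupByDoc[doc];
        Integer n = (Integer) counts.get(group);
        counts.put(group, new Integer(n == null ? 1 : n.intValue() + 1));
      }

      public Map getCounts() { return counts; }
    }

You would then run the search through it instead of using Hits:

    GroupCountCollector groups = new GroupCountCollector(reader, "category");
    searcher.search(query, groups);
    Map counts = groups.getCounts(); // group value -> hit count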
Chris Hostetter <[EMAIL PROTECTED]> wrote:

: currently, i am iterating through about 200-300 of the top docs and
: creating the groups (so, as of now, the groups are partial). my
: response time HAS to be at most 500-600 milliseconds (query + groupings)
: or my company will probably go with a commercial search engine such as
: FAST or something of the sort

If, when you say you are "iterating through about 200-300" docs, you mean
you are using either the Hits.doc method or the IndexReader.document
method on those 200-300 docs, then I am 99.9999% confident any of the
approaches described so far will be faster than what you are doing now.

: the data i must prove lucene can handle is up to 25 million 2-4Kb
: docs. i dont think it is feasible to create groups on a result set
: consisting of a few thousand results in a 1/2 second, or am i wrong?

With 25 million docs, caching BitSets for every group may not be feasible
(depending on how many groupings you have), but using either the TermEnum
iterating or the FieldCache enumerating described recently, it should be
possible.

If you expect the number of results from each search (a few thousand) to
be a small fraction of the total number of docs (25 million), then my
money is on the FieldCache approach, because you'll only be iterating
over the matching docs (instead of over every doc and then testing to
see if it's in your results).

-Hoss
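For completeness, here is a rough sketch of the cached-BitSet variant Hoss describes, again assuming the 1.4-era API and a placeholder field name. With 25 million docs each BitSet is roughly 3 MB, which is why this only pays off when the number of distinct groups is small:

    import java.io.IOException;
    import java.util.BitSet;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.index.TermDocs;
    import org.apache.lucene.index.TermEnum;

    /** Builds one BitSet of docIds per distinct value of the field. */
    public static Map buildGroupBitSets(IndexReader reader, String field)
        throws IOException {
      Map bitSets = new HashMap(); // field value -> BitSet
      TermEnum terms = reader.terms(new Term(field, ""));
      TermDocs docs = reader.termDocs();
      try {
        do {
          Term t = terms.term();
          if (t == null || !field.equals(t.field())) break; // past our field
          BitSet bits = new BitSet(reader.maxDoc());
          docs.seek(t);
          while (docs.next()) {
            bits.set(docs.doc());
          }
          bitSets.put(t.text(), bits);
        } while (terms.next());
      } finally {
        docs.close();
        terms.close();
      }
      return bitSets;
    }

At query time you would collect the matching docs into a BitSet of your own (e.g. with a HitCollector), then for each group clone its cached BitSet, and() it against the hits, and read cardinality() for the per-group count.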