I think the common use case of TopScoreDocCollector is to request 10 results, then ask for 20 and so on. You ask for N results because you want to display them, or manipulate them in some way.
However, if we do add this to the base PQ, it means an extra check for every put() call. We've tried very hard recently to remove 'if' checks in TopScoreDocCollector and TopFieldCollector, so adding one back does not seem very appealing to me, especially since I IMO the use case of requesting 10K results w/o knowing first there are more or so that match the query, is not so common. Of course, for the what I call common case, if I do know there are 10K results, and request for 10K, but PQ starts w/ 10 (for example) and gradually grows its array, I also lose performance, not just because of extra 'if', but I also allocate ~20K slots (counting re-allocations). You could write your own PQ which does that, although to achieve that we'll need to make some methods on PQ not final. You will also need to write your own TopScoreDocCollector version, since the latter does not accept PQ and is a final class. But perhaps others think it's worth it to add to PQ an extra 'if' and avoid unnecessary large allocations? Or ... can we make PQ's methods not final, and let others extend it and do what they want, not just impl lessThan()? We can also have a factory method for PQ which returns a gradually growing PQ, or an "init once" PQ, the latter will be used by TSDC and TFC and the former can be asked for specifically. Or ... we can do nothing, and say that one can write his own Collector, and use Sun's PQ (or any other), if one has a case like "I need 10K results, but I don't know how many are there, and specifically I want to optimize for the case of 1 result". BTW, notice that Sun's PQ does not have some of the methods we use today, such as insertWithOverflow(), add() and updateTop(). Those were also recently added for better performance. Shai On Mon, Jun 22, 2009 at 2:39 AM, Claudio . <spcicli...@hotmail.com> wrote: > Hi, > > > The PriorityQueue used in TopDocCollector does not optimize the memory > usage. If I do this search: > > TopDocs topDocs = searcher.search(query, null, 10000); > > And only 1 document is returned. The PriorityQueue will create an Object[] > of size 10000 + 1, but only one position of the array is used. > > The Lucene could implement a PriorityQueue with size extensible. > > The PriorityQueue of the Sun is size extensible. Why does not use it? > > Thanks > ------------------------------ > What can you do with the new Windows Live? Find > out<http://www.microsoft.com/windows/windowslive/default.aspx> >