ok I missed 1483 completely. As a side comment, why not add setNextReader to HitCollector and then a getDocId(int doc) method which will do the doc + base arithmetic? I think it's very easy for someone to forget to add that (+ base) to doc. You could then just change TopDocCollector to call getDocId() instead of duplicating it into TopScoreDocCollector.
Isn't that something you'd want all HitCollector implementations to use? I consider some extensions of HitCollector we have - we now will probably want to change them to extend MultiReaderHitCollector, but we'll have to remember to do that +base arithmatic everywhere, instead of calling getDocId(). I understand that changing the call to getDocId is the same as adding "+ base", from an effort perspective, but I think it's better this way. It does involve an additional method call, but I wonder how good compilers will handle that. Anyway, I don't want to add topDocs and getTotalHits to HitCollector, it will destroy its generic purpose. An interface is also problematic, as it just means all of these collectors have these methods declared, but they need to implement them. An abstract class grants you w/ both. So in case you agree that the logic of MultiReaderHitCollector can (and should?) be in HitCollector, we can create an abstract class called ScoringCollector (or if nobody objects TopDocsCollector) which will implement these two methods. In case you disagree, we can have that abstract class extend MultiReaderHitCollector instead. I'm in favor of the first option as at least as it looks in the code, HitCollector is not extended by any class anymore, except TopDocCollector which is marked as deprecated, and 3 anonymous implementations. So it looks like HitCollector itself is "deprecated" as far as the Lucene core code sees it. What do you think? Shai On Mon, Mar 23, 2009 at 4:43 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > > If we're already creating a new TopScoreDocCollector (when was it > > added? I must have been dozing off while this happened...) > > This was LUCENE-1483. > > > How about if we introduce an abstract ScoringCollector (about the > > name later) which implements topDocs() and getTotalHits() and there > > will be several implementations of it, such as: > > TopScoreDocCollector, which sorts the documents by their score, in > > descending order only, TopFieldDocCollector - for sorting by fields, > > and additional sort-by collectors. > > This sounds good... but the challenge is we also need to get both > HitCollector and MultiReaderHitCollector in there. > > HitCollector is the simplest way to create a custom collector. > MultiReaderHitCollector (added with LUCENE-1483) is the more > performant way, since it lets your collector operate per-segment. All > non-deprecated core collectors in Lucene now subclass > MultiReaderHitCollector. > > So would we make separate subclasses for each of them to add > getTotalHits() / topDocs()? EG TopDocsHitCollector and > TopDocsMultiReaderHitCollector? It's getting confusing. > > Or maybe we just add totalHits() and topDocs() to HitCollector even > though for advanced case (non-top-N-collection) the methods would not > be used? > > Or... maybe this is a time when an interface is the lesser evil: we > could make a TopDocs interface that the necessary classes implement? > > Mike > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-dev-h...@lucene.apache.org > >