> I do agree we should be returning null not EMPTY_DOCIDSET since > Filter.getDocIDSet's jdoc clearly states a null return means no docs are > accepted.
I disagree. NULL is a stupid indicator, if something is empty it should return something empty. I also don't like somebody returning NULL instead of Collections.emptySet() or like that. We should document Filter and also Weight.scorer to always return non-null and also supply a Scorer.EMPTY_SCORER. If you want to optimize something, you can always change you loops to first do next() and then exit early. > And I think I'm on the opposite side of the fence on null returns, at least for > advanced APIs -- I'd prefer not to hide information (by returning an empty > instance) in this case. But other cases I think should never return null -- eg > once you have a non-null DocIdSet, then its .iterator() should never return > null. I agree. DocIdSet.iterator() should return EMPTY_DOCIDSETITERATOR. > Mike > > On Tue, Jan 4, 2011 at 5:38 PM, Smiley, David W. <dsmi...@mitre.org> > wrote: > > I'm looking through the trunk code on various implementations of > Filter.getDocIdSet(IndexReader). It is often needed to get an instance of > Terms and then do other work from there. Looking at > MultiTermQueryWrapperFilter, the first set of lines to do this is: > > > > public DocIdSet getDocIdSet(IndexReader reader) throws IOException { > > final Fields fields = MultiFields.getFields(reader); > > if (fields == null) { > > // reader has no fields > > return DocIdSet.EMPTY_DOCIDSET; > > } > > > > final Terms terms = fields.terms(query.field); > > if (terms == null) { > > // field does not exist > > return DocIdSet.EMPTY_DOCIDSET; > > } > > .... > > > > When I look at the javadoc for MultiFields.getFields(reader), I see some > Javadoc (apparently written by Michael McCandless, CC'ed), with the > following javadoc snippet : > > * <p><b>NOTE</b>: this is a slow way to access postings. > > * It's better to get the sub-readers (using {...@link > > * Gather}) and iterate through them > > * yourself. > > > > If this is the case, then why is MultiFields.getFields(reader) used 43 times > across Lucene/Solr whereas ReaderUtil.Gather is only used 5 times? If it's a > TODO then perhaps a JIRA issue needs to be created. I don't find helpful > examples of how to use ReaderUtil.Gather... the existing 5 uses are all within > MultiFields & ReaderUtil. > > > > FWIW, in a Lucene Filter I wrote, I've been using this code snippet > successfully: > > > > Terms terms = reader.fields().terms(fieldName); > > > > On a related topic, I think that if Filter.getDocIdSet() is documented that it > may return null, then it's better code design to consequently return null in > appropriate circumstances instead of DocIdSet.EMPTY_DOCIDSET. That said, > FWIW, I prefer API design that favors non-null when you can get away with > it, like this case. So I'm in favor of making getDocIdSet() be documented to > not return null (and follow through throughout the codebase). Admittedly > some callers might have short-circuit logic. > > > > ~ David Smiley > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional > commands, e-mail: dev-h...@lucene.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org