Hi,
following up on the previous thread "Statically store sub-collections
for search", I'm trying to focus on the root of the problem I have run
into.
First, I build a TermsFilter with potentially many terms for a single field:
-----------------------------------------
List<Term> docnames = new ArrayList<>(resource.getDocIDs().size());
for (String docid : resource.getDocIDs()) {
    docnames.add(new Term("id", docid));
}
TermsFilter filter = new TermsFilter(docnames);
-----------------------------------------
This filter is used to generate a DocIdSet object holding the allowable
documents in a loop over the atomic segments of my IndexReader reader:
-----------------------------------------
for (AtomicReaderContext atomic : reader.leaves()) {
    DocIdSet docids = filter.getDocIdSet(atomic,
            atomic.reader().getLiveDocs());
    DocIdSetIterator iterator = docids.iterator();
    while (iterator.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
        ...
    }
    ...
}
-----------------------------------------
The while loop is never entered, i.e. the iterator yields no documents.
However, neither docids nor the DocIdSetIterator it returns is null. The
same technique works fine with another Filter (a QueryWrapperFilter). Is
this a bug, or am I using the TermsFilter (or the resulting DocIdSet)
in the wrong way? Are there any working examples of how to get a
properly populated DocIdSet from a TermsFilter?
I read that the iterator() method has to be implemented by every
DocIdSet implementation. Also, TermsFilter.getDocIdSet() seems to return
either null or a FixedBitSet, whose iterator() seems to be implemented
by an OpenBitSetIterator.
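To make explicit what I expect from the loop above, here is a small
Lucene-free model of it. It is only a sketch under assumptions:
java.util.BitSet stands in for the FixedBitSet, the Segment class and the
collectGlobalIds method are hypothetical names of mine, and I assume that
ids inside a segment are segment-local and get shifted by a docBase
offset. A null bit set models getDocIdSet() returning null for a segment.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class SegmentIterationSketch {

    /** Hypothetical stand-in for one atomic segment. */
    static final class Segment {
        final int docBase;    // assumed global id of the segment's first document
        final BitSet matches; // segment-local ids accepted by the filter; null = no set returned

        Segment(int docBase, BitSet matches) {
            this.docBase = docBase;
            this.matches = matches;
        }
    }

    /** Collects the global ids of all matching documents across the segments. */
    static List<Integer> collectGlobalIds(List<Segment> segments) {
        List<Integer> result = new ArrayList<>();
        for (Segment segment : segments) {
            if (segment.matches == null) {
                continue; // models a filter returning null for this segment
            }
            // nextSetBit plays the role of nextDoc(); -1 plays the role of NO_MORE_DOCS
            for (int doc = segment.matches.nextSetBit(0); doc >= 0;
                    doc = segment.matches.nextSetBit(doc + 1)) {
                result.add(segment.docBase + doc);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        BitSet first = new BitSet();
        first.set(0);
        first.set(3);
        BitSet third = new BitSet();
        third.set(1);

        List<Segment> segments = new ArrayList<>();
        segments.add(new Segment(0, first));
        segments.add(new Segment(5, null)); // segment without a bit set
        segments.add(new Segment(8, third));

        System.out.println(collectGlobalIds(segments)); // prints [0, 3, 9]
    }
}
```

With a filter that matches, I would expect the Lucene loop to behave like
this model and visit every set bit once per segment. In my case it
behaves as if the set of every segment were empty.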
Best,
Carsten
--
Institut für Deutsche Sprache | http://www.ids-mannheim.de
Projekt KorAP | http://korap.ids-mannheim.de
Tel. +49-(0)621-43740789 | [email protected]
Korpusanalyseplattform der nächsten Generation
Next Generation Corpus Analysis Platform