Hi,
I've got a setup in which I would like to perform an arbitrary query
over one field (typically realised through a WildcardQuery) and get the
matches back through a SpanQuery, because the result payloads are
further processed using Spans.next() and Spans.getPayload(). This works
fine with the following code (extract), using Lucene 4.0.0:
---------------------------------------------------------------------
// these fields are initialized externally through public methods:
private final MultiReader reader;
private final String termString;
private final String fieldname;
private final int maxHits;
private Map<Term, TermContext> termContexts = new HashMap<>();
WildcardQuery wildcard;
Term term = new Term(fieldname, termString);
SpanQuery query; // Lucene query
Spans luceneSpans;
wildcard = new WildcardQuery(term);
query = (SpanQuery) new SpanMultiTermQueryWrapper<>(wildcard).rewrite(reader);
// iterate over the index segments; each one yields its own Spans
for (AtomicReaderContext atomic : reader.getContext().leaves()) {
    luceneSpans = query.getSpans(atomic, matchingTitleIDs.bits(), termContexts);
    while (luceneSpans.next() && total <= maxHits) {
        ...
    }
}
---------------------------------------------------------------------
Now, I'd like to add the option to filter the resulting Spans object by
another WildcardQuery on a different field that contains document
titles. My intuitive approach would have been to use a filter like this:
Filter filter = new QueryWrapperFilter(
    new WildcardQuery(new Term(titlefield, titles)));
The filter is applied in a dedicated method with this line:
DocIdSet matchingTitleIDs =
    filter.getDocIdSet(context, new Bits.MatchAllBits(0));
Subsequently, the getSpans() call from above is replaced by:
luceneSpans = query.getSpans(atomic, matchingTitleIDs.bits(), termContexts);
However, this either throws a NullPointerException when there are no
hits, or has no effect on the results compared to running without the
filter.
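One possible explanation for the second symptom, sketched in plain Java
without any Lucene classes: the DocIdSet documentation says bits() may
return null when the set offers no random access, and a null acceptDocs
argument is normally read as "accept every document". The class and
method names below (SpanFilterSketch, toBits, acceptedDocs) are
hypothetical and only model that behaviour; whether QueryWrapperFilter
really returns such a set in this setup is an assumption.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Plain-Java model of per-segment span filtering; not Lucene API. */
final class SpanFilterSketch {

    /**
     * Copies iterator-style matches (segment-local doc IDs) into a
     * random-access bit set, so that a later lookup by doc ID works.
     */
    static BitSet toBits(int[] matchingDocIds) {
        BitSet bits = new BitSet();
        for (int doc : matchingDocIds) {
            bits.set(doc);
        }
        return bits;
    }

    /**
     * Returns the doc IDs of the span hits that pass the filter,
     * stopping once maxHits hits have been collected.
     * A null acceptDocs models "no random-access bits available" and
     * lets every hit through, i.e. the filter has no effect.
     */
    static List<Integer> acceptedDocs(int[] spanDocs, BitSet acceptDocs, int maxHits) {
        List<Integer> kept = new ArrayList<>();
        for (int doc : spanDocs) {
            if (kept.size() >= maxHits) {
                break;
            }
            if (acceptDocs == null || acceptDocs.get(doc)) {
                kept.add(doc);
            }
        }
        return kept;
    }
}
```

In this model, passing null returns all span hits unchanged, while
passing the result of toBits() restricts them to the matching documents.
If the same applies to the real DocIdSet, its iterator would have to be
copied into a random-access bit set per segment before calling
getSpans().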
I've come across the thread "lucene-4.0: QueryWrapperFilter & docBase"
[1] in which Uwe suggests not to use QueryWrapperFilter, but to use
another Query and to combine it using a Boolean Query in such a
scenario, if I understand correctly. Does this still hold for Lucene 4.0?
However, I am not sure how to use a BooleanQuery here because I need the
SpanQuery result.
Any thoughts about what I'm doing wrong and how to fix this?
Thank you very much!
Carsten
[1]
http://mail-archives.apache.org/mod_mbox/lucene-java-user/201210.mbox/%3CCABY_-Z7r=z0301yf1-1uvbqyw3jf48srpuhe6syt1eh28vn...@mail.gmail.com%3E
--
Institut für Deutsche Sprache | http://www.ids-mannheim.de
Projekt KorAP | http://korap.ids-mannheim.de
Tel. +49-(0)621-43740789 | [email protected]
Korpusanalyseplattform der nächsten Generation
Next Generation Corpus Analysis Platform