On Jul 7, 2011, at 5:14 PM, Jahangir Anwari wrote:
> I did noticed a strange issue though. When the query is just a
> PhraseQuery(e.g. "everlasting glory"), getWeightedSpanTerms() returns all
> the span terms along with their span positions. But when the query is a
> BooleanQuery containing phrase and non-phrase terms(e.g. "everlasting
> glory"+unity), getWeightedSpanTerms() returns all the span terms but the
> span positions are returned only for the phrase terms(i.e. "everlasting" and
> "glory"). Span positions for the non-phrase term(i.e. "unity") is empty. Any
> ideas why this could be happening?
Positions are only collected for "position sensitive" queries. The Highlighter
framework that I plugged this into already runs through the TokenStream one
token at a time - to highlight a TermQuery, there is no need to consult
positions - just highlight every occurrence seen while marching through the
TokenStream. Which means there is no need to find those positions either.
If you are looking for those positions, here is a patch to calculate them for
TermQuerys as well. If you open a JIRA issue, seems like a reasonable option to
add to the class.
Index:
lucene/contrib/highlighter/src/java/org/apache/lucene/search/highlight/WeightedSpanTermExtractor.java
===================================================================
---
lucene/contrib/highlighter/src/java/org/apache/lucene/search/highlight/WeightedSpanTermExtractor.java
(revision 1143407)
+++
lucene/contrib/highlighter/src/java/org/apache/lucene/search/highlight/WeightedSpanTermExtractor.java
(working copy)
@@ -133,7 +133,7 @@
sp.setBoost(query.getBoost());
extractWeightedSpanTerms(terms, sp);
} else if (query instanceof TermQuery) {
- extractWeightedTerms(terms, query);
+ extractWeightedSpanTerms(terms, new
SpanTermQuery(((TermQuery)query).getTerm()));
} else if (query instanceof SpanQuery) {
extractWeightedSpanTerms(terms, (SpanQuery) query);
} else if (query instanceof FilteredQuery) {
- Mark Miller
lucidimagination.com
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]