Do stopfilters create non-contiguous token positions?
I was interested in experimenting with the highlighter and using the
TokenSources.getTokenStream(TermPositionVector
<file:///C:\mysvn\lucene\build\docs\api\org\apache\lucene\index\TermPosition
Vector.html> tpv, boolean
tokenPositionsGuaranteedContiguous) method
The javadocs for this method note that:
tokenPositionsGuaranteedContiguous - true if the token position numbers have
no overlaps or gaps.
The example used for comparison to re-Analyzing the the text includes
stopwords ("timings above were using a stemmer/lowercaser/stopword combo").
I was curious if a stopwords, by definition meant that tokens were not
contiguous? Is this still true if the the query uses the same analyzer and
filters out the same stopwords?
Thanks,
Dan