Hmm, I suspect this is a bug in the position length implementation of
CommonGramsFilter.
This filter inserts additional tokens (bigrams) around stopwords, so
if you have "this is a test" it will create this, this_is, is, is_a, a,
a_test, and so on. It can be viewed as a conditional ShingleFilter.
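
To make the bigram behaviour concrete, here is a rough sketch in plain
Java. It is not the Lucene implementation and does not use the
TokenStream API: the class CommonGramsSketch and the method commonGrams
are names made up for illustration, and it only models which terms are
emitted, not their offsets, position increments or position lengths.
It assumes "this", "is" and "a" are all in the common-words set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java illustration only; this is NOT Lucene's CommonGramsFilter code.
class CommonGramsSketch {

    // Emits every input token, plus a "left_right" bigram right after the
    // left token whenever either side of an adjacent pair is a common word.
    static List<String> commonGrams(List<String> tokens, Set<String> commonWords) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            String current = tokens.get(i);
            out.add(current); // the unigram is always kept
            if (i + 1 < tokens.size()) {
                String next = tokens.get(i + 1);
                if (commonWords.contains(current) || commonWords.contains(next)) {
                    out.add(current + "_" + next); // conditional bigram
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed common-words set for this example.
        Set<String> commonWords = new HashSet<>(Arrays.asList("this", "is", "a"));
        List<String> tokens = Arrays.asList("this", "is", "a", "test");
        // prints [this, this_is, is, is_a, a, a_test, test]
        System.out.println(commonGrams(tokens, commonWords));
    }
}
```

A pair with no common word on either side produces no bigram, which is
the "conditional" part compared to a plain shingle filter.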
But
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-4.x/157/
1 tests failed.
REGRESSION: org.apache.lucene.analysis.core.TestRandomChains.testRandomChains
Error Message:
last stage: inconsistent endOffset at pos=41: 7 vs 19; token=i_i i i i i u i i
u f i i u f d i i u f d s i i u f d s