Hi,

First posting to the list, but here goes.
I'm using WordDelimiterGraphFilter on a field and came across a curious additional positional "hole" generated by the filter while playing with the analysis tool. For the input "wibble , wobble" (a space either side of the comma, so it's a separate token), the output introduces an additional positional hole after the comma, i.e.

    Term      Position
    Wibble    1
    ,         2
    Wobble    4 *

The positionLength for each is 1, so there's no obvious graph span going on. It's not just the comma; any punctuation will do, e.g. "wibble ! wobble".

I know it's a bit contrived, and it doesn't break anything in production, but it just puzzled me. The question is: is this by design? It's not the behaviour of the old WordDelimiterFilter.

Setup: Solr 6.6.3

Field:

    <fieldType name="text_en_allies" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.WordDelimiterGraphFilterFactory"
                generateWordParts="1" splitOnNumerics="0" generateNumberParts="1"
                catenateWords="1" catenateNumbers="1" catenateAll="1"
                splitOnCaseChange="1" preserveOriginal="1" stemEnglishPossessive="1"/>
        ...
      </analyzer>

Thanks for any insight.

Kelvyn Scrupps
Developer for Allies Computing
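
For reference, the hole can be restated in terms of position increments. Below is a minimal sketch in Python, assuming the positions shown by the analysis tool are the running sum of each token's position increment (as with Lucene's PositionIncrementAttribute). The helper names are hypothetical and the code does not run the filter itself; it only shows that positions 1, 2, 4 imply an increment of 2 on the last token.

```python
from itertools import accumulate


def positions(increments):
    """Absolute positions as the running sum of position increments."""
    return list(accumulate(increments))


def holes(increments):
    """Positions skipped because some token carried an increment > 1."""
    skipped = []
    pos = 0
    for inc in increments:
        # An increment of n skips the n - 1 positions before the token.
        skipped.extend(range(pos + 1, pos + inc))
        pos += inc
    return skipped


terms = ["Wibble", ",", "Wobble"]
observed = [1, 1, 2]  # increments implied by the reported positions 1, 2, 4
no_hole = [1, 1, 1]   # increments that would give positions 1, 2, 3

print(list(zip(terms, positions(observed))))  # [('Wibble', 1), (',', 2), ('Wobble', 4)]
print(holes(observed))                        # [3]
print(holes(no_hole))                         # []
```

So the question amounts to whether the increment of 2 on the token following a punctuation-only token is intended.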