Hello Stephen,

Using Solr 8.8.1, I tried to reproduce your strange problem: I copied your schema and indexed a single document. As expected, I got exactly one result for all four combinations, with both the default Lucene QParser and the Edismax QParser.
So it appears to work just fine here on 8.8.1. The WordDelimiterGraphFilter is relatively new and has had only a few issues. Maybe you can check whether it works without the graph-type token filters, using the old WordDelimiterFilter. That one is tried and tested.

Regards,
Markus

On Fri, 2 Sep 2022 at 21:57, Stephen Lewis Bianamara <[email protected]> wrote:

> Hey Solr Users,
>
> I've noticed an odd behavior between the word delimiter graph filter and
> the sow parameter. When the word delimiter graph filter gets invoked and
> sow=true, it is possible to miss results that involve alphanumeric
> splitting but aren't exact matches. So if I have a document with
> "ABC123 DEF456_GHI", the combination of sow=true and
> WordDelimiterGraphFilter seems to break queries for "def456". See the full
> repro below.
>
> I believe this is a bug. Could someone please take a look at my repro and
> confirm it, or let me know if something is misconfigured here?
>
> *Repro*
>
> - solr 9 with this field type definition for field "test_en"
>
> <fieldType name="text_en" class="solr.TextField" positionIncrementGap="100"
>     autoGeneratePhraseQueries="true">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.WordDelimiterGraphFilterFactory" generateWordParts="1"
>         generateNumberParts="1" catenateAll="1" preserveOriginal="1"
>         splitOnCaseChange="1"/>
>     <filter class="solr.FlattenGraphFilterFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.WordDelimiterGraphFilterFactory" generateWordParts="1"
>         generateNumberParts="1" catenateAll="1" preserveOriginal="1"
>         splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> - Create document {"id": 1, "test_en": ["ABC123 DEF456_GHI"]}
> - Query the following; all should hit, but one combination misses
>   - sow=true, q=def456
>     - misses
>   - sow=true, q=abc123
>     - hits
>   - sow=false, q=def456
>     - hits
>   - sow=false, q=abc123
>     - hits
>
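
For reference, the suggestion to try the older non-graph filter could look roughly like the field type below. This is only a sketch, not a tested configuration: the name "text_en_nograph" is hypothetical, the filter attributes are copied unchanged from the original definition, and solr.WordDelimiterFilterFactory is the older, deprecated filter, so its availability should be checked against the Solr version in use. Because that filter does not emit a token graph, the FlattenGraphFilterFactory step is left out of the index analyzer.

```xml
<!-- Hypothetical variant of text_en that swaps in the non-graph word delimiter filter -->
<fieldType name="text_en_nograph" class="solr.TextField" positionIncrementGap="100"
    autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Same options as the original, but on the older non-graph filter -->
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
        generateNumberParts="1" catenateAll="1" preserveOriginal="1"
        splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
        generateNumberParts="1" catenateAll="1" preserveOriginal="1"
        splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory"/>
  </analyzer>
</fieldType>
```

After changing the field type, the document would need to be reindexed before rerunning the four sow/q combinations, since the index-time analysis chain has changed.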
