Ere Maijala created LUCENE-7698:
-----------------------------------

             Summary: CommonGramsQueryFilter in the query analyzer chain breaks phrase queries
                 Key: LUCENE-7698
                 URL: https://issues.apache.org/jira/browse/LUCENE-7698
             Project: Lucene - Core
          Issue Type: Bug
          Components: core/queryparser
    Affects Versions: 6.4.1
            Reporter: Ere Maijala


CommonGramsQueryFilter breaks phrase queries. The behavior also seems to change 
with addition or removal of adjacent terms.
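
For reference, here is a rough sketch of the token stream I would expect the query-side filter to produce. It is a simplified model of the documented behavior (a unigram is dropped when it is part of a bigram), not Lucene's implementation: it ignores positions, offsets and case handling, and the function name is made up for illustration. It assumes "with" is the only common word, as in the steps below.

```python
def common_grams_query(tokens, common_words):
    """Simplified model of the expected CommonGramsQueryFilter output.

    A bigram "a_b" is formed for adjacent tokens when either one is a
    common word; a unigram is kept only if it is not part of any bigram.
    Hypothetical helper for illustration, not a Lucene API.
    """
    out = []
    for i, tok in enumerate(tokens):
        # Does this token form a bigram with its left or right neighbor?
        in_prev = i > 0 and (tokens[i - 1] in common_words or tok in common_words)
        in_next = i + 1 < len(tokens) and (
            tok in common_words or tokens[i + 1] in common_words
        )
        if not (in_prev or in_next):
            out.append(tok)  # standalone word, keep the unigram
        if in_next:
            out.append(tok + "_" + tokens[i + 1])
    return out


short = common_grams_query(["ipod", "with", "video"], {"with"})
long_ = common_grams_query(
    ["apple", "60", "gb", "ipod", "with", "video", "playback", "black"],
    {"with"},
)
print(short)
print(long_)
```

Under this model both phrases should yield ipod_with and with_video, which is not what the steps below produce.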

Steps to reproduce:

1.) Download and extract Solr (in my test case version 6.4.1) somewhere.
2.) Edit 
server/solr/configsets/sample_techproducts_configs/conf/managed-schema and 
change the text_general fieldType by adding CommonGramsFilter (index analyzer) 
and CommonGramsQueryFilter (query analyzer) before the StopFilter:

    <fieldType name="text_general" class="solr.TextField"
               positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.CommonGramsFilterFactory" ignoreCase="true"
                words="stopwords.txt" />
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt" />
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt"
                ignoreCase="true" expand="false"/>
        -->
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.CommonGramsQueryFilterFactory" ignoreCase="true"
                words="stopwords.txt"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt" />
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
                ignoreCase="true" expand="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

3.) Add "with" to 
server/solr/configsets/sample_techproducts_configs/conf/stopwords.txt and make 
sure the file has correct line endings (the copy extracted from the Solr zip 
seems to contain DOS/Windows line endings, which may break things).

4.) Run the techproducts example with "bin/solr -e techproducts"

5.) Browse to 
<http://localhost:8983/solr/techproducts/select?q=%22iPod%20with%20Video%22&debugQuery=true>

6.) Observe that parsedquery in the debug output is empty

7.) Browse to 
<http://localhost:8983/solr/techproducts/select?q=%22Apple%2060%20GB%20iPod%20with%20Video%20Playback%20Black%22&debugQuery=true>

8.) Observe that parsedquery contains ipod_with as expected but not with_video.
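
Steps 5 to 8 can also be scripted. The following is a minimal sketch, assuming the techproducts example is running on localhost:8983 and that wt=json is added to the request so the debug section can be read as JSON (the URLs above do not include that parameter). The helper names are made up for illustration.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Select handler of the techproducts example from step 4.
BASE = "http://localhost:8983/solr/techproducts/select"


def build_url(phrase):
    """Build the debugQuery URL for a quoted phrase query."""
    params = {"q": '"%s"' % phrase, "debugQuery": "true", "wt": "json"}
    # quote_via=quote encodes spaces as %20, matching the URLs in steps 5 and 7.
    return BASE + "?" + urlencode(params, quote_via=quote)


def parsed_query(response):
    """Pull debug.parsedquery out of a decoded JSON response (empty if absent)."""
    return response.get("debug", {}).get("parsedquery", "")


def fetch_parsed_query(phrase):
    """Run the query against the local Solr instance and return parsedquery."""
    with urlopen(build_url(phrase)) as resp:
        return parsed_query(json.load(resp))
```

Calling fetch_parsed_query("iPod with Video") and fetch_parsed_query("Apple 60 GB iPod with Video Playback Black") against the running example should return the parsedquery strings described in steps 6 and 8.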



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
