Hi,

I have updated my solr instance from 4.5.1 to 4.7.1. Now the parsed query seems to be not correct.

Query: /*q=*:*&fq=title:T&E&debug=true */

Before the update the parsed filter query is "*/+title:t&e +title:t +title:e/*". After the update the parsed filter query is "*/+((title:t&e title:t)/no_coord) +title:e/*". It seems like a bug within the query parser.

I also have validated the parsed filter query with the analysis component. The result was "*/+title:t&e +title:t +title:e/*".

The behavior is equal on all special characters that split words into 2 parts.

I use the following WordDelimiterFilter on query side:

<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0" splitOnNumerics="0" preserveOriginal="1"/>

Thanks.

Johannes


Additional informations:

Debug before the update:

<lstname="debug">
<strname="rawquerystring">*:*</str>
<strname="querystring">*:*</str>
<strname="parsedquery">MatchAllDocsQuery(*:*)</str>
<strname="parsedquery_toString">*:*</str>
<lstname="explain"/>
<strname="QParser">LuceneQParser</str>
<arrname="filter_queries">
<str>(title:((T&E)))</str>
</arr>
*<arrname="parsed_filter_queries"> **
**<str>+title:t&e +title:t +title:e</str> **
**</arr> *
...

Debug after the update:

<lstname="debug">
<strname="rawquerystring">*:*</str>
<strname="querystring">*:*</str>
<strname="parsedquery">MatchAllDocsQuery(*:*)</str>
<strname="parsedquery_toString">*:*</str>
<lstname="explain"/>
<strname="QParser">LuceneQParser</str>
<arrname="filter_queries">
<str>(title:((T&E)))</str>
</arr>
*<arrname="parsed_filter_queries"> **
**<str>+((title:t&e title:t)/no_coord) +title:e</str> **
**</arr>*
...

"title"-field definition:

<fieldType name="text_title" class="solr.TextField" positionIncrementGap="100" omitNorms="true">
      <analyzer type="index">
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping.txt"/>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1" splitOnNumerics="1" preserveOriginal="1" stemEnglishPossessive="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping.txt"/>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0" splitOnNumerics="0" preserveOriginal="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

Reply via email to