Hi,

  I'm trying to figure out search behaviour related to similar terms, one
with and without the hyphen. Both of them are generating a different result
set , the search without the hyphen is bringing back more result compared
to the other. Here's the fieldtype definition :

<fieldType name="text" class="solr.TextField" positionIncrementGap="100"
autoGeneratePhraseQueries="true">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" />
<filter class="solr.SynonymFilterFactory" synonyms="synonyms/synonyms.txt"
ignoreCase="true" expand="true"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
generateNumberParts="1" catenateWords="0" catenateNumbers="1"
catenateAll="0" splitOnCaseChange="0"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms/synonyms.txt"
ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" />
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
generateNumberParts="1" catenateWords="0" catenateNumbers="1"
catenateAll="0" splitOnCaseChange="0"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
</fieldType>

If I run the search term through the analyzer, the final indexed data for
both term (hyphen and without) results in  --> *inventor templat*

I was under the impression that based on my analyzers, both search term
will produce same result.

Here's the output from debug and splainer.

*Inventor-template*
*-------------------------*

<str name="parsedquery">(+DisjunctionMaxQuery(((+CommandSrch:inventor
+CommandSrch:templat) | text:"inventor templat"^1.5 | Description:"inventor
templat"^2.0 | title:"inventor templat"^3.5 | keywords:"inventor
templat"^1.2)~0.01) Source2:sfdcarticles^9.0 Source2:downloads^5.0
FunctionQuery(1.0/(3.16E-11*float(ms(const(1472083200000),date(PublishDate)))+1.0)))/no_coord</str>

<str name="parsedquery_toString">+((+CommandSrch:inventor
+CommandSrch:templat) | text:"inventor templat"^1.5 | Description:"inventor
templat"^2.0 | title:"inventor templat"^3.5 | keywords:"inventor
templat"^1.2)~0.01
1.0/(3.16E-11*float(ms(const(1472083200000),date(PublishDate)))+1.0)</str>

>From Splainer:

10.974786 Sum of the following:
 9.203462 Dismax (max plus:0.01 times others)
   9.198681 title:"inventor templat"

   0.4781131 text:"inventor templat"

 1.7644342 Source2:sfdcarticles

 0.006889837 
1.0/(3.16E-11*float(ms(const(1472083200000),date(PublishDate)))+1.0)


*Inventor template*
*--------------------------*

<str name="parsedquery">(+(+DisjunctionMaxQuery((CommandSrch:inventor |
text:inventor^1.5 | Description:inventor^2.0 | title:inventor^3.5 |
keywords:inventor^1.2)~0.01) +DisjunctionMaxQuery((CommandSrch:templat |
text:templat^1.5 | Description:templat^2.0 | title:templat^3.5 |
keywords:templat^1.2)~0.01)) Source2:sfdcarticles^9.0 Source2:downloads^5.0
FunctionQuery(1.0/(3.16E-11*float(ms(const(1472083200000),date(PublishDate)))+1.0)))/no_coord</str>

<str name="parsedquery_toString">+(+(CommandSrch:inventor |
text:inventor^1.5 | Description:inventor^2.0 | title:inventor^3.5 |
keywords:inventor^1.2)~0.01 +(CommandSrch:templat | text:templat^1.5 |
Description:templat^2.0 | title:templat^3.5 | keywords:templat^1.2)~0.01)
Source2:sfdcarticles^9.0 Source2:downloads^5.0
1.0/(3.16E-11*float(ms(const(1472083200000),date(PublishDate)))+1.0)</str>

>From splainer :

9.915069 Sum of the following:
 5.03947 Dismax (max plus:0.01 times others)
   5.038846 title:templat

   0.062400598 text:templat

 4.767776 Dismax (max plus:0.01 times others)
   4.7674117 title:inventor

   0.03642158 text:inventor

 0.098686054 Source2:CloudHelp

 0.009136423
1.0/(3.16E-11*float(ms(const(1472083200000),date(PublishDate)))+1.0)


I'm using edismax.


Just wondering what I'm missing here. Any help will be appreciated.

Regards,
Shamik

Reply via email to