Hi,
This is my first post. I have been working with Lucene for about 4
weeks and Solr for just about 10 days. We are going to convert our
site search over to Solr as soon as we figure out some of the nuances.
As I was testing the synonym feature to decide how we could best
use it, I searched for iPod (I know it is the usual example, but we
actually sell them). I was shocked when the search results were nothing
close to an iPod.
Looking closer, I could see that the description of the top result
contains the word iPod exactly once. With debug on, that is confirmed
(this is the first result):
<str name="id=502999430,internal_docid=6247">
152529.23 = (MATCH) fieldWeight(search_text:ipod in 6247), product of:
1.0 = tf(termFreq(search_text:ipod)=1)
3.7238584 = idf(docFreq=522)
40960.0 = fieldNorm(field=search_text, doc=6247)
</str>
Here is the explainOther output for an actual iPod SKU (in the same search):
<str name="otherQuery">id:650085488</str>
<lst name="explainOther">
<str name="id=650085488,internal_docid=6985">
1.0473351 = (MATCH) fieldWeight(search_text:ipod in 6985), product of:
3.0 = tf(termFreq(search_text:ipod)=9)
3.7238584 = idf(docFreq=522)
0.09375 = fieldNorm(field=search_text, doc=6985)
</str>
The term frequency is higher for the actual iPod and the idf is
identical, so the only factor that can explain the ranking is 'fieldNorm',
which I do not understand in the context of relevancy. Does this have
to do with omitNorms in some way?
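To check my reading of the two explain outputs above, here is a small sketch that redoes the arithmetic. It assumes that fieldWeight is simply tf * idf * fieldNorm and that tf is the square root of termFreq, which is consistent with the values shown (1.0 for 1 occurrence, 3.0 for 9). The function name field_weight is my own, not anything from Lucene or Solr.

```python
import math

def field_weight(term_freq, idf, field_norm):
    """Recompute the product from the explain output (hypothetical helper).

    Assumes tf = sqrt(termFreq), which matches the debug values above.
    """
    tf = math.sqrt(term_freq)
    return tf * idf * field_norm

IDF = 3.7238584  # idf(docFreq=522), copied from the debug output

# Top result: one occurrence of "ipod", fieldNorm = 40960.0
top_hit = field_weight(1, IDF, 40960.0)

# Actual iPod SKU: nine occurrences, fieldNorm = 0.09375
real_ipod = field_weight(9, IDF, 0.09375)

print(top_hit)              # roughly 152529.2
print(real_ipod)            # roughly 1.04734
print(top_hit / real_ipod)  # how far apart the two scores are
```

The tf factor favours the actual iPod by 3x and the idf is the same for both, so the fieldNorm of 40960.0 on the first document is what decides the ranking.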
On a related note, I also tried a dismax query with the following
line in it:
<str name="qf">search_text^0.5 brand^10.0 keywords^5.0 title^20.0
sub_title^1.5 model^2.0 attribute^1.1</str>
As an experiment I boosted the title heavily, since this is where the
term iPod occurs most often. It had no effect; in fact, the boost did
not seem to be applied at all. The title field was not used in scoring,
only search_text, even though title is indexed.
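One thing I would like to rule out is the field type: title is declared as type "string" in the schema, while search_text is type "text". The sketch below is only an illustration of that idea. It assumes a string field is indexed as a single untokenized, case-sensitive term and that a text field is at least split on whitespace and lowercased (the real analyzer chain does more). The two helper functions and the sample title are hypothetical.

```python
def string_field_terms(value):
    # Assumption: a "string" field keeps the whole value as one exact term.
    return {value}

def text_field_terms(value):
    # Assumption: a "text" field is split on whitespace and lowercased.
    # This is a simplification of a real analyzer chain.
    return set(value.lower().split())

title = "Apple iPod nano 8GB"  # hypothetical title, not taken from our index
query_term = "ipod"

# Does the single query term match any indexed term of the field?
print(query_term in string_field_terms(title))  # False
print(query_term in text_field_terms(title))    # True
```

If that assumption holds, a query for ipod could only match a string-typed title whose entire value is exactly "ipod", which would explain why the title boost never shows up in the scores.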
Here are the relevant parts of the schema:
<field name="id" type="string" indexed="true" stored="true"
required="true" />
<field name="brand" type="string" indexed="true" stored="true" />
<field name="model" type="string" indexed="true" stored="true" />
<field name="manufacturer_model" type="string" indexed="true"
stored="true" />
<field name="keywords" type="string" indexed="true"
stored="false" />
<field name="title" type="string" indexed="true" stored="true" />
<field name="sub_title" type="string" indexed="true"
stored="true" />
<field name="attribute" type="string" indexed="true" stored="true"
multiValued="true" />
<field name="type" type="string" indexed="true" stored="true" />
<field name="description_category" type="string" indexed="true"
stored="true" />
<field name="description" type="string" indexed="true"
stored="true" />
<field name="brand_id" type="string" indexed="false"
stored="true" />
<field name="code" type="string" indexed="false" stored="true" />
<field name="color" type="string" indexed="true" stored="true" />
<field name="description_category_id" type="string"
indexed="false" stored="true" />
<field name="display_price" type="sfloat" indexed="false"
stored="true" />
<field name="line_item_price" type="sfloat" indexed="true"
stored="true" />
<field name="main_category" type="string" indexed="true"
stored="true" />
<field name="main_category_id" type="string" indexed="false"
stored="true" />
<field name="regular_price" type="sfloat" indexed="false"
stored="true" />
<field name="sku" type="string" indexed="true" stored="true" />
<field name="type_id" type="string" indexed="false" stored="true" />
<field name="upc" type="string" indexed="true" stored="true" />
<field name="size" type="string" indexed="true" stored="true" />
<field name="search_text" type="text" indexed="true"
stored="false" multiValued="true" termVectors="true"/>
<defaultSearchField>search_text</defaultSearchField>
<copyField source="brand" dest="search_text"/>
<copyField source="model" dest="search_text"/>
<copyField source="manufacturer_model" dest="search_text"/>
<copyField source="keywords" dest="search_text"/>
<copyField source="title" dest="search_text"/>
<copyField source="sub_title" dest="search_text"/>
<copyField source="attribute" dest="search_text"/>
<copyField source="description_category" dest="search_text"/>
<copyField source="type" dest="search_text"/>
<copyField source="description" dest="search_text"/>
<copyField source="main_category" dest="search_text"/>
<copyField source="sku" dest="search_text"/>
<copyField source="upc" dest="search_text"/>
Thanks to all who are willing to take a look at this and help.
----------------------------------------------------
Tim Christensen
Director Media & Technology
Vann's Inc.
406-203-4656
[EMAIL PROTECTED]
http://www.vanns.com