Thanks, Ryan. I now know the reason why.
Before I explain it, let me correct a mistake I made in my earlier
mail. I was not using the first document in the XML. Instead, it
was this one:
<doc>
  <field name="id">IW-02</field>
  <field name="name">iPod &amp; iPod Mini USB 2.0 Cable</field>
  <field name="manu">Belkin</field>
  <field name="cat">electronics</field>
  <field name="cat">connector</field>
  <field name="features">car power adapter for iPod, white</field>
  <field name="weight">2</field>
  <field name="price">11.50</field>
  <field name="popularity">1</field>
  <field name="inStock">false</field>
</doc>

The reason I was getting strange results was the character "i".
Here is what I learnt from the debug info:

"debug":{
  "rawquerystring":"id:neardup06",
  "querystring":"id:neardup06",
  "parsedquery":"features:og features:en features:til features:er
features:af features:der features:ts features:se features:i features:p
features:pet features:brag features:efter features:zombier features:k
features:tilbag features:ala features:sviner features:folk
features:klassisk features:resid features:horder features:lidt
features:man features:denn",
  "parsedquery_toString":"features:og features:en features:til
features:er features:af features:der features:ts features:se
features:i features:p features:pet features:brag features:efter
features:zombier features:k features:tilbag features:ala
features:sviner features:folk features:klassisk features:resid
features:horder features:lidt features:man features:denn",
  "explain":{
        "id=IW-02,internal_docid=8":"\n0.0050230525 = (MATCH) product of:\n
0.12557632 = (MATCH) sum of:\n    0.12557632 = (MATCH)
weight(features:i in 8), product of:\n      0.17474915 =
queryWeight(features:i), product of:\n        1.9162908 =
idf(docFreq=3)\n        0.09119135 = queryNorm\n      0.71860904 =
(MATCH) fieldWeight(features:i in 8), product of:\n        1.0 =
tf(termFreq(features:i)=1)\n        1.9162908 = idf(docFreq=3)\n
 0.375 = fieldNorm(field=features, doc=8)\n  0.04 = coord(1/25)\n"}}}
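
To make the explain output above easier to read, here is a small sketch that recomputes the score from the numbers it reports. The variable names are mine; the values are copied from the output, and the formula is just the nesting of products shown there.

```python
# Values copied from the "explain" entry for id=IW-02 above.
idf = 1.9162908          # idf(docFreq=3)
query_norm = 0.09119135  # queryNorm
tf = 1.0                 # tf(termFreq(features:i)=1)
field_norm = 0.375       # fieldNorm(field=features, doc=8)
coord = 1 / 25           # coord(1/25): 1 of the 25 query terms matched

# queryWeight(features:i) = idf * queryNorm
query_weight = idf * query_norm
# fieldWeight(features:i in 8) = tf * idf * fieldNorm
field_weight = tf * idf * field_norm
# Final score = (queryWeight * fieldWeight) * coord
score = query_weight * field_weight * coord

print(f"queryWeight={query_weight:.8f}")
print(f"fieldWeight={field_weight:.8f}")
print(f"score={score:.10f}")
```

The only term contributing to the score is features:i, and the coord factor of 1/25 is what makes the final score so small.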

The field "features" uses the default fieldtype, "text", in schema.xml.
The problem was solved by adding the character "i" to the
stopwords.txt file. The standalone "i"s in document 2 (neardup06) were
being matched with the "i" in "iPod" of document 1 (IW-02).

I still have to figure out why a single character - "i" - matched the "i" in
a word - "iPod".
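
My current guess, not yet verified against my schema.xml, is that the "text" fieldtype includes a word-delimiter filter that splits tokens on case changes, as the one in the Solr example schema does. If so, "iPod" would be indexed as "i" and "pod". The sketch below only imitates that behaviour; split_on_case_change is a hypothetical helper of mine, not Solr code.

```python
import re

def split_on_case_change(token):
    """Hypothetical, simplified imitation of splitting a token on case
    transitions and then lowercasing the parts. Not Solr's actual logic."""
    # An optional capital followed by lowercase letters, or a run of
    # capitals, or a run of digits, each becomes its own part.
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+|\d+", token)
    return [p.lower() for p in parts]

print(split_on_case_change("iPod"))
```

If the real filter behaves like this, the single-letter term "i" from the query document would match the "i" part of "iPod", which would explain the hit on IW-02.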

Regards,
Rishabh

On 22/11/2007, Ryan McKinley <[EMAIL PROTECTED]> wrote:
>
> >
> > Now when I run the following query:
> >
> http://localhost:8080/solr/mlt?q=id:neardup06&mlt.fl=features&mlt.mindf=1&mlt.mintf=1&mlt.displayTerms=details&wt=json&indent=on
> >
>
> try adding:
>   &debugQuery=on
>
> to your query string and you can see why each document matches...
>
> My guess is that "features" uses a text field with stemming and a
> stemmed word matches
>
> ryan
>
