Re: please help explaining debug output

2011-07-28 Thread Erick Erickson
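The numbers in the idf() line are document frequencies: how many documents
in the entire index contain each term in that field. They say nothing about
the specific document.

So it means that each term is in that field for some document somewhere,
while tf(phraseFreq=0.0) says the phrase was not found in that particular
document, I believe...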
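
As a rough illustration of why the idf() line is index-wide, here is a minimal sketch of the classic Lucene idf formula, 1 + ln(numDocs / (docFreq + 1)), with a phrase's idf taken as the sum over its terms. The index size is not given anywhere in this thread, so NUM_DOCS below is a placeholder assumption and the result will not reproduce 14.597603 exactly.

```python
import math
from typing import Iterable


def idf(doc_freq: int, num_docs: int) -> float:
    """Classic Lucene-style idf: rarer terms (lower doc_freq) score higher."""
    return math.log(num_docs / (doc_freq + 1)) + 1.0


def phrase_idf(doc_freqs: Iterable[int], num_docs: int) -> float:
    """For a phrase, the idf values of the individual terms are summed."""
    return sum(idf(df, num_docs) for df in doc_freqs)


# ASSUMPTION: placeholder index size; the thread does not state it.
NUM_DOCS = 10_000_000

# Document frequencies copied from the explain output: laser=26731, jet=12685.
laser_jet_idf = phrase_idf([26731, 12685], NUM_DOCS)
print(laser_jet_idf)
```

Nothing in this calculation looks at any single document, which is why the same idf value shows up for matching and non-matching documents alike.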

Which leads me to wonder whether the document is getting indexed as you
expect, although there's nothing in the data you've provided that I can
point to as the culprit. It all looks like it *should* work.

If you can get a copy of Luke and look at the document in question,
and/or look at the schema browser for that particular field, it might
help, but frankly I'm at a loss to understand what the problem is...

Sorry I can't be of more help.
Erick


RE: please help explaining debug output

2011-07-26 Thread Robert Petersen
That didn't help.  This seems like another case where I should get matches but
don't, and this time it is only for some documents.  Others with similar
content match just fine.  The 'explain other' section of the debug output for
a non-matching document seems to say the term frequency is 0 for my
problematic term, although I know it is in the content.

I ended up making a synonym to do what the analysis stack *should* be doing:
splitting LaserJet on case changes.  That is, putting "LaserJet, laser jet" in
the synonyms at index time makes this work.  I don't know why, though.
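
To make "splitting on case changes" concrete, here is a minimal sketch of that kind of split. It is a simplified stand-in and not Solr's actual WordDelimiterFilter; the function name and the regular expression are illustrative assumptions. It splits on lower-to-upper case changes and on letter/digit boundaries, then lowercases the parts.

```python
import re
from typing import List

# Uppercase runs not followed by lowercase, capitalized or lowercase words, digit runs.
_PART = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def split_word_parts(token: str) -> List[str]:
    """Hypothetical helper: split a token into lowercased sub-words."""
    return [part.lower() for part in _PART.findall(token)]


print(split_word_parts("LaserJet"))  # sub-words of the product name
print(split_word_parts("P1102W"))    # sub-words of the model number
```

With this split, "LaserJet" and "P1102W" yield the same sub-words that appear in the parsed query (laser jet, p 1102 w), which is what the index side would also need to contain for the phrase clauses to match.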

Question:  Does this debug output mean it is matching the terms, but the term
frequency vector is returning 0 for the frequency of this term?  That is, does
this mean the term is in the doc but not in the tf array?

0.0 = no match on required clause (moreWords:laser jet)
  0.0 = weight(moreWords:laser jet in 32497), product of:
    0.60590804 = queryWeight(moreWords:laser jet), product of:
      14.597603 = idf(moreWords: laser=26731 jet=12685)
      0.041507367 = queryNorm
    0.0 = fieldWeight(moreWords:laser jet in 32497), product of:
      0.0 = tf(phraseFreq=0.0)
      14.597603 = idf(moreWords: laser=26731 jet=12685)
      0.078125 = fieldNorm(field=moreWords, doc=32497)
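
One way to read phraseFreq=0.0: the phrase clause counts places where the query tokens sit in consecutive positions, so both words must exist as separate indexed tokens. The sketch below is a simplified exact-phrase counter, not Lucene's implementation, and both token lists are hypothetical; the thread never establishes what is actually indexed for document 32497.

```python
from typing import List


def phrase_freq(doc_tokens: List[str], phrase: List[str]) -> int:
    """Count positions where the phrase tokens occur consecutively (no slop)."""
    n = len(phrase)
    return sum(
        1
        for i in range(len(doc_tokens) - n + 1)
        if doc_tokens[i:i + n] == phrase
    )


# HYPOTHETICAL token streams for the title "HP LaserJet P1102W Monochrome Laser Printer".
split_tokens = ["hp", "laser", "jet", "p", "1102", "w", "monochrome", "laser", "printer"]
unsplit_tokens = ["hp", "laserjet", "p", "1102", "w", "monochrome", "laser", "printer"]

print(phrase_freq(split_tokens, ["laser", "jet"]))    # phrase present as adjacent tokens
print(phrase_freq(unsplit_tokens, ["laser", "jet"]))  # "laser" exists, but never next to "jet"
```

If the indexed tokens looked like the second list, the p 1102 w clause would still match while the laser jet clause would not, which is the pattern in the explain output. Luke or the analysis page would show which case applies.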





please help explaining debug output

2011-07-25 Thread Robert Petersen
I have three documents with the following product titles in a text field
called moreWords, with an analysis stack matching the Solr example text
field definition.

 

1.   HP LaserJet P1102W Monochrome Laser Printer
http://www.buy.com/prod/hp-laserjet-p1102w-monochrome-laser-printer/q/loc/101/213824965.html

2.   HP CE285A (85A) Remanufactured Black Toner Cartridge for
LaserJet M1212nf, P1102, P1102W Series
http://www.buy.com/prod/hp-ce285a-85a-remanufactured-black-toner-cartridge-for-laserjet/q/loc/101/217145536.html

3.   Black HP CE285A Toner Cartridge For LaserJet P1102W, LaserJet
M1130, LaserJet M1132, LaserJet M1210
http://www.buy.com/prod/black-hp-ce285a-toner-cartridge-for-laserjet-p1102w-laserjet-m1130/q/loc/101/222045267.html

 

A search for P1102W matches (2) and (3), but not (1) above.  Can someone
explain the debug output?  It looks like I am getting a non-match on (1)
because the term frequency is zero.  Am I reading that right?  If so, how
could that be?  The searched terms appear equivalently in all three docs.  I
don't get it.

 

 

<lst name="debug">
<str name="rawquerystring">p1102w LaserJet </str>
<str name="querystring">p1102w LaserJet </str>
<str name="parsedquery">+PhraseQuery(moreWords:p 1102 w) +PhraseQuery(moreWords:laser jet)</str>
<str name="parsedquery_toString">+moreWords:p 1102 w +moreWords:laser jet</str>
<lst name="explain">
<str name="222045267">
3.64852 = (MATCH) sum of:
  2.4758534 = weight(moreWords:p 1102 w in 6667236), product of:
    0.7955347 = queryWeight(moreWords:p 1102 w), product of:
      19.166107 = idf(moreWords: p=189166 1102=1135 w=445720)
      0.041507367 = queryNorm
    3.1121879 = fieldWeight(moreWords:p 1102 w in 6667236), product of:
      1.7320508 = tf(phraseFreq=3.0)
      19.166107 = idf(moreWords: p=189166 1102=1135 w=445720)
      0.09375 = fieldNorm(field=moreWords, doc=6667236)
  1.1726664 = weight(moreWords:laser jet in 6667236), product of:
    0.60590804 = queryWeight(moreWords:laser jet), product of:
      14.597603 = idf(moreWords: laser=26731 jet=12685)
      0.041507367 = queryNorm
    1.9353869 = fieldWeight(moreWords:laser jet in 6667236), product of:
      1.4142135 = tf(phraseFreq=2.0)
      14.597603 = idf(moreWords: laser=26731 jet=12685)
      0.09375 = fieldNorm(field=moreWords, doc=6667236)
</str>
<str name="222045265">
2.8656518 = (MATCH) sum of:
  1.4294347 = weight(moreWords:p 1102 w in 6684158), product of:
    0.7955347 = queryWeight(moreWords:p 1102 w), product of:
      19.166107 = idf(moreWords: p=189166 1102=1135 w=445720)
      0.041507367 = queryNorm
    1.7968225 = fieldWeight(moreWords:p 1102 w in 6684158), product of:
      1.0 = tf(phraseFreq=1.0)
      19.166107 = idf(moreWords: p=189166 1102=1135 w=445720)
      0.09375 = fieldNorm(field=moreWords, doc=6684158)
  1.4362172 = weight(moreWords:laser jet in 6684158), product of:
    0.60590804 = queryWeight(moreWords:laser jet), product of:
      14.597603 = idf(moreWords: laser=26731 jet=12685)
      0.041507367 = queryNorm
    2.3703551 = fieldWeight(moreWords:laser jet in 6684158), product of:
      1.7320508 = tf(phraseFreq=3.0)
      14.597603 = idf(moreWords: laser=26731 jet=12685)
      0.09375 = fieldNorm(field=moreWords, doc=6684158)
</str>
</lst>
<str name="otherQuery">sku:213824965</str>
<lst name="explainOther">
<str name="213824965">
0.0 = (NON-MATCH) Failure to meet condition(s) of required/prohibited clause(s)
  1.1911955 = weight(moreWords:p 1102 w in 32497), product of:
    0.7955347 = queryWeight(moreWords:p 1102 w), product of:
      19.166107 = idf(moreWords: p=189166 1102=1135 w=445720)
      0.041507367 = queryNorm
    1.4973521 = fieldWeight(moreWords:p 1102 w in 32497), product of:
      1.0 = tf(phraseFreq=1.0)
      19.166107 = idf(moreWords: p=189166 1102=1135 w=445720)
      0.078125 = fieldNorm(field=moreWords, doc=32497)
  0.0 = no match on required clause (moreWords:laser jet)
    0.0 = weight(moreWords:laser jet in 32497), product of:
      0.60590804 = queryWeight(moreWords:laser jet), product of:
        14.597603 = idf(moreWords: laser=26731 jet=12685)
        0.041507367 = queryNorm
      0.0 = fieldWeight(moreWords:laser jet in 32497), product of:
        0.0 = tf(phraseFreq=0.0)
        14.597603 = idf(moreWords: laser=26731 jet=12685)
        0.078125 = fieldNorm(field=moreWords, doc=32497)
</str>
</lst>
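
The arithmetic behind each clause in the explain output can be checked by hand. The sketch below reproduces it from the numbers printed above: tf is the square root of phraseFreq, fieldWeight is tf * idf * fieldNorm, queryWeight is idf * queryNorm, and the clause weight is their product. The function names are illustrative assumptions, not Lucene API names.

```python
import math


def tf(phrase_freq: float) -> float:
    """tf factor as shown in the output: sqrt(3.0) = 1.7320508, sqrt(2.0) = 1.4142135."""
    return math.sqrt(phrase_freq)


def clause_weight(phrase_freq: float, idf: float, query_norm: float, field_norm: float) -> float:
    """weight = queryWeight * fieldWeight, as in the 'product of:' lines."""
    query_weight = idf * query_norm                      # e.g. 0.60590804
    field_weight = tf(phrase_freq) * idf * field_norm    # e.g. 1.9353869
    return query_weight * field_weight


IDF_LASER_JET = 14.597603   # idf(moreWords: laser=26731 jet=12685)
QUERY_NORM = 0.041507367

# Document 6667236: phraseFreq=2.0, fieldNorm=0.09375
matching = clause_weight(2.0, IDF_LASER_JET, QUERY_NORM, 0.09375)

# Document 32497: phraseFreq=0.0, fieldNorm=0.078125
non_matching = clause_weight(0.0, IDF_LASER_JET, QUERY_NORM, 0.078125)

print(matching, non_matching)
```

Because tf is a multiplicative factor, a phraseFreq of 0.0 zeroes the whole clause regardless of idf and fieldNorm, and since the clause is required (+), the document is reported as a non-match.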



RE: please help explaining debug output

2011-07-25 Thread Robert Petersen
Sorry, to clarify: a search for P1102W matches all three docs, but a
search for p1102w LaserJet only matches the second two.  Someone asked
me a question while I was typing and I got distracted; apologies for any
confusion.
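
For anyone wanting to compare the two searches side by side, a sketch of building the request URLs follows. debugQuery and explainOther are standard Solr request parameters, and the sku value comes from the otherQuery shown in the original message; the host, port and handler path are assumptions about a default local install, and the helper name is hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

# ASSUMPTION: default local Solr select handler; adjust to the real deployment.
SOLR_SELECT = "http://localhost:8983/solr/select"


def debug_url(query: str, explain_other: Optional[str] = None) -> str:
    """Build a select URL that asks Solr for debug (explain) output."""
    params = {"q": query, "debugQuery": "on"}
    if explain_other:
        # Ask Solr to also explain a document that may not be in the results.
        params["explainOther"] = explain_other
    return SOLR_SELECT + "?" + urlencode(params)


url_single = debug_url("P1102W", explain_other="sku:213824965")
url_both = debug_url("p1102w LaserJet", explain_other="sku:213824965")
print(url_single)
print(url_both)
```

Running both and comparing the explainOther sections for the same sku shows which clause drops the first document.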




Re: please help explaining debug output

2011-07-25 Thread Erick Erickson
Hmmm, I'm assuming that moreWords is your default text field, yes?

But it works for me (tm), using 1.4.1. What version of Solr are you on?

Also, take a glance at the admin/analysis page, that might help...

Gotta run

Erick


RE: please help explaining debug output

2011-07-25 Thread Robert Petersen
I'm still on Solr 1.4.0, and the analysis page looks like they should match,
and other products with the same content do in fact match.  I'm reindexing the
non-matching ones to rule that out.


Re: please help explaining debug output

2011-07-25 Thread Erick Erickson
Hmmm, I can't find a convenient 1.4.0 to download, but re-indexing is a good
idea since this seems like it *should* work.

Erick
