Re: RE: Scoring without normalization!
Sadly, I am still running into problems. Explain shows the following after the modification.

Rank: 1 ID: 11285358 Score: 5.5740864E8
5.5740864E8 = product of:
  8.3611296E8 = sum of:
    8.3611296E8 = product of:
      6.6889037E9 = weight(title:iron in 1235940), product of:
        0.12621856 = queryWeight(title:iron), product of:
          7.0507255 = idf(docFreq=10816)
          0.017901499 = queryNorm
        5.2994613E10 = fieldWeight(title:iron in 1235940), product of:
          1.0 = tf(termFreq(title:iron)=1)
          7.0507255 = idf(docFreq=10816)
          7.5161928E9 = fieldNorm(field=title, doc=1235940)
      0.125 = coord(1/8)
    2.7106019E-8 = product of:
      1.08424075E-7 = sum of:
        5.7318403E-9 = weight(abstract:an in 1235940), product of:
          0.03711049 = queryWeight(abstract:an), product of:
            2.073038 = idf(docFreq=1569960)
            0.017901499 = queryNorm
          1.5445337E-7 = fieldWeight(abstract:an in 1235940), product of:
            1.0 = tf(termFreq(abstract:an)=1)
            2.073038 = idf(docFreq=1569960)
            7.4505806E-8 = fieldNorm(field=abstract, doc=1235940)
        1.0269223E-7 = weight(abstract:iron in 1235940), product of:
          0.111071706 = queryWeight(abstract:iron), product of:
            6.2046037 = idf(docFreq=25209)
            0.017901499 = queryNorm
          9.24558E-7 = fieldWeight(abstract:iron in 1235940), product of:
            2.0 = tf(termFreq(abstract:iron)=4)
            6.2046037 = idf(docFreq=25209)
            7.4505806E-8 = fieldNorm(field=abstract, doc=1235940)
      0.25 = coord(2/8)
  0.667 = coord(2/3)

Rank: 2 ID: 8157438 Score: 2.7870432E8
2.7870432E8 = product of:
  8.3611296E8 = product of:
    6.6889037E9 = weight(title:iron in 159395), product of:
      0.12621856 = queryWeight(title:iron), product of:
        7.0507255 = idf(docFreq=10816)
        0.017901499 = queryNorm
      5.2994613E10 = fieldWeight(title:iron in 159395), product of:
        1.0 = tf(termFreq(title:iron)=1)
        7.0507255 = idf(docFreq=10816)
        7.5161928E9 = fieldNorm(field=title, doc=159395)
    0.125 = coord(1/8)
  0.3334 = coord(1/3)

Rank: 3 ID: 10543103 Score: 2.7870432E8
2.7870432E8 = product of:
  8.3611296E8 = product of:
    6.6889037E9 = weight(title:iron in 553967), product of:
      0.12621856 = queryWeight(title:iron), product of:
        7.0507255 = idf(docFreq=10816)
        0.017901499 = queryNorm
      5.2994613E10 = fieldWeight(title:iron in 553967), product of:
        1.0 = tf(termFreq(title:iron)=1)
        7.0507255 = idf(docFreq=10816)
        7.5161928E9 = fieldNorm(field=title, doc=553967)
    0.125 = coord(1/8)
  0.3334 = coord(1/3)

Rank: 4 ID: 8753559 Score: 2.7870432E8
2.7870432E8 = product of:
  8.3611296E8 = product of:
    6.6889037E9 = weight(title:iron in 2563152), product of:
      0.12621856 = queryWeight(title:iron), product of:
        7.0507255 = idf(docFreq=10816)
        0.017901499 = queryNorm
      5.2994613E10 = fieldWeight(title:iron in 2563152), product of:
        1.0 = tf(termFreq(title:iron)=1)
        7.0507255 = idf(docFreq=10816)
        7.5161928E9 = fieldNorm(field=title, doc=2563152)
    0.125 = coord(1/8)
  0.3334 = coord(1/3)

I would like to get rid of all normalizations and just have TF and IDF. What am I missing?

On Thu, 15 Jul 2004 Anson Lau wrote:

If you don't mind hacking the source:

In Hits.java, in method getMoreDocs():

    // Comment out the following
    //float scoreNorm = 1.0f;
    //if (length > 0 && scoreDocs[0].score > 1.0f) {
    //  scoreNorm = 1.0f / scoreDocs[0].score;
    //}

    // And just set scoreNorm to 1.
    int scoreNorm = 1;

I don't know if you can do it without going to the src.

Anson

-----Original Message-----
From: Jones G [mailto:[EMAIL PROTECTED]]
Sent: Thursday, July 15, 2004 6:52 AM
To: [EMAIL PROTECTED]
Subject: Scoring without normalization!

How do I remove document normalization from scoring in Lucene? I just want to stick to TF IDF.

Thanks.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
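The way the factors in the explain output combine can be checked by hand. Below is a minimal sketch in plain Java with no Lucene dependency; `ExplainCheck` and `score` are hypothetical names made up for illustration. It multiplies the factors reported for the rank 2 hit, then shows what would remain if queryNorm, fieldNorm, and both coord factors were all 1.

```java
// Illustrative only: ExplainCheck is a hypothetical helper, not part of Lucene.
// The numbers are copied from the explain output for the rank 2 hit.
final class ExplainCheck {

    // Combines the factors the way the explain tree does for a single-term match:
    // weight = queryWeight * fieldWeight, then scaled by the two coord factors.
    static double score(double tf, double idf, double queryNorm,
                        double fieldNorm, double innerCoord, double outerCoord) {
        double queryWeight = idf * queryNorm;       // queryWeight(title:iron)
        double fieldWeight = tf * idf * fieldNorm;  // fieldWeight(title:iron in doc)
        return queryWeight * fieldWeight * innerCoord * outerCoord;
    }

    public static void main(String[] args) {
        double tf = 1.0;
        double idf = 7.0507255;
        // Factors as reported: queryNorm, fieldNorm, coord(1/8), coord(1/3).
        double reported = score(tf, idf, 0.017901499, 7.5161928E9, 1.0 / 8, 1.0 / 3);
        // Same term with every normalization factor replaced by 1.
        double plain = score(tf, idf, 1.0, 1.0, 1.0, 1.0);
        System.out.println("with normalization factors: " + reported);
        System.out.println("tf and idf only:            " + plain);
    }
}
```

Note that idf enters twice, once through queryWeight and once through fieldWeight, so even with every normalization factor at 1 the per-term value is tf * idf * idf rather than tf * idf.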
Re: Scoring without normalization!
Have you looked at:

http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/Similarity.html

in particular, at:

http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/Similarity.html#lengthNorm(java.lang.String,%20int)
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/Similarity.html#queryNorm(float)

Doug
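The two methods linked above are the hooks behind the fieldNorm and queryNorm lines in the explain output. Here is a minimal sketch of the idea, written in plain Java so it runs without Lucene on the classpath. `BaseScoring` is a hypothetical stand-in for the library's default Similarity implementation. Its formulas are an assumption about that release's defaults, although the tf formula does agree with the explain output (tf = 2.0 for termFreq = 4). `NoNormScoring` overrides the normalization hooks to return 1. Against the real library one would presumably subclass the Similarity implementation the same way and install it for both indexing and searching; treat that as an assumption to check against the Javadoc.

```java
// Stand-in for the library's default scoring hooks. NOT Lucene's class:
// the names and formulas here are assumptions made for illustration.
class BaseScoring {
    // Shorter fields score higher: 1 / sqrt(number of tokens in the field).
    float lengthNorm(String fieldName, int numTokens) {
        return (float) (1.0 / Math.sqrt(numTokens));
    }
    // Scales query weights so scores are comparable across queries.
    float queryNorm(float sumOfSquaredWeights) {
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }
    // Rewards documents that match more of the query clauses.
    float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }
    float tf(float freq) {
        return (float) Math.sqrt(freq);
    }
    float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }
}

// Keeps tf and idf, neutralizes every normalization hook.
class NoNormScoring extends BaseScoring {
    @Override float lengthNorm(String fieldName, int numTokens) { return 1.0f; }
    @Override float queryNorm(float sumOfSquaredWeights) { return 1.0f; }
    @Override float coord(int overlap, int maxOverlap) { return 1.0f; }
}

class NoNormDemo {
    // Single-term score in the same shape as the explain output:
    // queryWeight (idf * queryNorm) times fieldWeight (tf * idf * fieldNorm).
    static float termScore(BaseScoring s, float freq, int docFreq, int numDocs,
                           String field, int fieldLength) {
        float idf = s.idf(docFreq, numDocs);
        float queryWeight = idf * s.queryNorm(idf * idf);
        float fieldWeight = s.tf(freq) * idf * s.lengthNorm(field, fieldLength);
        return queryWeight * fieldWeight;
    }

    public static void main(String[] args) {
        // numDocs and fieldLength are made-up values for the demo.
        System.out.println(termScore(new BaseScoring(), 4f, 25209, 4_000_000, "abstract", 100));
        System.out.println(termScore(new NoNormScoring(), 4f, 25209, 4_000_000, "abstract", 100));
    }
}
```

One caveat worth checking before relying on this: length norms are normally computed when documents are indexed and stored with the index, so changing lengthNorm would likely require re-indexing before the fieldNorm values in explain change.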
Re: Re: Scoring without normalization!
: Thursday, July 15, 2004 6:52 AM To: [EMAIL PROTECTED] Subject: Scoring without normalization! How do I remove document normalization from scoring in Lucene? I just want to stick to TF IDF. Thanks.
Scoring without normalization!
How do I remove document normalization from scoring in Lucene? I just want to stick to TF IDF. Thanks.
RE: Scoring without normalization!
If you don't mind hacking the source:

In Hits.java, in method getMoreDocs():

    // Comment out the following
    //float scoreNorm = 1.0f;
    //if (length > 0 && scoreDocs[0].score > 1.0f) {
    //  scoreNorm = 1.0f / scoreDocs[0].score;
    //}

    // And just set scoreNorm to 1.
    int scoreNorm = 1;

I don't know if you can do it without going to the src.

Anson
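To make clear what this change does and does not remove: the commented-out block only rescales the final hit scores so that the top hit is at most 1.0. A minimal standalone sketch of that rescaling follows. It is plain Java, `HitsRescale` is a hypothetical helper, and the loop that applies `scoreNorm` to every hit is an assumption about how the value is used, not a copy of Hits.java.

```java
// Hypothetical helper for illustration; not part of Lucene.
final class HitsRescale {

    // Mirrors the quoted logic: if the best raw score exceeds 1.0,
    // divide every score by it; otherwise leave the scores untouched.
    // rawScores is expected in descending order, best hit first.
    static float[] rescale(float[] rawScores) {
        float scoreNorm = 1.0f;
        if (rawScores.length > 0 && rawScores[0] > 1.0f) {
            scoreNorm = 1.0f / rawScores[0];
        }
        float[] out = new float[rawScores.length];
        for (int i = 0; i < rawScores.length; i++) {
            out[i] = rawScores[i] * scoreNorm;
        }
        return out;
    }

    public static void main(String[] args) {
        float[] raw = {8.0f, 4.0f, 2.0f};
        for (float s : rescale(raw)) {
            System.out.println(s);
        }
    }
}
```

With scoreNorm fixed at 1 the raw scores pass through unchanged, but factors such as fieldNorm, queryNorm, and coord are already folded into those raw scores, so this change alone does not reduce scoring to TF and IDF.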
Re: RE: Scoring without normalization!
Thanks! Just what I wanted.