Re: RE: Scoring without normalization!

2004-07-15 Thread Jones G
Sadly, I am still running into problems

Explain shows the following after the modification.

Rank: 1 ID: 11285358Score: 5.5740864E8
5.5740864E8 = product of:
  8.3611296E8 = sum of:
8.3611296E8 = product of:
  6.6889037E9 = weight(title:iron in 1235940), product of:
0.12621856 = queryWeight(title:iron), product of:
  7.0507255 = idf(docFreq=10816)
  0.017901499 = queryNorm
5.2994613E10 = fieldWeight(title:iron in 1235940), product of:
  1.0 = tf(termFreq(title:iron)=1)
  7.0507255 = idf(docFreq=10816)
  7.5161928E9 = fieldNorm(field=title, doc=1235940)
  0.125 = coord(1/8)
2.7106019E-8 = product of:
  1.08424075E-7 = sum of:
5.7318403E-9 = weight(abstract:an in 1235940), product of:
  0.03711049 = queryWeight(abstract:an), product of:
2.073038 = idf(docFreq=1569960)
0.017901499 = queryNorm
  1.5445337E-7 = fieldWeight(abstract:an in 1235940), product of:
1.0 = tf(termFreq(abstract:an)=1)
2.073038 = idf(docFreq=1569960)
7.4505806E-8 = fieldNorm(field=abstract, doc=1235940)
1.0269223E-7 = weight(abstract:iron in 1235940), product of:
  0.111071706 = queryWeight(abstract:iron), product of:
6.2046037 = idf(docFreq=25209)
0.017901499 = queryNorm
  9.24558E-7 = fieldWeight(abstract:iron in 1235940), product of:
2.0 = tf(termFreq(abstract:iron)=4)
6.2046037 = idf(docFreq=25209)
7.4505806E-8 = fieldNorm(field=abstract, doc=1235940)
  0.25 = coord(2/8)
  0.667 = coord(2/3)
Rank: 2 ID: 8157438 Score: 2.7870432E8
2.7870432E8 = product of:
  8.3611296E8 = product of:
6.6889037E9 = weight(title:iron in 159395), product of:
  0.12621856 = queryWeight(title:iron), product of:
7.0507255 = idf(docFreq=10816)
0.017901499 = queryNorm
  5.2994613E10 = fieldWeight(title:iron in 159395), product of:
1.0 = tf(termFreq(title:iron)=1)
7.0507255 = idf(docFreq=10816)
7.5161928E9 = fieldNorm(field=title, doc=159395)
0.125 = coord(1/8)
  0.3334 = coord(1/3)
Rank: 3 ID: 10543103Score: 2.7870432E8
2.7870432E8 = product of:
  8.3611296E8 = product of:
6.6889037E9 = weight(title:iron in 553967), product of:
  0.12621856 = queryWeight(title:iron), product of:
7.0507255 = idf(docFreq=10816)
0.017901499 = queryNorm
  5.2994613E10 = fieldWeight(title:iron in 553967), product of:
1.0 = tf(termFreq(title:iron)=1)
7.0507255 = idf(docFreq=10816)
7.5161928E9 = fieldNorm(field=title, doc=553967)
0.125 = coord(1/8)
  0.3334 = coord(1/3)
Rank: 4 ID: 8753559 Score: 2.7870432E8
2.7870432E8 = product of:
  8.3611296E8 = product of:
6.6889037E9 = weight(title:iron in 2563152), product of:
  0.12621856 = queryWeight(title:iron), product of:
7.0507255 = idf(docFreq=10816)
0.017901499 = queryNorm
  5.2994613E10 = fieldWeight(title:iron in 2563152), product of:
1.0 = tf(termFreq(title:iron)=1)
7.0507255 = idf(docFreq=10816)
7.5161928E9 = fieldNorm(field=title, doc=2563152)
0.125 = coord(1/8)
  0.3334 = coord(1/3)

I would like to get rid of all normalizations and just have TF and IDF.
What am I missing?


On Thu, 15 Jul 2004 Anson Lau wrote :
If you don't mind hacking the source:

In Hits.java

In method getMoreDocs()



 // Comment out the following
 //float scoreNorm = 1.0f;
 //if (length  0  scoreDocs[0].score  1.0f) {
 //  scoreNorm = 1.0f / scoreDocs[0].score;
 //}

 // And just set scoreNorm to 1.
 int scoreNorm = 1;


I don't know if u can do it without going to the src.

Anson


-Original Message-
 From: Jones G [mailto:[EMAIL PROTECTED]
Sent: Thursday, July 15, 2004 6:52 AM
To: [EMAIL PROTECTED]
Subject: Scoring without normalization!

How do I remove document normalization from scoring in Lucene? I just want
to stick to TF IDF.

Thanks.


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Scoring without normalization!

2004-07-15 Thread Doug Cutting
Have you looked at:
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/Similarity.html
in particular, at:
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/Similarity.html#lengthNorm(java.lang.String,%20int)
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/Similarity.html#queryNorm(float)
Doug
Jones G wrote:
Sadly, I am still running into problems
Explain shows the following after the modification.
Rank: 1 ID: 11285358Score: 5.5740864E8
5.5740864E8 = product of:
  8.3611296E8 = sum of:
8.3611296E8 = product of:
  6.6889037E9 = weight(title:iron in 1235940), product of:
0.12621856 = queryWeight(title:iron), product of:
  7.0507255 = idf(docFreq=10816)
  0.017901499 = queryNorm
5.2994613E10 = fieldWeight(title:iron in 1235940), product of:
  1.0 = tf(termFreq(title:iron)=1)
  7.0507255 = idf(docFreq=10816)
  7.5161928E9 = fieldNorm(field=title, doc=1235940)
  0.125 = coord(1/8)
2.7106019E-8 = product of:
  1.08424075E-7 = sum of:
5.7318403E-9 = weight(abstract:an in 1235940), product of:
  0.03711049 = queryWeight(abstract:an), product of:
2.073038 = idf(docFreq=1569960)
0.017901499 = queryNorm
  1.5445337E-7 = fieldWeight(abstract:an in 1235940), product of:
1.0 = tf(termFreq(abstract:an)=1)
2.073038 = idf(docFreq=1569960)
7.4505806E-8 = fieldNorm(field=abstract, doc=1235940)
1.0269223E-7 = weight(abstract:iron in 1235940), product of:
  0.111071706 = queryWeight(abstract:iron), product of:
6.2046037 = idf(docFreq=25209)
0.017901499 = queryNorm
  9.24558E-7 = fieldWeight(abstract:iron in 1235940), product of:
2.0 = tf(termFreq(abstract:iron)=4)
6.2046037 = idf(docFreq=25209)
7.4505806E-8 = fieldNorm(field=abstract, doc=1235940)
  0.25 = coord(2/8)
  0.667 = coord(2/3)
Rank: 2 ID: 8157438 Score: 2.7870432E8
2.7870432E8 = product of:
  8.3611296E8 = product of:
6.6889037E9 = weight(title:iron in 159395), product of:
  0.12621856 = queryWeight(title:iron), product of:
7.0507255 = idf(docFreq=10816)
0.017901499 = queryNorm
  5.2994613E10 = fieldWeight(title:iron in 159395), product of:
1.0 = tf(termFreq(title:iron)=1)
7.0507255 = idf(docFreq=10816)
7.5161928E9 = fieldNorm(field=title, doc=159395)
0.125 = coord(1/8)
  0.3334 = coord(1/3)
Rank: 3 ID: 10543103Score: 2.7870432E8
2.7870432E8 = product of:
  8.3611296E8 = product of:
6.6889037E9 = weight(title:iron in 553967), product of:
  0.12621856 = queryWeight(title:iron), product of:
7.0507255 = idf(docFreq=10816)
0.017901499 = queryNorm
  5.2994613E10 = fieldWeight(title:iron in 553967), product of:
1.0 = tf(termFreq(title:iron)=1)
7.0507255 = idf(docFreq=10816)
7.5161928E9 = fieldNorm(field=title, doc=553967)
0.125 = coord(1/8)
  0.3334 = coord(1/3)
Rank: 4 ID: 8753559 Score: 2.7870432E8
2.7870432E8 = product of:
  8.3611296E8 = product of:
6.6889037E9 = weight(title:iron in 2563152), product of:
  0.12621856 = queryWeight(title:iron), product of:
7.0507255 = idf(docFreq=10816)
0.017901499 = queryNorm
  5.2994613E10 = fieldWeight(title:iron in 2563152), product of:
1.0 = tf(termFreq(title:iron)=1)
7.0507255 = idf(docFreq=10816)
7.5161928E9 = fieldNorm(field=title, doc=2563152)
0.125 = coord(1/8)
  0.3334 = coord(1/3)
I would like to get rid of all normalizations and just have TF and IDF.
What am I missing?
On Thu, 15 Jul 2004 Anson Lau wrote :
If you don't mind hacking the source:
In Hits.java
In method getMoreDocs()

   // Comment out the following
   //float scoreNorm = 1.0f;
   //if (length  0  scoreDocs[0].score  1.0f) {
   //  scoreNorm = 1.0f / scoreDocs[0].score;
   //}
   // And just set scoreNorm to 1.
   int scoreNorm = 1;
I don't know if u can do it without going to the src.
Anson
-Original Message-
From: Jones G [mailto:[EMAIL PROTECTED]
Sent: Thursday, July 15, 2004 6:52 AM
To: [EMAIL PROTECTED]
Subject: Scoring without normalization!
How do I remove document normalization from scoring in Lucene? I just want
to stick to TF IDF.
Thanks.
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Re: Re: Scoring without normalization!

2004-07-15 Thread Jones G
: Thursday, July 15, 2004 6:52 AM
To: [EMAIL PROTECTED]
Subject: Scoring without normalization!

How do I remove document normalization from scoring in Lucene? I just want
to stick to TF IDF.

Thanks.


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Scoring without normalization!

2004-07-14 Thread Jones G
How do I remove document normalization from scoring in Lucene? I just want to stick to 
TF IDF.

Thanks.

RE: Scoring without normalization!

2004-07-14 Thread Anson Lau
If you don't mind hacking the source:

In Hits.java

In method getMoreDocs()



// Comment out the following
//float scoreNorm = 1.0f;
//if (length  0  scoreDocs[0].score  1.0f) {
//  scoreNorm = 1.0f / scoreDocs[0].score;
//}

// And just set scoreNorm to 1.
int scoreNorm = 1;


I don't know if u can do it without going to the src.

Anson


-Original Message-
From: Jones G [mailto:[EMAIL PROTECTED] 
Sent: Thursday, July 15, 2004 6:52 AM
To: [EMAIL PROTECTED]
Subject: Scoring without normalization!

How do I remove document normalization from scoring in Lucene? I just want
to stick to TF IDF.

Thanks.


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: RE: Scoring without normalization!

2004-07-14 Thread Jones G
Thanks! Just what I wanted.

On Thu, 15 Jul 2004 Anson Lau wrote :
If you don't mind hacking the source:

In Hits.java

In method getMoreDocs()



 // Comment out the following
 //float scoreNorm = 1.0f;
 //if (length  0  scoreDocs[0].score  1.0f) {
 //  scoreNorm = 1.0f / scoreDocs[0].score;
 //}

 // And just set scoreNorm to 1.
 int scoreNorm = 1;


I don't know if u can do it without going to the src.

Anson


-Original Message-
 From: Jones G [mailto:[EMAIL PROTECTED]
Sent: Thursday, July 15, 2004 6:52 AM
To: [EMAIL PROTECTED]
Subject: Scoring without normalization!

How do I remove document normalization from scoring in Lucene? I just want
to stick to TF IDF.

Thanks.


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]