Joachim,

Why don't you use the explain method of IndexSearcher?
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/IndexSearcher.html
This is the best way to find out why your documents score differently. I suspect the lengthNorm method, which is used at indexing time.

Julien

----- Original Message -----
From: "Joachim Schreiber" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Tuesday, March 23, 2004 4:05 PM
Subject: Similarity - position in Field[] effects scoring - how to change?

> Hello,
>
> I have run into the following problem. Perhaps somebody can help me.
>
> I have an index with different ids in the same field,
> something like
>
> <s>00000000
> <s>45678565
> <s>87854546
>
> Situation: I have different documents with the entry <s>00000000 in the same
> index.
>
> document 1)
>
> <s>324235678565
> <s>324dssd5678565
> <s>45678324565
> <s>00000000
> <s>8785454324326
>
> document 2)
>
> <s>324235678565
> <s>00000000
> <s>45678324565
> <s>8785454324326
>
> When I search for " s:00000000 " I receive both docs, but document 1 has a
> better score than document 2.
> The position of <s>00000000 in doc 1 is Field[4] and in doc 2 it's Field[2],
> so this seems to affect scoring.
>
> How can I disable this behaviour, so that doc 1 has the same score as doc 2?
> Which method do I have to override in DefaultSimilarity?
> Does anybody have any idea or any help?
>
> Thanks
>
> yo
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
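As a rough illustration of the lengthNorm suspicion in the reply, here is a minimal sketch in plain Java with no Lucene dependency. It only reproduces the formula that DefaultSimilarity uses for lengthNorm, 1 / sqrt(numTerms), and compares it with a flat norm. The class name LengthNormSketch and the method flatLengthNorm are hypothetical, and the term counts assume that each <s> entry is indexed as exactly one term, which the thread does not confirm.

```java
public class LengthNormSketch {

    // Same formula as DefaultSimilarity.lengthNorm: 1 / sqrt(numTerms).
    // Fields with more terms get a smaller norm.
    public static float defaultLengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // Hypothetical alternative: a constant norm, so the number of
    // terms in the field no longer influences the score.
    public static float flatLengthNorm(int numTerms) {
        return 1.0f;
    }

    public static void main(String[] args) {
        // Assumption: one indexed term per <s> entry.
        int doc1Terms = 5; // document 1 lists five <s> entries
        int doc2Terms = 4; // document 2 lists four <s> entries

        System.out.println("default norm, doc 1: " + defaultLengthNorm(doc1Terms));
        System.out.println("default norm, doc 2: " + defaultLengthNorm(doc2Terms));
        System.out.println("flat norm, doc 1:    " + flatLengthNorm(doc1Terms));
        System.out.println("flat norm, doc 2:    " + flatLengthNorm(doc2Terms));
    }
}
```

In Lucene itself the equivalent change would be a subclass of DefaultSimilarity that overrides lengthNorm(String fieldName, int numTerms) to return a constant. Because norms are written at indexing time, the documents would have to be re-indexed with that Similarity in place. Note that this formula alone would favour the shorter document 2, which is the opposite of what Joachim reports, so the output of explain is still needed to confirm where the difference comes from.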
