Hi Chris,

thanks for the details. In the meantime I have been poking around with my own Similarity class, which I registered in schema.xml, and everything is working fine there.

But I still have a problem with the normalization. I have tried changing several parameters to fix it at 1.0; this does change the scoring, but still not in the way I need. It seems that it is always the "fieldNorm" that comes into play, but where does this value actually come from? I cannot find anything with that name to override in the Similarity class.

  Let me give a short example of what goes wrong:

I have a field "searchname" with a boost of 3.0 applied during document.add. Another field "text" is a copyField target of several entries; it has no boost factor, but considerably more data in it. One document's "text" contains a copy of a field in which the searched term occurs three times. This document gets a score of 5.5930133.

But I also have documents where the same word appears in the boosted "searchname" field, and those only get a score of 1.9975047.
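
The boost is applied at index time roughly like this (only a sketch; I am assuming the plain Lucene Field API here, and the values are made up):

  import org.apache.lucene.document.Document;
  import org.apache.lucene.document.Field;

  Document doc = new Document();
  Field name = new Field("searchname", "some example name",
                         Field.Store.YES, Field.Index.ANALYZED);
  name.setBoost(3.0f);   // the 3.0 boost mentioned above
  doc.add(name);
  // "text" is filled via copyField in schema.xml and gets no explicit boost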

Currently my class is like this (I took DefaultSimilarity as a basis; a rough sketch follows after the list):

  - lengthNorm is fixed to 1.0
  - tf fixed to 1.0
  - idf fixed to 1.0
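
In code it looks roughly like this (a minimal sketch from memory, assuming the Lucene 2.x/3.x DefaultSimilarity method signatures; the class name is just a placeholder):

  import org.apache.lucene.search.DefaultSimilarity;

  // Flatten length normalization, tf and idf so that (in theory)
  // only the boost factors remain as scoring components.
  public class BoostOnlySimilarity extends DefaultSimilarity {

      @Override
      public float lengthNorm(String fieldName, int numTerms) {
          return 1.0f;   // no penalty for longer fields
      }

      @Override
      public float tf(float freq) {
          return 1.0f;   // ignore how often the term occurs
      }

      @Override
      public float idf(int docFreq, int numDocs) {
          return 1.0f;   // ignore how rare the term is
      }
  }

The class is registered in schema.xml via <similarity class="..."/> under its real name.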

Could it be that with these changes I have also deactivated the boosts on the different fields?

What I need is a search that treats every document the same, regardless of term frequency and field length, and calculates the score only from the boost factors, so that a document with a high boost factor and the same text as another document with a lower factor is ranked above it.

  Maybe I am doing something completely wrong; perhaps you have an idea?

  Thanks,

   Tom
