Hi all.
I am trying to run some experiments with an algorithm that scores results by
counting how many words of the submitted query appear in a document.
For example, if I enter the query
A B D A
the similarities I want to get for the documents are as follows:
A A C F D (2-found A and D)
A B D S S A (3 -
Before asking this question I searched the list for over 2 hours
and didn't find anything that helped me understand how to do what I want.
After you sent the message I made a quick pass through all your messages,
but I didn't find anything. I also searched for FakeNormsIndexReader and
s
I feel kind of stupid... I don't get what hossman says in his post.
I got the part about OMIT_NORMS and tried it by calling
Field.setOmitNorms(true); before adding each field to the index. After that I
re-indexed my collection, but it still makes no difference.
Tell me if I got it rig
But I don't want to get the frequency of each term in the doc.
What I want is 1 if the term exists in the doc and 0 if it doesn't. After
this, I want all these 1s and 0s to be summed to give me a number to use as
a score.
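
To make it concrete, here is a minimal sketch in plain Java of the score I am after. It is not Lucene code: the class and method names are made up for illustration, and I am assuming whitespace-separated terms and that a repeated query term (like the second A) counts only once, as in the example documents above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class TermPresenceScore {
    // Returns the number of distinct query terms that occur at least once
    // in the document: 1 per term found, 0 per term missing, summed.
    public static int score(String query, String document) {
        Set<String> docTerms = new HashSet<>(Arrays.asList(document.trim().split("\\s+")));
        Set<String> queryTerms = new HashSet<>(Arrays.asList(query.trim().split("\\s+")));
        int score = 0;
        for (String term : queryTerms) {
            if (docTerms.contains(term)) {
                score++; // term frequency is ignored, only presence counts
            }
        }
        return score;
    }

    public static void main(String[] args) {
        System.out.println(score("A B D A", "A A C F D"));   // 2 (A and D)
        System.out.println(score("A B D A", "A B D S S A")); // 3 (A, B and D)
    }
}
```

Document length plays no part here, which is why any length normalization on the index side changes the numbers.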
If I set the TF value as 1 or 0, as I described above, I get the right
num
It is 4 in the morning here in Greece, so I will try it tomorrow... I
must sleep sometime!
I will report back with the results tomorrow.
Thanks!
Vagelis
markrmiller wrote:
>
> A...I brushed over your example too fast...looked like normal
> counting to me...I see now what you mean. So OMIT_NORM
eed. It really comes down to making a
> FakeNormsIndexReader. The problem you are having is a result of the
> field size normalization.
>
> - mark
>
> Vagelis Kotsonis wrote:
>> Hi all.
>> I am trying to make some experiments in an algorithm that scores results
>>
obably did
> work. Are you getting the results through Hits? Hits will normalize. Use
> TopDocs or a HitCollector.
>
> - Mark
>
> Vagelis Kotsonis wrote:
>> But i don't want to get the frequency of each term in the doc.
>>
>> what I want is 1 if the term e