Hi,
      I am afraid I did not describe the problem clearly enough in my
last mail. Let me describe it again.
For example, suppose the search hits contain 5 documents: doc1, doc2,
doc3, doc4 and doc5. Their updateTimes are, respectively:
t1 = doc1.updateTime = 2009-01-01 12:45:00
t2 = doc2.updateTime = 2009-01-01 15:30:00
t3 = doc3.updateTime = 2009-01-05 09:45:00
t4 = doc4.updateTime = 2009-08-01 12:45:00
t5 = doc5.updateTime = 2009-11-27 12:45:00
Suppose their relevancy scores are:
score1 = doc1.score = 2.4
score2 = doc2.score = 2.3
score3 = doc3.score = 2.3
score4 = doc4.score = 1.8
score5 = doc5.score = 1.6
If I ignore updateTime and sort only by document score (relevancy),
the order should be doc1 > doc2 = doc3 > doc4 > doc5 (doc2 and doc3
tie), am I right?
But we need to take updateTime into account as a sorting factor. Using
a function f(t), we can calculate a value from each updateTime.
Suppose the values are:
v1 = f(t1) = 2.00
v2 = f(t2) = 2.01
v3 = f(t3) = 2.10
v4 = f(t4) = 2.50
v5 = f(t5) = 3.50
So the final result is:
r1 = v1 * score1 = 2.00 * 2.4 = 4.8
r2 = v2 * score2 = 2.01 * 2.3 = 4.623
r3 = v3 * score3 = 2.10 * 2.3 = 4.83
r4 = v4 * score4 = 2.50 * 1.8 = 4.5
r5 = v5 * score5 = 3.50 * 1.6 = 5.6
r5 > r3 > r1 > r2 > r4
so the sequence should be doc5 > doc3 > doc1 > doc2 > doc4.
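
To make the arithmetic above concrete, here is a minimal plain-Java
sketch of it. It does not use any Lucene API; the class and method
names (MixedRanking, Hit, rank, exampleHits) are made up for
illustration, and the f(t) values are simply the assumed numbers from
the example.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only; none of this is Lucene API.
final class MixedRanking {

    /** One search hit: id, relevancy score, and time factor v = f(updateTime). */
    static final class Hit {
        final String id;
        final double score;
        final double timeFactor;

        Hit(String id, double score, double timeFactor) {
            this.id = id;
            this.score = score;
            this.timeFactor = timeFactor;
        }

        /** Final value r = v * score. */
        double finalScore() {
            return timeFactor * score;
        }
    }

    /** Returns the doc ids ordered by r = v * score, highest first. */
    static List<String> rank(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort((a, b) -> Double.compare(b.finalScore(), a.finalScore()));
        List<String> ids = new ArrayList<>();
        for (Hit h : sorted) {
            ids.add(h.id);
        }
        return ids;
    }

    /** The five hits from the example, with the assumed f(t) values. */
    static List<Hit> exampleHits() {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("doc1", 2.4, 2.00));
        hits.add(new Hit("doc2", 2.3, 2.01));
        hits.add(new Hit("doc3", 2.3, 2.10));
        hits.add(new Hit("doc4", 1.8, 2.50));
        hits.add(new Hit("doc5", 1.6, 3.50));
        return hits;
    }

    public static void main(String[] args) {
        // prints [doc5, doc3, doc1, doc2, doc4]
        System.out.println(rank(exampleHits()));
    }
}
```

Keeping score and the time factor as separate values until the last
step is what allows both orderings (score only, or score mixed with
updateTime) to be produced from the same hits.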

    In the above example, although score1 (= 2.4) > score2 (= 2.3) =
score3 (= 2.3), t2 is almost 3 hours later than t1, and t3 is almost
4 days later than t1. We consider 3 hours a small difference and 4 days
a much larger one, so the final result is r3 > r1 > r2. We can also
change how much weight updateTime carries in the sorting by changing
the function f(t).
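
I have not fixed the exact f(t) yet, so the following is only a
hypothetical sketch of what such a function could look like. The
linear form, the REFERENCE date and the weight parameter are all
assumptions for illustration, and these are not the f(t) values used
in the example above.

```java
import java.time.Duration;
import java.time.LocalDateTime;

// Hypothetical f(t); the real function is not specified in this thread.
final class TimeFactor {

    // Assumed reference point: documents updated at or before this get factor 1.0.
    static final LocalDateTime REFERENCE = LocalDateTime.of(2009, 1, 1, 0, 0);

    /**
     * Assumed linear form: f(t) = 1 + weight * (days since REFERENCE) / 365.
     * A larger weight gives updateTime a larger share of the final ordering;
     * weight = 0 switches the time factor off (f(t) = 1 for every document).
     */
    static double f(LocalDateTime updateTime, double weight) {
        double days = Duration.between(REFERENCE, updateTime).toMinutes() / (60.0 * 24.0);
        return 1.0 + weight * Math.max(0.0, days) / 365.0;
    }

    public static void main(String[] args) {
        LocalDateTime t1 = LocalDateTime.of(2009, 1, 1, 12, 45);
        LocalDateTime t2 = LocalDateTime.of(2009, 1, 1, 15, 30);
        LocalDateTime t3 = LocalDateTime.of(2009, 1, 5, 9, 45);
        double weight = 1.0;
        // A gap of about 3 hours changes f(t) far less than a gap of about 4 days.
        System.out.println("f(t2) - f(t1) = " + (f(t2, weight) - f(t1, weight)));
        System.out.println("f(t3) - f(t1) = " + (f(t3, weight) - f(t1, weight)));
    }
}
```

With a function of this kind, tuning the weight (or replacing the
linear form with something steeper or flatter) is how the proportion
of updateTime in the final ordering would be adjusted.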

   Is my description clear now?






2009/11/26 Savvas-Andreas Moysidis <savvas.andreas.moysi...@googlemail.com>:
> hi,
>
>
>
> I’m not exactly sure I understand the type of sorting you are trying to
> achieve.
>
> You have an updateTime field and you mention that "We want the new document
> in the front and also want high score document in the front".
>
> My take on this is that you want to first sort by the updateTime and then by
> score but you say this is not the case?
>
>
> Instead of calculating a boost value with f(t) can you not calculate and
> index the actual value you need for every document?
>
> Then you can first sort by this value and then by score?
>
>
>
> regards,
>
> savvas
>
>
> 2009/11/26 Wilson Wu <songzi0...@gmail.com>
>
>> Hi,
>>     Recently we got a requirement to sort the hits by both the
>> document score and updateTime, a document field that records when
>> the document was last updated. We want new documents near the
>> front and also high-scoring documents near the front; in other words,
>> we want to mix the score and updateTime rather than sort first by
>> one and then by the other. So I designed a time-based function f(t) to
>> calculate a boost for each document and write it into the index.
>>      As a result, I can calculate a value for each document
>> based on its update time, and the value influences the document score
>> by adjusting the fieldNorm value. But when I look up the boost
>> value with document.getBoost() for every document in
>> the index, I find the boost value = 1.0. That means I can set
>> a document's boost value and the boost adjusts the final
>> score, but I can't read the boost value back from the documents
>> returned by a search.
>>    Is it a bug in Lucene? Thanks. I use Lucene version 2.4.1.
>>    PS: Is there any other way to meet my requirement? I think it is
>> not a good idea to adjust a document's final score by writing a
>> document boost into the index, because I want to offer two
>> interfaces to the client: one sorts documents only by score, which
>> is just the similarity score not adjusted by the boost value
>> f(t); the other sorts by the final score, which has been adjusted by
>> the boost value f(t). Thanks a lot!
>>
>>                                               wilson
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>>
>
