Hi,

I am afraid I did not describe the problem clearly enough in my last mail, so let me describe it again.

For example, suppose there are 5 documents (doc1, doc2, doc3, doc4, doc5) in the search hits, and their updateTimes are respectively:

    t1 = doc1.updateTime = 2009-01-01 12:45:00
    t2 = doc2.updateTime = 2009-01-01 15:30:00
    t3 = doc3.updateTime = 2009-01-05 09:45:00
    t4 = doc4.updateTime = 2009-08-01 12:45:00
    t5 = doc5.updateTime = 2009-11-27 12:45:00

Suppose their relevancy scores are:

    score1 = doc1.score = 2.4
    score2 = doc2.score = 2.3
    score3 = doc3.score = 2.3
    score4 = doc4.score = 1.8
    score5 = doc5.score = 1.6

If I ignore the updateTime and sort by document score (relevancy) only, the sequence should be doc1 > doc2 > doc3 > doc4 > doc5, am I right?

But we want to take the updateTime into account as a sorting factor. Through the function f(t), we can calculate a value from each updateTime. Suppose the values are:

    v1 = f(t1) = 2.00
    v2 = f(t2) = 2.01
    v3 = f(t3) = 2.1
    v4 = f(t4) = 2.5
    v5 = f(t5) = 3.5

So the final results are:

    r1 = v1 * score1 = 2.00 * 2.4 = 4.8
    r2 = v2 * score2 = 2.01 * 2.3 = 4.623
    r3 = v3 * score3 = 2.1  * 2.3 = 4.83
    r4 = v4 * score4 = 2.5  * 1.8 = 4.5
    r5 = v5 * score5 = 3.5  * 1.6 = 5.6

Since r5 > r3 > r1 > r2 > r4, the sequence should be doc5 > doc3 > doc1 > doc2 > doc4.
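To make the arithmetic concrete, here is a minimal sketch in plain Java (no Lucene API involved) that multiplies each score by its f(t) value and sorts by the product. The names TimeWeightedRanking, Hit, rank and exampleHits are made up for illustration, and the f(t) values are simply the assumed numbers from the example, not the output of a real function.

```java
import java.util.ArrayList;
import java.util.List;

public class TimeWeightedRanking {

    // Hypothetical holder for one search hit: id, relevancy score, and f(updateTime).
    static final class Hit {
        final String id;
        final double score;
        final double timeFactor;

        Hit(String id, double score, double timeFactor) {
            this.id = id;
            this.score = score;
            this.timeFactor = timeFactor;
        }

        // Combined value r = f(t) * score, as in the example.
        double combined() {
            return timeFactor * score;
        }
    }

    // Returns the document ids ordered by combined value, highest first.
    static List<String> rank(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort((a, b) -> Double.compare(b.combined(), a.combined()));
        List<String> ids = new ArrayList<>();
        for (Hit h : sorted) {
            ids.add(h.id);
        }
        return ids;
    }

    // The five documents from the example, with the assumed f(t) values.
    static List<Hit> exampleHits() {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("doc1", 2.4, 2.00));
        hits.add(new Hit("doc2", 2.3, 2.01));
        hits.add(new Hit("doc3", 2.3, 2.1));
        hits.add(new Hit("doc4", 1.8, 2.5));
        hits.add(new Hit("doc5", 1.6, 3.5));
        return hits;
    }

    public static void main(String[] args) {
        // Prints [doc5, doc3, doc1, doc2, doc4]
        System.out.println(rank(exampleHits()));
    }
}
```

Because the raw score and the time factor are kept as separate fields here, the same hits could also be sorted by score alone, which is the second ordering I would like to offer.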
In the above example, although score1 (= 2.4) > score2 (= 2.3) = score3 (= 2.3), t2 is almost 3 hours later than t1, and t3 is almost 4 days later than t1. We consider 3 hours a small difference and 4 days a much larger one, so the final result is r3 > r1 > r2. We can also change the weight of updateTime among the sorting factors by changing the function f(t).

Is my description clear now?

2009/11/26 Savvas-Andreas Moysidis <savvas.andreas.moysi...@googlemail.com>:
> hi,
>
> I'm not exactly sure I understand the type of sorting you are trying to
> achieve.
>
> You have an updateTime field and you mention that "We want the new document
> in the front and also want high score document in the front".
>
> My take on this is that you want to first sort by the updateTime and then by
> score but you say this is not the case?
>
> Instead of calculating a boost value with f(t) can you not calculate and
> index the actual value you need for every document?
>
> Then you can first sort by this value and then by score?
>
> regards,
> savvas
>
> 2009/11/26 Wilson Wu <songzi0...@gmail.com>
>
>> Hi,
>>     Recently, there is a requirement to sort the hits by both the
>> scores of documents and the updateTime, which is a field of the document
>> that marks the document's update time. We want the new document in the
>> front and also want high score document in the front; in other words,
>> we want to mix the score and updateTime, not sort first by one and
>> second by the other. So, I designed a time-based function f(t) to
>> calculate each document's boost to write into the index store.
>>     The result is that I can calculate a value for each document
>> based on its update time, and the value can influence the document score
>> through adjusting the fieldNorm value. But when I look up the boost
>> value through the method document.getBoost() for every document in
>> the index store, I find the boost value = 1.0.
>> Which means I can set a document's boost value and the boost value can
>> adjust the final score, but I can't read the boost value back from the
>> documents returned by the search.
>>     Is it a bug in Lucene? Thanks. I use Lucene version 2.4.1.
>>     PS: Is there any other way to meet my requirement? I think it is
>> not a good idea to adjust a document's final score by writing a
>> document boost into the index store, because I want to expose two
>> interfaces to the client: one sorting documents only by score, which
>> is just the similarity score and has not been adjusted by the boost value
>> f(t), the other sorting by the final score, which has been adjusted by
>> the boost value f(t). Thanks a lot!
>>
>> wilson
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>