Thinking about this more, I don't think doing a second DB lookup for each result is going to scale well. A single search can return tens of thousands of results, and the very last one might be the most relevant, so I am going to have to store the relevancy factors (it is more than just popularity) within the index itself.
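A minimal sketch of the idea in plain Python (not Lucene's API; the factor names and weights here are made up): each indexed document carries a precomputed relevancy score, so ranking becomes a sort on a stored field instead of a per-result database lookup, and pulling only the top k hits with a heap avoids paying for a full sort.

```python
import heapq

def combined_relevancy(popularity, freshness, link_count):
    # Hypothetical factors and weights -- the real relevancy formula is
    # whatever mix of signals the application actually needs.
    return 0.5 * popularity + 0.3 * freshness + 0.2 * link_count

# Each document stores its precomputed score inside the index itself.
index = [
    {"id": 1, "text": "lucene custom sorting", "relevancy": combined_relevancy(0.9, 0.4, 0.2)},
    {"id": 2, "text": "lucene ranking basics", "relevancy": combined_relevancy(0.1, 0.9, 0.8)},
    {"id": 3, "text": "unrelated document",    "relevancy": combined_relevancy(0.5, 0.5, 0.5)},
]

def search(index, term, k=10):
    hits = [doc for doc in index if term in doc["text"]]
    # nlargest is O(n log k), much cheaper than a full O(n log n) sort
    # when a query matches tens of thousands of documents.
    return heapq.nlargest(k, hits, key=lambda d: d["relevancy"])
```

A weekly batch job would then just recompute `combined_relevancy` for each document and rewrite the stored field, rather than touching the index on every popularity change.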
I think I will write something to update the relevancy rating for each indexed document once a week or so. After all, I don't think Google updates their PageRank more than once a month or so. After that it is just a matter of sorting by that relevancy rating. Though, I read on the forums that sorting is a fairly expensive operation: someone mentioned throughput dropping from 100 searches/sec to 10/sec. I'm not sure of the details or the hardware, but that is an order-of-magnitude difference, if those numbers can be believed. Gonna experiment, I guess.

On 5/18/07, Michael Garski <[EMAIL PROTECTED]> wrote:
Patrick,

I've had to do something very similar, and you have a couple of options:

1. If the 'popularity' value is stored in a database, you can look up those values after performing your search against the index and then sort.
2. Continually update the index to reflect the most recent 'popularity' value and then perform a custom sort during your search.

For my application, #2 is what we found to be most efficient.

Michael

On May 18, 2007, at 4:48 AM, Patrick Burrows wrote:

> Thanks guys. I'll try it out.
>
> My next question is going to be about ranking the results of my searches based on information that is not in the index (popularity, for instance, which might change hourly). Is there some reading I can do on the subject before I start asking questions?
>
> --
> P
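Michael's option #1 can be sketched the same way (plain Python, with a dict standing in for the real database): run the search first, then fetch the current popularity for just the hits that came back and sort on that. It keeps the index static, at the price of one batch of lookups per query.

```python
# Hypothetical popularity store -- a dict standing in for the database.
popularity_db = {1: 0.2, 2: 0.9, 3: 0.5}

def rank_by_db_popularity(hits):
    # Fetch the up-to-the-hour popularity only for the returned hits,
    # then sort on it; the index itself never has to be rewritten.
    return sorted(hits, key=lambda d: popularity_db.get(d["id"], 0.0), reverse=True)
```

As Patrick notes above, this is the variant that stops scaling once a query can match tens of thousands of documents, since every hit costs a lookup before sorting can even begin.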
