Hi all,

I mentioned this during an e-mail discussion on the list last month, but I wanted to hear from others in the Evergreen community about whether there is a desire to improve the relevance ranking for search results in Evergreen. Currently, we can tweak relevancy in opensrf.xml, where the ranking can take into account things like document length, word proximity, and unique word count. We found that we had to remove the modifiers for document length and unique word count to prevent a problem where brief bib records were ranked far too high in our search results.
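To illustrate the brief-record problem, here is a toy sketch in Python. It is not Evergreen's actual ranking code: the function names and the two sample records are made up for illustration, and the only assumption is that a document-length modifier divides a raw match score by the record's word count.

```python
def raw_score(terms, record_words):
    """Count how many times any search term occurs in the record."""
    return sum(record_words.count(term) for term in terms)


def length_normalized_score(terms, record_words):
    """Divide the raw score by the record's word count (hypothetical modifier)."""
    if not record_words:
        return 0.0
    return raw_score(terms, record_words) / len(record_words)


# Two made-up bib records: a one-word brief record and a fuller record.
brief_record = "twilight".split()
full_record = "twilight a novel about vampires fiction twilight saga book one".split()

query = ["twilight"]

for name, words in [("brief", brief_record), ("full", full_record)]:
    print(name, raw_score(query, words), length_normalized_score(query, words))
```

Without the length modifier the fuller record scores higher because it matches twice; once the score is divided by record length, the one-word brief record comes out on top, which is the behavior we saw.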
In our local discussions, we've thought the following enhancements could improve the ranking of search results:

* Giving greater weight to a record if the search terms appear in the title or subject (ideally, we would like these fields to be configurable). This is something that is tweakable in search.relevance_ranking, but my understanding is that using these tweaks results in a major reduction in search performance.

* Using some type of popularity metric to boost relevancy for popular titles. I'm not sure what this metric should be (number of copies attached to the record? Total circs in the last x months? Total current circs?), but we believe some type of popularity measure would be particularly helpful in a public library, where searches will often be for titles that are popular. For example, a search for "twilight" will most likely be for the Stephenie Meyer novel and not this book: http://books.google.com/books/about/Twilight.html?id=zEhkpXCyGzIC. Mike Rylander had indicated in a previous e-mail (http://markmail.org/message/h6u5r3sy4nr36wsl) that we might be able to handle this through an overnight cron job without a negative impact on search speeds.

Do others think these two enhancements would improve the search results in Evergreen? Do you think there are other things we could do to improve relevancy? My main concern is that any changes might slow down searches, and I would want to make sure that we can retrieve better search results without a slowdown.

Also, I was wondering if this type of project might be a good candidate for a Google Summer of Code project.

I look forward to hearing your feedback!

Kathy

-------------------------------------------------------------
Kathy Lussier
Project Coordinator
Massachusetts Library Network Cooperative
(508) 756-0172
(508) 755-3721 (fax)
[email protected]
IM: kmlussier (AOL & Yahoo)
Twitter: http://www.twitter.com/kmlussier
