Hi Mike,

>>To be clear, weighting hits that come from different index definitions
>>has always been possible. 2.2 will have a staff client interface to
>>make it easier, but the capability has been there all along.

Is this staff client interface already available in master? If so, can you
give me a little more information on how this is done?

Thanks!
Kathy

>>-----Original Message-----
>>From: [email protected] [mailto:open-
>>[email protected]] On Behalf Of Mike Rylander
>>Sent: Wednesday, March 07, 2012 10:11 AM
>>To: Evergreen Discussion Group
>>Subject: Re: [OPEN-ILS-GENERAL] Improving relevance ranking in Evergreen
>>
>>On Wed, Mar 7, 2012 at 8:35 AM, Hardy, Elaine
>><[email protected]> wrote:
>>> Kathy,
>>>
>>> While the relevance display is much improved in 2.x, it would be good to
>>> have greater relevance given, in a keyword search, to title (specifically
>>> the 245) and then subject fields. I also see where having a popularity
>>> ranking might be beneficial.
>>>
>>> I just had to explain to a board member of one of our libraries why his
>>> search for John Sandford turned up children's titles first. So having MARC
>>> field 100s ranked higher than 700 in author searches would be beneficial
>>> as well.
>>>
>>
>>To be clear, weighting hits that come from different index definitions
>>has always been possible. 2.2 will have a staff client interface to
>>make it easier, but the capability has been there all along.
>>
>>Weighting different parts of one indexed term -- say, weighting the
>>title embedded in the keyword blob higher than the subjects embedded
>>in the same blob -- would require the above-mentioned "make use of
>>tsearch class weighting". But one can approximate that today by
>>duplicating the index definitions from, say, title, author and subject
>>classes within the keyword class.
>>
>>--
>>Mike Rylander
>> | Director of Research and Development
>> | Equinox Software, Inc. / Your Library's Guide to Open Source
>> | phone: 1-877-OPEN-ILS (673-6457)
>> | email: [email protected]
>> | web: http://www.esilibrary.com
>>
>>
>>> I can't comment on any of the coding possibilities other than to say
>>> whichever way doesn't negatively impact search return time is preferable.
>>>
>>> Elaine
>>>
>>>
>>> J. Elaine Hardy
>>> PINES Bibliographic Projects and Metadata Manager
>>> Georgia Public Library Service,
>>> A Unit of the University System of Georgia
>>> 1800 Century Place, Suite 150
>>> Atlanta, Ga. 30345-4304
>>> 404.235-7128
>>> 404.235-7201, fax
>>>
>>> [email protected]
>>> www.georgialibraries.org
>>> http://www.georgialibraries.org/pines/
>>>
>>>
>>> -----Original Message-----
>>> From: [email protected]
>>> [mailto:[email protected]] On Behalf Of
>>> Kathy Lussier
>>> Sent: Tuesday, March 06, 2012 4:43 PM
>>> To: 'Evergreen Discussion Group'
>>> Subject: [OPEN-ILS-GENERAL] Improving relevance ranking in Evergreen
>>>
>>> Hi all,
>>>
>>> I mentioned this during an e-mail discussion on the list last month, but I
>>> just wanted to hear from others in the Evergreen community about whether
>>> there is a desire to improve the relevance ranking for search results in
>>> Evergreen. Currently, we can tweak relevancy in the opensrf.xml, and it
>>> can look at things like the document length, word proximity, and unique
>>> word count. We've found that we had to remove the modifiers for document
>>> length and unique word count to prevent a problem where brief bib records
>>> were ranked way too high in our search results.
>>>
>>> In our local discussions, we've thought the following enhancements could
>>> improve the ranking of search results:
>>>
>>> * Giving greater weight to a record if the search terms appear in the
>>> title or subject (ideally, we would like these fields to be configurable.)
>>> This is something that is tweakable in search.relevance_ranking, but my
>>> understanding is that the use of these tweaks results in a major reduction
>>> in search performance.
>>>
>>> * Using some type of popularity metric to boost relevancy for popular
>>> titles. I'm not sure what this metric should be (number of copies attached
>>> to record? Total circs in last x months? Total current circs?), but we
>>> believe some type of popularity measure would be particularly helpful in a
>>> public library where searches will often be for titles that are popular.
>>> For example, a search for "twilight" will most likely be for the Stephenie
>>> Meyer novel and not this
>>> http://books.google.com/books/about/Twilight.html?id=zEhkpXCyGzIC. Mike
>>> Rylander had indicated in a previous e-mail
>>> (http://markmail.org/message/h6u5r3sy4nr36wsl) that we might be able to
>>> handle this through an overnight cron job without a negative impact on
>>> search speeds.
>>>
>>> Do others think these two enhancements would improve the search results in
>>> Evergreen? Do you think there are other things we could do to improve
>>> relevancy? My main concern would be that any changes might slow down
>>> search speeds, and I would want to make sure that we could do something to
>>> retrieve better search results without a slowdown.
>>>
>>> Also, I was wondering if this type of project might be a good candidate
>>> for a Google Summer of Code project.
>>>
>>> I look forward to hearing your feedback!
>>>
>>> Kathy
>>>
>>> -------------------------------------------------------------
>>> Kathy Lussier
>>> Project Coordinator
>>> Massachusetts Library Network Cooperative
>>> (508) 756-0172
>>> (508) 755-3721 (fax)
>>> [email protected]
>>> IM: kmlussier (AOL & Yahoo)
>>> Twitter: http://www.twitter.com/kmlussier
>>>
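
The two ideas raised in this thread (weighting hits according to the index class they come from, and boosting popular titles with a figure computed ahead of time, for example by an overnight job) can be sketched outside Evergreen roughly as follows. This is a toy illustration only, not Evergreen code: the weights, the Record class, the circs_last_year field, and all function names are hypothetical, and the "1.0 to 2.0" boost range is just one assumed way to fold circulation counts into a score.

```python
from dataclasses import dataclass, field

# Hypothetical per-class weights; NOT Evergreen's actual configuration values.
FIELD_WEIGHTS = {"title": 4.0, "subject": 2.0, "author": 2.0, "keyword": 1.0}


@dataclass
class Record:
    record_id: int
    fields: dict = field(default_factory=dict)  # search class -> indexed text
    circs_last_year: int = 0  # assumed popularity metric


def field_score(record, terms, weights=FIELD_WEIGHTS):
    """Sum weight * (number of query terms found) over each search class."""
    score = 0.0
    for search_class, text in record.fields.items():
        words = set(text.lower().split())
        hits = sum(1 for term in terms if term.lower() in words)
        score += weights.get(search_class, 1.0) * hits
    return score


def popularity_table(records):
    """Precompute a boost between 1.0 and 2.0 per record.

    Meant to run offline (e.g. nightly), so search time only pays for a lookup.
    """
    max_circs = max((r.circs_last_year for r in records), default=0)
    if max_circs == 0:
        return {r.record_id: 1.0 for r in records}
    return {r.record_id: 1.0 + r.circs_last_year / max_circs for r in records}


def rank(records, query, boosts):
    """Return ids of matching records, best first (ties broken by id)."""
    terms = query.split()
    scored = [
        (field_score(r, terms) * boosts.get(r.record_id, 1.0), r.record_id)
        for r in records
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [record_id for score, record_id in scored if score > 0]


# Made-up sample data for illustration.
EXAMPLE_RECORDS = [
    Record(1, {"title": "Twilight", "keyword": "twilight meyer vampires"}, 900),
    Record(2, {"title": "Twilight", "keyword": "twilight essays philosophy"}, 3),
    Record(3, {"title": "Dawn", "keyword": "dawn twilight of an era"}, 50),
]
```

Calling rank(EXAMPLE_RECORDS, "twilight", popularity_table(EXAMPLE_RECORDS)) puts the two records with a title hit ahead of the keyword-only match, and the heavily circulated one first. Because the boost table is built before any search runs, the per-search cost is one dictionary lookup per candidate record, which is the property the overnight-job suggestion is after.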
