Hi all,

We've had difficulty finding records in our catalog because of the automatic stemming that occurs when records are indexed in Evergreen. For example, one of our summer reading lists included "The Assist" by Neil Swidey. When users ran a title search for "the assist", with the phrase enclosed in quotation marks, they still had to page through several pages of results before finding the title they needed. Many of the higher-ranked records contained words like "assistance", "assistive", and "assisted". These words were automatically stemmed at indexing, so the stemmed form (assist) was what was stored in the index vector column. We've seen many other cases where stemming has made searches difficult.
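
To illustrate the kind of collision I mean, here is a rough Python sketch. It is only a toy: the suffix list, the toy_stem and search functions, and the sample titles other than "The Assist" are all made up for illustration, and this is not the Snowball algorithm or anything Evergreen actually runs.

```python
# Toy illustration only: a crude suffix stripper, NOT the Snowball stemmer.
# The suffix list and sample titles below are hypothetical.
SUFFIXES = ("ance", "ive", "ed", "s")

def toy_stem(word):
    """Strip the first matching suffix, keeping at least 3 characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word

def index_terms(title, stem=True):
    """Return the set of terms that would be stored for a title."""
    words = title.lower().split()
    return {toy_stem(w) if stem else w for w in words}

titles = [
    "The Assist",
    "Assistive Technology",
    "Financial Assistance",
    "Assisted Living",
]

def search(query, stem):
    """Return titles whose indexed terms contain every query term."""
    wanted = index_terms(query, stem)
    return [t for t in titles if wanted <= index_terms(t, stem)]

print(search("assist", stem=True))   # every title matches on "assist"
print(search("assist", stem=False))  # only the exact word matches
```

With stemming on, all four titles collapse to the same stored term, so the exact title has nothing to distinguish it; with stemming off, only "The Assist" matches.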

In digging through IRC logs and other list messages about stemming, I've seen people mention that it can be turned off so that full words are indexed rather than their stemmed versions. Can anybody tell me how this is done? I understand that the records would need to be reingested, but is there a flag that needs to be disabled to turn off stemming, or does it require something else? Also, is there a way to use another dictionary for the stemmer so that the stemming is somewhat less aggressive than that of the Snowball stemmer? Overall, we like the concept of stemming, particularly when it retrieves results for both the singular and plural versions of a word, but we've had many examples where stemming seems to be throwing users off course.

Has anybody else had similar issues?

Thanks!
Kathy

--
Kathy Lussier
Project Coordinator
Massachusetts Library Network Cooperative
(508) 343-0128
[email protected]
Twitter: http://www.twitter.com/kmlussier
