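In a Solr-based search, stemming is done at index time: tokens are stemmed as
they go into fields that hold the stemmed forms (with the same stemming normally
applied to the query, so the two sides match).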
It seems typical in Solr-based library-catalog applications for the default (or
even the only) searches to run over these stemmed fields, which amounts to
'auto-stemming' for the user. (Search for 'monkey', find 'monkeys' too, and vice
versa.)
I am curious how many people with Solr-based catalogs (that is, search engines
whose content comes mostly or entirely from MARC) use such stemmed fields
('auto-stemming') for their _author_ fields as well.
There are pros and cons to this. There are certainly some things in an author
field that would benefit from stemming (mostly various kinds of corporate
authors, some of which end up looking like English-language phrases). There are
also very many things in an author field that would not benefit from stemming,
so stemming sometimes (often?) produces false matches, for instance by
"pluralizing" an author's last name inappropriately.
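
To make the trade-off concrete, here is a rough Python sketch. It is only an
illustration: toy_plural_stem is a made-up one-rule stemmer, not what Solr or
any real stemming filter does, and the author strings are invented examples.
It just shows how the same rule that helps a corporate author also conflates
two different surnames.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase and split on anything that is not a letter."""
    return re.findall(r"[a-z]+", text.lower())


def toy_plural_stem(token: str) -> str:
    """Hypothetical toy rule: drop one trailing 's' (but not 'ss').

    Real stemmers are far more elaborate; this is only for illustration.
    """
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def matches(query: str, field_value: str, stem: bool) -> bool:
    """True if every query token occurs in the field, optionally stemmed."""
    norm = toy_plural_stem if stem else (lambda t: t)
    query_tokens = {norm(t) for t in tokenize(query)}
    field_tokens = {norm(t) for t in tokenize(field_value)}
    return query_tokens <= field_tokens


if __name__ == "__main__":
    # Invented examples: one corporate author, one personal author.
    cases = [
        ("library association", "Regional Library Associations"),
        ("adam", "Adams, John"),
    ]
    for query, author in cases:
        for stem in (False, True):
            print(query, "|", author, "| stemmed:", stem,
                  "| match:", matches(query, author, stem))
```

With stemming on, both queries match: the first is the helpful case (singular
query finds the plural corporate name), the second is the false match (a search
for "adam" hits "Adams"). With stemming off, neither matches.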
So, care to say on the list: if you are using a Solr-based catalog, are you
using stemmed fields for your author searches? Curious what people end up
doing. If you've done anything more clever than a simple stem-or-not choice,
let us know that too!
Jonathan