On 10/14/06, Jong Kim <[EMAIL PROTECTED]> wrote:
Hi,

I'm looking for a stemmer that is capable of returning all morphological
variants of a query term (to be used for high-recall search). For example,
given a query term of 'cares', I would like to be able to generate 'cares',
'care', 'cared', and 'caring'.

I looked at the Porter stemmer, Snowball stemmer, and the K-stem.
All of them provide a method that takes a surface string ('cares') as an
input and returns its base form/stem, which is 'care' in this example.
But it appears that I cannot use the stemmer to generate all of the
inflected forms of a given query term.

Stemming is a multi-step, lossy, one-way operation, so it does not
surprise me that none of these packages attempts the reverse
operation.

My suggestion is to create a reverse stemmer yourself by taking the
lexicon of your corpus, stemming all the terms, and inverting the map.
At query time, a lookup can be performed using a trie or hashtable.
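
A minimal sketch of that idea in Python, in case it helps. Everything
named here is hypothetical: toy_stem is a crude stand-in for whichever
real stemmer you use (Porter, Snowball, K-stem), and the lexicon is a
hand-picked list rather than terms pulled from an actual index. A plain
dict plays the role of the hashtable.

```python
from collections import defaultdict


def toy_stem(term):
    """Stand-in stemmer (an assumption, not Porter/Snowball/K-stem).

    Strips one of a few common suffixes, then a trailing 'e', so that
    'care', 'cares', 'cared' and 'caring' all collapse to 'car'.
    """
    for suffix in ("ing", "ed", "s"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            term = term[: -len(suffix)]
            break
    if term.endswith("e") and len(term) > 3:
        term = term[:-1]
    return term


def build_reverse_index(lexicon, stem=toy_stem):
    """Stem every term in the lexicon and invert the map: stem -> surface forms."""
    index = defaultdict(set)
    for term in lexicon:
        index[stem(term)].add(term)
    return index


def expand(term, index, stem=toy_stem):
    """Return every lexicon term that shares a stem with the query term."""
    variants = index.get(stem(term))
    # Fall back to the term itself if its stem never occurs in the lexicon.
    return sorted(variants) if variants else [term]


# Hypothetical lexicon; in practice this would be the term list of your corpus.
lexicon = ["care", "cares", "cared", "caring", "search", "searches", "searching"]
index = build_reverse_index(lexicon)

print(expand("cares", index))  # ['care', 'cared', 'cares', 'caring']
```

Note that the expansion can only return forms that actually occur in
the corpus, and that any two unrelated words the stemmer conflates will
expand to each other as well.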

best,
-Mike
