On Fri, Jan 8, 2010 at 16:27, Jamie <ja...@stimulussoft.com> wrote:
> Hi Ian / Will
>
> Thanks. Surely, the Porter Stemmer should not stem proper nouns, i.e. it
> could check the capitalization of the first letter of a word and whether or
> not the word is at the start of a sentence. If so, it could choose not to
> apply any stemming. Or am I completely out of whack?
Look again: you're downcasing the terms before the Porter filter ever
sees them (which is, AIUI, necessary).  You might do well to combine
the tokenizing and downcasing step with some heuristic to find proper
nouns and not downcase or stem them.
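
A rough sketch of what I mean, in plain Java rather than as a real
Lucene TokenFilter. Everything here is an assumption for illustration:
the class name is made up, stem() is a hypothetical stand-in for the
Porter stemmer (it only strips a trailing "s"), and "capitalized but
not at the start of a sentence" is just one possible heuristic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class ProperNounAwareNormalizer {

    // Hypothetical stand-in for a real stemmer such as Porter.
    // It only strips a trailing "s" so the example stays self-contained.
    static String stem(String term) {
        return (term.length() > 3 && term.endsWith("s"))
                ? term.substring(0, term.length() - 1)
                : term;
    }

    // Splits on whitespace and strips surrounding punctuation.
    // A capitalized word that is not the first word of a sentence is
    // treated as a proper noun and kept untouched; every other word
    // is downcased and then stemmed.
    static List<String> normalize(String text) {
        List<String> terms = new ArrayList<>();
        boolean sentenceStart = true;
        for (String raw : text.trim().split("\\s+")) {
            String word = raw.replaceAll("^\\P{L}+|\\P{L}+$", "");
            if (!word.isEmpty()) {
                boolean capitalized = Character.isUpperCase(word.charAt(0));
                if (capitalized && !sentenceStart) {
                    terms.add(word); // assumed proper noun: no downcase, no stem
                } else {
                    terms.add(stem(word.toLowerCase(Locale.ROOT)));
                }
            }
            // The next word starts a sentence if this token ends with
            // ., ! or ? (optionally followed by a closing quote/bracket).
            sentenceStart = raw.matches(".*[.!?][\"')\\]]*$");
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(
            normalize("Searches for Williams failed. Williams was stemmed."));
        // prints [searche, for, Williams, failed, william, was, stemmed]
    }
}
```

Note the obvious weakness: a proper noun at the start of a sentence
still gets downcased and stemmed (the second "Williams" above), so
you would need something smarter there, and you would have to apply
the same treatment to query terms for them to match.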

Will

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
