If the index was built with the stemmer, you won't get hits for "security" unless the 
query is stemmed too. The token stored in the index is the stem of "security", not the 
word itself.

If you don't use the stemming algorithm when you create the index you could search for 
"security" and only get those documents that contain "security".

See the FAQ 
http://lucene.sourceforge.net/cgi-bin/faq/faqmanager.cgi?file=chapter.indexing&toc=faq#q15
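
To make the index/query mismatch concrete, here is a minimal sketch in plain Java. It does not use Lucene at all: the class name, the toyStem helper and the two sample documents are made up for illustration, and toyStem is a crude suffix stripper standing in for the Porter stemmer.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class StemMismatchSketch {

    // Toy suffix stripper standing in for a real stemmer; NOT the Porter algorithm.
    static String toyStem(String word) {
        for (String suffix : new String[] {"ities", "ity", "e"}) {
            if (word.endsWith(suffix) && word.length() - suffix.length() >= 3) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    // Builds a term -> document-id index, optionally stemming each word first.
    static Map<String, Set<Integer>> buildIndex(String[] docs, boolean stem) {
        Map<String, Set<Integer>> index = new HashMap<>();
        for (int id = 0; id < docs.length; id++) {
            for (String word : docs[id].toLowerCase().split("\\s+")) {
                String term = stem ? toyStem(word) : word;
                index.computeIfAbsent(term, k -> new HashSet<>()).add(id);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        String[] docs = {"database security", "secure database"};

        // Stemmed index: the raw word is absent, only its stem is stored.
        Map<String, Set<Integer>> stemmed = buildIndex(docs, true);
        System.out.println("raw term present: " + stemmed.containsKey("security"));
        System.out.println("stemmed lookup:   " + stemmed.get(toyStem("security")));

        // Unstemmed index: an exact lookup only finds the exact word.
        Map<String, Set<Integer>> plain = buildIndex(docs, false);
        System.out.println("plain lookup:     " + plain.get("security"));
    }
}
```

The stemmed lookup matches both documents (including the one that only says "secure"), while the lookup against the unstemmed index matches only the document that literally contains "security".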

If you have a list of terms you want to treat differently (i.e. you know there are 
certain words you don't want to stem), you could build a custom TokenFilter that checks 
the tokens for those words before applying the stemming algorithm, then add that 
TokenFilter to your analyzer. You might also consider allowing the tokens to be 
stemmed and adding the original non-stemmed term at the same position using 
Token.setPositionIncrement(0). You might then want to figure out some way to boost the 
score on those non-stemmed tokens when you build your query (I'm not sure how you would 
accomplish that, but some custom query parsing code could do the trick).
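
A rough sketch of both ideas in plain Java follows. It deliberately avoids the Lucene classes so that it runs on its own: Tok is a hypothetical stand-in for a Lucene token (a term plus a position increment), toyStem is a crude suffix stripper and not the Porter algorithm, and the method names are made up. In a real analyzer this logic would live inside a TokenFilter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class StemFilterSketch {

    // Hypothetical stand-in for a Lucene token: a term plus its position increment.
    static final class Tok {
        final String term;
        final int positionIncrement;

        Tok(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }

        @Override
        public String toString() {
            return term + "(+" + positionIncrement + ")";
        }
    }

    // Toy suffix stripper standing in for a real stemmer; NOT the Porter algorithm.
    static String toyStem(String word) {
        for (String suffix : new String[] {"ities", "ity", "es", "e", "s"}) {
            if (word.endsWith(suffix) && word.length() - suffix.length() >= 3) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    // Idea 1: leave words on the protected list alone, stem everything else.
    static List<Tok> stemUnlessProtected(List<String> words, Set<String> protectedWords) {
        List<Tok> out = new ArrayList<>();
        for (String w : words) {
            String lower = w.toLowerCase();
            out.add(new Tok(protectedWords.contains(lower) ? lower : toyStem(lower), 1));
        }
        return out;
    }

    // Idea 2: emit the stem, then the original word at the same position
    // (position increment 0), so both forms are searchable.
    static List<Tok> stemAndKeepOriginal(List<String> words) {
        List<Tok> out = new ArrayList<>();
        for (String w : words) {
            String lower = w.toLowerCase();
            String stem = toyStem(lower);
            out.add(new Tok(stem, 1));
            if (!stem.equals(lower)) {
                out.add(new Tok(lower, 0)); // stacked on top of the stem
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> words = List.of("security", "securities", "database");
        System.out.println(stemUnlessProtected(words, Set.of("security")));
        System.out.println(stemAndKeepOriginal(words));
    }
}
```

With the second approach an exact query can target the unstemmed term while ordinary queries still match on the stem. The cost is a larger index, since most words are stored twice.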

Eric

-----Original Message-----
From: Mailing Lists Account [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, February 12, 2003 4:17 AM
To: [EMAIL PROTECTED]
Subject: Phrase query and porter stemmer


Hi,

I use PorterStemmer with my analyzer for indexing the documents.
And I have been using the same analyzer for searching too.

When I search for a phrase like "security" AND database, I would like to
avoid matches for terms like "secure" or "securities". I observed that
Google and a couple of other search engines do not return such matches.

1) In other words, in a single query, is it possible to skip the Porter
   stemmer for phrase queries and use it for other queries (such as term
   queries)?

2) As an alternative, is it advisable to manually construct a PhraseQuery by
   adding terms without applying the Porter stemmer?

regards
Ramesh



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


