Phrase search using quotes -- special Tokenizer

Philip Brown Thu, 31 Aug 2006 22:38:11 -0700

Hi,

After running some tests using the StandardAnalyzer, and getting 0 results
from the search, I believe I need a special Tokenizer/Analyzer.  Does
anybody have something that parses like the following:


- doesn't parse apart phrases (in quotes)
- doesn't parse/separate hyphentated or underscored words
other normal stuff like
- parses on whitespace
- removes periods in acronyms
- lowercases everything (even in quotes? -- maybe)

I basically have a set of terms, some of which are multi-worded phrases, but
none should ever be broken apart -- not when adding the documents, not when
querying the search results, etc.  I'm creating the field in the documents
as UN_TOKENIZED and using a StandardAnalyzer and basic Query object to get
the results.  Any suggestions and/or existing code that I could re-use to
fit this purpose?

Thanks.
-- 
View this message in context: 
http://www.nabble.com/Phrase-search-using-quotes----special-Tokenizer-tf2200760.html#a6093138
Sent from the Lucene - Java Users forum at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Phrase search using quotes -- special Tokenizer

Reply via email to