The Keyword analyzer does no stemming or input modification of any sort: think of it as WYSIWYG for index population. The Whitespace analyzer simply removes spaces from your input (still no stemming), but the tokens are the individual words. I don't have the code in front of me, so I'm not sure if the Whitespace analyzer uses a LowerCaseTokenizer or not. Analyzers don't break on partial terms; they may take a term and break it down further, but the starting point is the term itself.
The analyzer won't detect the patterns -- it's your query. You'll want to look at WildcardQuery and/or PhraseQuery. Given the sequences you've listed as matches, you'll have to experiment with indexing to meet those scenarios. Start there and that should give you a better idea of what you need from your analyzer. Hope this helps. -- j On 5/19/06, AsifTheManRahman <[EMAIL PROTECTED]> wrote:
I need to know how the following analyzers work: Whitespace Keyword I am looking for an analyzer that will result in a hit if the string that is queried appears in the document being searched. For example, if I am looking for "A_B_C", then I want the analyzer to detect all of the following patterns: XXXA_B_CXXX, A_B_C, A_B_CXXX, XXXA_B_C, where XXX can be any character, string, etc. I am using the Whitespace analyzer right now and it looks like it only detects A_B_C when it appears in the document as a single string, i.e. wrapped around by white spaces (which is what the Whitespace analyzer is supposed to do, if I'm not mistaken). However, I have tried using the Keyword Analyzer with 1.9.1 as well, but in vain. I would like to know how exactly this analyzer would tokenize, say, the patterns mentioned above. Besides, the KeywordAnalyzer I believe is only available in lucene 1.9.1, but I prefer 1.4.3 in this case. Is there any other analyzer that could probably serve my purpose? If not, then I'm guessing I'll have to build my own. Since I am new to this kind of stuff, I was wondering if anyone could give me a heads-on on how/where to start. Thanks in advance. -- View this message in context: http://www.nabble.com/Analyzer+question-t1650271.html#a4469840 Sent from the Lucene - Java Users forum at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]