Hi thanks for the reply >> my dataset also seems to have a similar problem the chemical name >> alpha-androstane-3, and several others exsists in the given text, can anyone point out what is the best stratergy >> to employ so as to index >> words containing - _ + to be indexed as they are and not face being mutilated ? > >You have to use or write an Analyzer that doesn't tokenize on >non-letter or other characters.
Are there any built in analyzers that do that ? >> currently on my indexes the StandardAnalyzer and QueryParser break >> up >> alpha-androstane-3 >> into TEXT:alpha -TEXT:androstane -TEXT:3 , where TEXT is the Field >> to be >> searched > >Hm, I thought we've fixed QueryParser not to do this. Are you using >Lucene 1.4? no, i guess I will have to Rupinder --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
