For parsing and indexing RTF, see chapter 7 of Lucene in Action.
For indexing text that mixes multiple languages, I don't have a solid recommendation. The practical first step is to try the StandardAnalyzer and see whether it produces satisfactory results. What you'd really need, though, is a smart analyzer that knows how to properly tokenize and filter words from multiple languages, and I haven't heard of anyone doing that here.

Otis

--- tirupathi reddy <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I have to index the text in a .txt document. This text document
> contains English characters, German characters, etc. Please tell me
> how I can index that text document. Can the procedure for indexing RTF
> documents be applied here?
>
> Thanks,
> MTREDDY
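
P.S. One way to judge whether tokenization of mixed English/German text is "satisfactory" is to print the tokens and look at them. The sketch below is only a rough stand-in for that kind of check. It does not use Lucene or the StandardAnalyzer at all; it relies on the JDK's java.text.BreakIterator, and the class name MixedTextTokens and the method tokenize are hypothetical names made up for illustration. A real analyzer may split and filter differently.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class MixedTextTokens {

    // Splits text on Unicode word boundaries and lowercases each word.
    // Illustration only: this is NOT Lucene's StandardAnalyzer.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        BreakIterator words = BreakIterator.getWordInstance(Locale.ROOT);
        words.setText(text);
        int start = words.first();
        for (int end = words.next(); end != BreakIterator.DONE; start = end, end = words.next()) {
            String piece = text.substring(start, end);
            // Keep only pieces that begin with a letter or digit;
            // this drops whitespace and punctuation segments.
            if (Character.isLetterOrDigit(piece.codePointAt(0))) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // u-umlaut and sharp s are written as Unicode escapes so the
        // source compiles regardless of the file encoding.
        String sample = "Gr\u00fc\u00dfe aus Freiburg, hello world";
        System.out.println(tokenize(sample));
    }
}
```

If the tokens printed for a sample of your own documents look wrong (for example, German words cut at umlauts), check first that the file is being read with the right character encoding before blaming the tokenizer.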