RE: Single Analyzer for multiple European languages

2005-09-27 Thread Madhu Satyanarayana Panitini
Hi all, One more idea would be to use cryptograms to differentiate between languages; you can then delete stopwords and apply stemming for the particular language. Regards, Madhu -----Original Message----- From: Endre Stølsvik [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 27, 2005 4:08
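
The flow suggested in this message (identify the language first, then delete stopwords and stem for that language) might be sketched roughly as follows. This is plain Java rather than Lucene code, and several parts are assumptions made only for illustration: the class name PerLanguagePipeline, the tiny stopword lists, and above all the language guess, which here is a simple stopword-overlap count standing in for whatever detection method is actually used. Stemming is left as a pluggable function that defaults to a no-op.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.UnaryOperator;

public class PerLanguagePipeline {
    // Tiny illustrative stopword lists (assumption: real lists are far longer).
    static final Map<String, Set<String>> STOPWORDS = new LinkedHashMap<>();
    static {
        STOPWORDS.put("en", new HashSet<>(Arrays.asList("the", "and", "is", "of", "a")));
        STOPWORDS.put("de", new HashSet<>(Arrays.asList("der", "die", "und", "ist", "das")));
        STOPWORDS.put("fr", new HashSet<>(Arrays.asList("le", "la", "et", "est", "les")));
    }

    // Lowercase the text and split it on runs of non-letter characters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Hypothetical detector: pick the language whose stopwords occur most often.
    // Returns "unknown" when no stopword from any list is found.
    static String guessLanguage(List<String> tokens) {
        String best = "unknown";
        int bestHits = 0;
        for (Map.Entry<String, Set<String>> e : STOPWORDS.entrySet()) {
            int hits = 0;
            for (String t : tokens) {
                if (e.getValue().contains(t)) hits++;
            }
            if (hits > bestHits) {
                bestHits = hits;
                best = e.getKey();
            }
        }
        return best;
    }

    // Detect the language, drop its stopwords, then apply its stemmer (if any).
    static List<String> process(String text, Map<String, UnaryOperator<String>> stemmers) {
        List<String> tokens = tokenize(text);
        String lang = guessLanguage(tokens);
        Set<String> stop = STOPWORDS.getOrDefault(lang, Collections.emptySet());
        UnaryOperator<String> stem = stemmers.getOrDefault(lang, UnaryOperator.identity());
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            if (!stop.contains(t)) out.add(stem.apply(t));
        }
        return out;
    }

    public static void main(String[] args) {
        // No stemmers registered, so tokens pass through unchanged after stopword removal.
        System.out.println(process("Der Hund und die Katze", new HashMap<>()));
    }
}
```

A per-language stemmer would be registered in the map passed to process, keyed by the same language code as the stopword list.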

preserving document attributes

2005-09-14 Thread Madhu Satyanarayana Panitini
Hi all, I have text docs similar to the TREC format, something like this: Full-text search with one or more keywords, with advanced search operators to enhance the search, has to be implemented. Advanced search with document attributes like author, title, type and Meta keyword in add

RE: Splitting of words

2005-09-13 Thread Madhu Satyanarayana Panitini
ribs or javadoc of core). For which language do you wish to use that? paul On 13 Sept. 05, at 11:45, Madhu Satyanarayana Panitini wrote: > Hi all > > I want to know the split pattern of text before indexing in Lucene. Does it > split wherever there is a space between the words? Or

Splitting of words

2005-09-13 Thread Madhu Satyanarayana Panitini
Hi all, I want to know the split pattern of text before indexing in Lucene. Does it split wherever there is a space between the words, or is there some other pattern for splitting the words of a text document? In which program can I find the code for splitting the words? Madhu Madhu Satyanarayana
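
To make the question concrete: in Lucene the analyzer chosen at indexing time decides where text is split, so splitting only at spaces is one possible behaviour among several. The sketch below is plain Java, not Lucene's own tokenizer code, and the class name SplitDemo and both method names are made up for illustration. It contrasts splitting at whitespace with splitting at every non-letter character (plus lowercasing).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class SplitDemo {
    // Split only where whitespace occurs; punctuation stays attached to the words.
    static List<String> splitOnWhitespace(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) return new ArrayList<>();
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Split at every run of non-letter characters and lowercase each piece.
    static List<String> splitOnNonLetters(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.split("[^\\p{L}]+")) {
            if (!t.isEmpty()) out.add(t.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "Full-text search, e.g. Lucene!";
        System.out.println(splitOnWhitespace(text)); // [Full-text, search,, e.g., Lucene!]
        System.out.println(splitOnNonLetters(text)); // [full, text, search, e, g, lucene]
    }
}
```

The two outputs differ on hyphens, commas and abbreviations, which is why the choice of splitting rule matters for what can later be matched by a query.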