Hi all,
One more idea would be to use cryptograms to differentiate between
languages; you can then remove stopwords and apply stemming
for the particular language.
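
A minimal sketch of the per-language step, assuming the language code has
already been determined by some detector. The stopword sets, the suffix
lists, and the names naive_stem and normalize are all hypothetical and only
illustrate the idea; they are not Lucene's analyzers, and a real setup would
use full stopword lists and a proper stemmer for each language.

```python
# Hypothetical, deliberately tiny stopword lists; real lists are much longer.
STOPWORDS = {
    "en": {"the", "a", "an", "and", "of", "is", "in", "to"},
    "fr": {"le", "la", "les", "et", "de", "un", "une", "est"},
}

# Hypothetical suffix lists for a naive stemmer, longest suffix first.
SUFFIXES = {
    "en": ["ing", "ed", "es", "s"],
    "fr": ["ements", "ement", "es", "s"],
}


def naive_stem(word, lang):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES.get(lang, []):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(text, lang):
    """Lowercase, split on whitespace, drop stopwords, then stem."""
    stop = STOPWORDS.get(lang, set())
    tokens = text.lower().split()
    return [naive_stem(t, lang) for t in tokens if t not in stop]


print(normalize("The indexing of documents", "en"))
```

An unknown language code falls back to no stopword removal and no stemming,
so the text is only lowercased and split.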
Regards
madhu
-Original Message-
From: Endre Stølsvik [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, September 27, 2005 4:08
Hi all
I have text docs similar to the TREC format, something like this:
Full-text search with one or more keywords, plus advanced search
operators to refine the search, has to be implemented. Advanced search with
document attributes like author, title, type and meta keyword in
add
ribs or
javadoc of core).
For which language do you wish to use that?
paul
On 13 Sept. 05, at 11:45, Madhu Satyanarayana Panitini wrote:
> Hi all
>
> I want to know the split pattern applied to text before indexing in
> Lucene. Does it split wherever there is a space between words, or
Hi all
I want to know the split pattern applied to text before indexing in
Lucene. Does it split wherever there is a space between words, or is there
some other pattern for splitting the words of a text document? In which
program can I find the code that splits the words?
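
To make the two possibilities concrete, here is a small sketch contrasting a
whitespace split with a split at every non-letter character. The function
names are hypothetical and this is not Lucene code. In Lucene itself the
splitting is decided by the Analyzer used at indexing time (classes in
org.apache.lucene.analysis): WhitespaceAnalyzer behaves roughly like the
first function and SimpleAnalyzer roughly like the second.

```python
import re


def split_on_whitespace(text):
    """Split only where there is whitespace; punctuation stays attached."""
    return text.split()


def split_on_non_letters(text):
    """Keep runs of letters only, lowercased; everything else separates tokens."""
    # [^\W\d_]+ matches one or more Unicode letters (word chars minus digits and _).
    return [t.lower() for t in re.findall(r"[^\W\d_]+", text)]


sample = "Lucene's split-pattern, explained."
print(split_on_whitespace(sample))
print(split_on_non_letters(sample))
```

The choice matters for search: with the whitespace split a query for
"pattern" would not match the token "split-pattern,", while the letter-based
split indexes "split" and "pattern" separately.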
Madhu
Madhu Satyanarayana