Thanks Erik,
Here I describe my research on this problem. It might be helpful
to someone :)
I will divide the problem of multi-language documents into several subproblems:
*1. Determining the language of the text documents.
1.1. Determining the language of a document when the whole text is in on
I know this has been discussed several times, but I sure don't remember the
answers. Search the mail archive for "multiple languages" and you'll find
some good suggestions. But as I remember, it's not a trivial issue.
But I don't see why the "three different documents" approach wouldn't work.
You c
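As a rough illustration of how a mixed document might be cut into parts before each part is indexed on its own, here is a minimal sketch in plain Java. It is not a Lucene API and not necessarily what was meant above: the class `ScriptSplitter` and its `Run` type are hypothetical names, and the splitting rule (break whenever the Unicode script of the letters changes) is an assumption. It only separates scripts, so it can tell Chinese from English but not English from Portuguese; languages that share a script would still need a real language identifier.

```java
import java.lang.Character.UnicodeScript;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits text into runs that each use a single Unicode script. */
public class ScriptSplitter {

    /** One run of text together with the script of its letters. */
    public static final class Run {
        public final UnicodeScript script;
        public final String text;

        Run(UnicodeScript script, String text) {
            this.script = script;
            this.text = text;
        }
    }

    public static List<Run> split(String text) {
        List<Run> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        UnicodeScript currentScript = null;

        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            UnicodeScript script = UnicodeScript.of(cp);

            // Spaces, digits and punctuation carry no script of their own,
            // so they stay attached to whatever run is currently open.
            boolean neutral = script == UnicodeScript.COMMON
                    || script == UnicodeScript.INHERITED
                    || script == UnicodeScript.UNKNOWN;

            // A letter in a different script closes the open run.
            if (!neutral && currentScript != null && script != currentScript) {
                runs.add(new Run(currentScript, current.toString().trim()));
                current.setLength(0);
            }
            if (!neutral) {
                currentScript = script;
            }
            current.appendCodePoint(cp);
        }

        // Flush the last run, unless the text contained no letters at all.
        if (currentScript != null) {
            runs.add(new Run(currentScript, current.toString().trim()));
        }
        return runs;
    }
}
```

Each returned run could then be handed to whatever analyzer suits its script and added to the index as a separate document or field; that choice is left open here.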
Hi All,
Our application, which uses Lucene for indexing, will be used to index
documents, each of which contains parts written in different
languages. For example, a document could contain English, Chinese and
Brazilian text. So how should such a document be indexed? Is there some best
practice to do