We wrote our own MultiSearcher-style class to handle this problem. It
takes a query in the user's native language and passes it to each
language-specific searcher. That searcher uses a machine translation
component to translate the query, then builds the query for its index
using that language's Analyzer.
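
A minimal sketch of that idea follows. It is not the class described above and it does not use the Lucene API: LanguageIndex, PerLanguageSearcher, and the toy analyzers and dictionary are all hypothetical stand-ins, and the user's native language is assumed to be English. It only shows the shape of the approach, where each index translates and analyzes the query itself before searching.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for one language-specific index; not a Lucene class. */
final class LanguageIndex {
    final String language;
    private final Function<String, String> translator;       // stand-in for machine translation
    private final Function<String, List<String>> analyzer;   // stand-in for a Lucene Analyzer
    private final Map<String, List<String>> postings = new LinkedHashMap<>(); // term -> doc ids

    LanguageIndex(String language, Function<String, String> translator,
                  Function<String, List<String>> analyzer) {
        this.language = language;
        this.translator = translator;
        this.analyzer = analyzer;
    }

    /** Index time: analyze the document with this language's analyzer. */
    void add(String docId, String text) {
        for (String term : analyzer.apply(text)) {
            postings.computeIfAbsent(term, k -> new ArrayList<>()).add(docId);
        }
    }

    /** Query time: translate the user's query, then analyze it the same way. */
    List<String> search(String userQuery) {
        List<String> hits = new ArrayList<>();
        for (String term : analyzer.apply(translator.apply(userQuery))) {
            for (String docId : postings.getOrDefault(term, List.of())) {
                if (!hits.contains(docId)) hits.add(docId);
            }
        }
        return hits;
    }
}

/** Hypothetical wrapper: fans one user query out to one or all language indexes. */
final class PerLanguageSearcher {
    private final Map<String, LanguageIndex> indexes = new LinkedHashMap<>();

    void register(LanguageIndex index) { indexes.put(index.language, index); }

    /** Searches the named languages, or every registered index if none are named. */
    List<String> search(String userQuery, String... languages) {
        List<String> merged = new ArrayList<>();
        for (LanguageIndex index : indexes.values()) {
            if (languages.length == 0 || Arrays.asList(languages).contains(index.language)) {
                for (String id : index.search(userQuery)) merged.add(index.language + ":" + id);
            }
        }
        return merged;
    }
}

final class PerLanguageSearchDemo {
    /** Toy analysis: lower-case and split on anything that is not a letter. */
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!t.isEmpty()) out.add(t);
        }
        return out;
    }

    /** Toy English analysis: additionally strips a trailing "s" from longer words. */
    static List<String> english(String text) {
        List<String> out = new ArrayList<>();
        for (String t : tokens(text)) {
            out.add(t.length() > 3 && t.endsWith("s") ? t.substring(0, t.length() - 1) : t);
        }
        return out;
    }

    /** Toy English-to-German "translation" from a two-entry dictionary. */
    static String toGerman(String query) {
        Map<String, String> dictionary = Map.of("house", "haus", "houses", "haus");
        List<String> out = new ArrayList<>();
        for (String t : tokens(query)) out.add(dictionary.getOrDefault(t, t));
        return String.join(" ", out);
    }

    static PerLanguageSearcher build() {
        LanguageIndex en = new LanguageIndex("en", q -> q, PerLanguageSearchDemo::english);
        LanguageIndex de = new LanguageIndex("de", PerLanguageSearchDemo::toGerman,
                                             PerLanguageSearchDemo::tokens);
        en.add("e1", "Old houses");
        de.add("d1", "Das Haus");
        PerLanguageSearcher searcher = new PerLanguageSearcher();
        searcher.register(en);
        searcher.register(de);
        return searcher;
    }

    public static void main(String[] args) {
        PerLanguageSearcher searcher = build();
        System.out.println(searcher.search("houses"));       // all indexes
        System.out.println(searcher.search("houses", "de")); // German index only
    }
}
```

In a real Lucene setup the analyzer stand-in would be the Analyzer the index was written with, and the postings map would be the index itself; the point here is only that the query is re-analyzed per index instead of once up front.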
-Grant
[EMAIL PROTECTED] wrote:
Hello,
we need to index and search documents of multiple languages.
Our current approach is:
Determine the language of each document before passing it to Lucene and use
a Lucene index for each language. This seems to be necessary because the
IndexWriter takes an analyzer as parameter. Thus we can pass the English
documents to the IndexWriter created with the English analyzer and so on.
Our problem is the search: we would like to be able to search either one or
all of the language-specific indexes. That is not a problem in itself, because
we can use the MultiSearcher. But the MultiSearcher takes a single query as its
parameter, and that query is generated using one analyzer. We would need to
generate a differently analyzed query for each index.
Has anybody found a solution to this problem, or can you point us in a
direction to investigate further?
Greetings
Peter and Stefan
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--
Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
School of Information Studies
335 Hinds Hall
Syracuse, NY 13244
http://www.cnlp.org
Voice: 315-443-5484
Fax: 315-443-6886