We wrote our own MultiSearcher-style class to handle this problem. It takes a query in the user's native language and hands it to each language's searcher, which uses a machine translation component to create a query for its index with that language's Analyzer.
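
For readers who want something concrete, here is a minimal sketch of that idea. It is not the actual class and it does not use the Lucene API: PerLanguageSearch, LanguageSearcher, searchAll and simpleAnalyze are hypothetical names, the "translator" is just a string function standing in for machine translation, and the in-memory map stands in for a per-language index. The only point it illustrates is that the query is translated and analyzed separately for each index before the hits are combined.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.function.UnaryOperator;

/** Hypothetical sketch: one searcher per language, each with its own translator and analyzer. */
class PerLanguageSearch {

    /** Stand-in for one language-specific index plus the analyzer it was built with. */
    static final class LanguageSearcher {
        final String language;
        final UnaryOperator<String> translator;          // assumption: stands in for machine translation
        final Function<String, List<String>> analyzer;   // assumption: stands in for a language-specific Analyzer
        final Map<String, List<String>> index = new LinkedHashMap<>(); // docId -> analyzed terms

        LanguageSearcher(String language,
                         UnaryOperator<String> translator,
                         Function<String, List<String>> analyzer) {
            this.language = language;
            this.translator = translator;
            this.analyzer = analyzer;
        }

        /** Index a document with this language's analyzer. */
        void add(String docId, String text) {
            index.put(docId, analyzer.apply(text));
        }

        /** Translate the user's query, analyze it the same way as the index, and match all terms. */
        List<String> search(String userQuery) {
            List<String> terms = analyzer.apply(translator.apply(userQuery));
            List<String> hits = new ArrayList<>();
            for (Map.Entry<String, List<String>> doc : index.entrySet()) {
                if (doc.getValue().containsAll(terms)) {
                    hits.add(language + ":" + doc.getKey());
                }
            }
            return hits;
        }
    }

    /** Send the same user query to every selected searcher and concatenate the hits. */
    static List<String> searchAll(String userQuery, List<LanguageSearcher> searchers) {
        List<String> all = new ArrayList<>();
        for (LanguageSearcher searcher : searchers) {
            all.addAll(searcher.search(userQuery));
        }
        return all;
    }

    /** Toy analyzer: lowercase the text and split it on anything that is not a letter. */
    static List<String> simpleAnalyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!term.isEmpty()) {
                terms.add(term);
            }
        }
        return terms;
    }
}
```

Searching only one language is then a matter of passing a one-element list to searchAll. Note that this sketch simply concatenates hits; merging relevance scores across indexes is a separate question it does not address.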

-Grant

[EMAIL PROTECTED] wrote:
Hello,

we need to index and search documents of multiple languages.
Our current approach is:

Determine the language of each document before passing it to Lucene and use
a separate Lucene index for each language. This seems to be necessary because
the IndexWriter takes an analyzer as a parameter. Thus we can pass the English
documents to the IndexWriter created with the English analyzer, and so on.

Our problem is the search: we would like to be able to search in only one or
in all of the language-specific indexes. That is not a problem in itself,
because we can use the MultiSearcher. But the MultiSearcher takes a single
query as its parameter, and that query is generated using one analyzer. We
would need to generate differently analyzed queries for the different indexes.

Has anybody found a solution to this problem who can point us in a direction
to investigate further?

Greetings
Peter and Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



--

Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
School of Information Studies
335 Hinds Hall
Syracuse, NY 13244
http://www.cnlp.org
Voice: 315-443-5484
Fax: 315-443-6886
