For <1>: not that I know of. What you can do, and relatively simply
at that, is write a SolrJ program that uses Tika to parse the files
on the *client*. At that point you can do anything you like, including
detecting the language, routing the document to the right core, etc. This
also gives you more control over how the metadata parsed by Tika is
mapped to your documents.
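
To make the routing step concrete, here is a minimal sketch of the "pick the core
once the language is known" part, using only the Java standard library. The class
name CoreRouter, the core names, and the URLs are made up for illustration. In a
real client the language code would come from Tika (for example by running the
extracted text through its language identifier) and the chosen URL would be
handed to a SolrJ server object; neither call is shown here.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/**
 * Hypothetical helper: maps a detected language code to the URL of the
 * Solr core that should receive the document.
 */
final class CoreRouter {
    private final Map<String, String> coreUrlByLanguage = new LinkedHashMap<>();
    private final String fallbackCoreUrl;

    /** @param fallbackCoreUrl core used when the language is unknown or unmapped */
    CoreRouter(String fallbackCoreUrl) {
        this.fallbackCoreUrl = fallbackCoreUrl;
    }

    /** Registers a core for a language code such as "en" or "de". */
    CoreRouter register(String languageCode, String coreUrl) {
        coreUrlByLanguage.put(languageCode.toLowerCase(Locale.ROOT), coreUrl);
        return this;
    }

    /** Returns the core URL for the language, or the fallback core. */
    String coreUrlFor(String languageCode) {
        if (languageCode == null) {
            return fallbackCoreUrl;
        }
        String key = languageCode.toLowerCase(Locale.ROOT);
        return coreUrlByLanguage.getOrDefault(key, fallbackCoreUrl);
    }
}
```

Usage would be along the lines of building one router at startup with
register("en", ...) and register("de", ...), then calling coreUrlFor(lang) for
each parsed file. Keeping a fallback core means documents whose language could
not be detected reliably are still indexed somewhere.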

About <2>: this is hard, mostly because queries have very
little text to analyze. Consider not worrying about it and just sending
the query to all the languages, in the hope that anything
language-specific is scored higher.
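
In a multicore setup, one way to read "send the query to all the languages" is a
distributed search over every language core via Solr's shards request parameter,
which takes a comma-separated list of host:port/path entries without the
http:// prefix. The sketch below only builds that parameter value; the class
name, host, and core names are assumptions. Note that distributed search expects
the cores to have compatible schemas (in particular the same uniqueKey field),
so this only applies if your per-language cores are set up that way.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper that builds the value of Solr's "shards" parameter. */
final class ShardsParam {
    private ShardsParam() {
    }

    /**
     * @param hostAndContext e.g. "localhost:8983/solr" (no "http://" prefix)
     * @param coreNames      names of the per-language cores to search
     * @return comma-separated shard list, one entry per core
     */
    static String build(String hostAndContext, List<String> coreNames) {
        return coreNames.stream()
                .map(core -> hostAndContext + "/" + core)
                .collect(Collectors.joining(","));
    }
}
```

The resulting string would be added to an ordinary query request as
shards=<value>, sent to any one of the cores.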

You can do things like detect the language the browser defaults
to or ask the user to provide a "preferred language", but trying
to determine the language based on a short phrase is notoriously
hard.
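
As a rough illustration of the browser-default idea: browsers send an
Accept-Language header, and the Java standard library (Java 8 and later) can
match it against the languages you have cores for. The class name and the
fallback behaviour below are assumptions, not anything Solr provides.

```java
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: picks a supported language from an Accept-Language header. */
final class PreferredLanguage {
    private PreferredLanguage() {
    }

    /**
     * @param acceptLanguageHeader raw header value, e.g. "de-CH,de;q=0.9,en;q=0.8"
     * @param supported            language tags that have a core, e.g. ["en", "de"]
     * @param fallback             language to use when nothing matches
     */
    static String pick(String acceptLanguageHeader, List<String> supported, String fallback) {
        if (acceptLanguageHeader == null || acceptLanguageHeader.trim().isEmpty()) {
            return fallback;
        }
        List<Locale.LanguageRange> ranges;
        try {
            // Parses the header and orders the ranges by descending q-value.
            ranges = Locale.LanguageRange.parse(acceptLanguageHeader);
        } catch (IllegalArgumentException e) {
            return fallback; // malformed header
        }
        // RFC 4647 lookup: "de-CH" falls back to "de" if only "de" is supported.
        String tag = Locale.lookupTag(ranges, supported);
        return tag != null ? tag : fallback;
    }
}
```

The returned tag can then be used to choose which core to query, or simply to
boost that language's results when querying all of them.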

Best
Erick

On Sun, Jan 29, 2012 at 10:50 PM, bing <nibing_...@hotmail.com> wrote:
> Hi, all,
>
> I am going to implement multilingual search in multicore Solr. Specifically,
> the design of the Solr server is this: I have several cores corresponding to
> different languages, where each core has its own configuration files and data.
>
> I have the following questions:
>
> 1. While indexing a document, I use ExtractingRequestHandler with Tika 0.10
> (embedded in Solr 3.5.0) and I get a field "language_s" after indexing. Is
> it possible to get the "language_s" value before indexing happens, so
> that I can put the document in the corresponding core?
>
> 2. When searching with a query, is it possible to use a language
> detection function to determine the language code of the query, so that I
> can direct the query to the corresponding core?
>
> Thanks for your suggestions.
>
> Note: In this thread I would like to stick to multicore Solr and see
> whether the problems can be solved. Meanwhile, I am aware that
> multilingual search does not necessarily need multicore Solr, which I
> learned in a previous thread.
> http://lucene.472066.n3.nabble.com/Tika0-10-language-identifier-in-Solr3-5-0-tt3671712.html#none
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Multilingual-search-in-multicore-solr-tp3698969p3698969.html
> Sent from the Solr - User mailing list archive at Nabble.com.
