For <1>: not that I know of. What you can do, and relatively simply at that, is write a SolrJ program that uses Tika to parse the files on the *client*. At that point you can do anything you'd like, including detecting the language, routing the document to the right core, etc. This also gives you more control over how metadata parsed by Tika is mapped to your documents.
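
To make the routing step concrete, here is a minimal sketch of just the "pick a core from the detected language" part, using only the JDK. Everything in it is an assumption for illustration: the class name LanguageRouter, the core names, and the fallback core are made up, and the detector is passed in as a plain function. In a real client that function would presumably wrap Tika's language identifier (I believe that is org.apache.tika.language.LanguageIdentifier in Tika 0.10, but check your version), and the chosen core name would be used to build the URL of the SolrJ server object you send the document to.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: decides which Solr core a document belongs to,
// based on the language detected from its extracted text.
final class LanguageRouter {
    private final Map<String, String> coreByLanguage; // e.g. "de" -> "core_de" (made-up names)
    private final String fallbackCore;                // used when detection fails or is unmapped
    private final Function<String, String> detector;  // text -> language code, or null if unknown

    LanguageRouter(Map<String, String> coreByLanguage,
                   String fallbackCore,
                   Function<String, String> detector) {
        this.coreByLanguage = new HashMap<>(coreByLanguage);
        this.fallbackCore = fallbackCore;
        this.detector = detector;
    }

    // Returns the name of the core the document should be indexed into.
    String coreFor(String extractedText) {
        String lang = detector.apply(extractedText);
        if (lang == null) {
            return fallbackCore;
        }
        // Normalize case so "DE" and "de" map to the same core.
        return coreByLanguage.getOrDefault(lang.toLowerCase(Locale.ROOT), fallbackCore);
    }
}
```

The client loop would then be roughly: parse the file with Tika, call coreFor() on the extracted text, and add the document to that core through SolrJ. Keeping a fallback core means documents in an undetected or unsupported language still get indexed somewhere.
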
About <2>: this is hard, mostly because queries have very little text to analyze. Consider not worrying about it and just sending the query to all the language cores, in the hope that anything language-specific is scored higher. You can do things like detect the language the browser defaults to or ask the user to provide a "preferred language", but trying to determine the language from a short phrase is notoriously hard.

Best
Erick

On Sun, Jan 29, 2012 at 10:50 PM, bing <nibing_...@hotmail.com> wrote:
> Hi, all,
>
> I am going to multilingual search in multicore solr. Specifically, the
> design of the solr server is like: I have several cores corresponding to
> different languages, where each core has its configuration files and data.
>
> I have following questions:
>
> 1. While indexing a document, I use ExtractingRequestHandler in Tika0.10
> (embed in Solr3.5.0) and I can get a field "language_s" after indexing. Is
> it possible to get the info of the "language_s" before indexing happens, so
> that I can put the document in the corresponding core?
>
> 2. In searching with a query, is it possible that I can use language
> detection function to determine the language code of the query, so that I
> direct the query to the corresponding core?
>
> Thanks for your suggestions.
>
> Note: In this thread I would like to stick on multicore solr and want to
> see whether the problems can be solved. Meanwhile, I am aware that
> multilingual search does not necessarily need multicore solr, which I have
> learned in previous thread.
> http://lucene.472066.n3.nabble.com/Tika0-10-language-identifier-in-Solr3-5-0-tt3671712.html#none
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Multilingual-search-in-multicore-solr-tp3698969p3698969.html
> Sent from the Solr - User mailing list archive at Nabble.com.