Re: multi-language searching with Solr

2008-05-08 Thread Gereon Steffens
These are shards of one index and not multiple indexes. There is probably a way to get each shard to contain one language, but then you end up with x servers for x languages, and some will be underutilized while others will be overutilized. Schemas will be identical, except for analysers. The l

Re: multi-language searching with Solr

2008-05-07 Thread Gereon Steffens
I have the same requirement, and from what I understand the distributed search feature will help implement this, by having one shard per language. Am I right? Gereon Mike Klaas wrote: On 5-May-08, at 1:28 PM, Eli K wrote: Wouldn't this impact both indexing and search performance and the

Re: Solr 1.2 and MoreLikeThis

2007-10-29 Thread Gereon Steffens
DEFALT/DEFAULT typo), and MLT seems to work. Is that the correct procedure? If so, I'll update the wiki accordingly. Gereon Gereon Steffens wrote: Hi, I've been trying to get MoreLikeThis running in Solr 1.2, so far without success. Since there is no mention of any special installat

Solr 1.2 and MoreLikeThis

2007-10-29 Thread Gereon Steffens
Hi, I've been trying to get MoreLikeThis running in Solr 1.2, so far without success. Since there is no mention of any special installation steps in the Wiki, I had assumed that it was built into 1.2, but that does not seem to be the case. So now I've downloaded the patches from SOLR-69, and

Re: AW: UTF-8 2-byte vs 4-byte encodings

2007-05-02 Thread Gereon Steffens
Hi Christian, > It is not sufficient to set the encoding in the XML but > you need an additional HTTP header to set the encoding ("Content-type: > text/xml; charset=UTF-8") Thanks, that's what I was missing. Gereon
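
For anyone finding this thread later, here is a minimal sketch of what the quoted advice amounts to: encode the XML body as UTF-8 and declare the same charset in the HTTP header. The URL, the field names, and the helper name build_update_request are assumptions for illustration only, not taken from this thread; adjust them to your own installation.

```python
import urllib.request

# Hypothetical update URL (an assumption); adjust host, port, and path.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_update_request(xml_text, url=SOLR_UPDATE_URL):
    """Return a POST request whose body and declared charset are both UTF-8."""
    body = xml_text.encode("utf-8")  # encode the payload explicitly
    return urllib.request.Request(
        url,
        data=body,
        # The header quoted above: the charset must be stated here as well,
        # not only in the XML declaration.
        headers={"Content-Type": "text/xml; charset=UTF-8"},
        method="POST",
    )


# Illustrative document containing an "e acute" (U+00E9).
doc = (
    '<add><doc><field name="id">1</field>'
    '<field name="name">caf\u00e9</field></doc></add>'
)
request = build_update_request(doc)
# urllib.request.urlopen(request) would send it; left out so the sketch runs offline.
```

The request is only built, not sent, so the snippet can be tried without a running server.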

UTF-8 2-byte vs 4-byte encodings

2007-05-02 Thread Gereon Steffens
Hi, I have a question regarding UTF-8 encodings, illustrated by the utf8-example.xml file. This file contains raw, unescaped UTF-8 characters, for example the "e acute" character, represented as two bytes 0xC3 0xA9. When this file is added to Solr and retrieved later, the XML output contains a fou
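
A small sketch of the byte arithmetic behind the subject line. It rests on an assumption that is not confirmed in this post: that the receiving side reads the UTF-8 bytes as ISO-8859-1 and later re-encodes them as UTF-8, which is one common way for two bytes to become four.

```python
# "e acute" is U+00E9; its UTF-8 encoding is the two bytes 0xC3 0xA9.
e_acute = "\u00e9"
two_bytes = e_acute.encode("utf-8")

# Assumed failure mode: the bytes are decoded as ISO-8859-1 (Latin-1),
# so each byte becomes its own character (U+00C3 and U+00A9).
misread = two_bytes.decode("latin-1")

# Re-encoding those two characters as UTF-8 yields four bytes.
four_bytes = misread.encode("utf-8")

print(two_bytes.hex(" "))
print(four_bytes.hex(" "))
```

Decoding with the correct charset at the first step leaves the two-byte form untouched.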