Re: Searching combined English-Japanese index

2007-10-04 Thread Maximilian Hütter
You were right, the indexing is already wrong. I debugged Solr and saw that the IndexWriter gets the wrong values. That was because of the missing charset in the Content-Type of the update requests. It was just text/xml without charset=utf-8, so it was interpreted as ISO-8859-1, I think. Changing the charset
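
For readers hitting the same symptom, here is a minimal sketch of the point about the update request header. It only builds the request and does not send it. The URL, the `title` field name, and the helper `build_update_request` are assumptions made for illustration and do not come from the thread. The last lines show how UTF-8 bytes look when a receiver decodes them as ISO-8859-1, which is the kind of misreading suspected above.

```python
from urllib.request import Request

# Hypothetical endpoint: the thread does not give the actual Solr URL.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_update_request(xml_doc: str, url: str = SOLR_UPDATE_URL) -> Request:
    """Build (but do not send) a POST request whose header declares UTF-8."""
    body = xml_doc.encode("utf-8")  # the bytes really are UTF-8 ...
    return Request(
        url,
        data=body,
        # ... and the header says so, instead of a bare "text/xml".
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


# Hypothetical document mixing English and Japanese in one field.
doc = '<add><doc><field name="title">Solr 検索</field></doc></add>'
req = build_update_request(doc)

# What a receiver sees if it decodes the same bytes as ISO-8859-1:
garbled = doc.encode("utf-8").decode("iso-8859-1")

# The damage is reversible only while the wrongly decoded text is still intact.
recovered = garbled.encode("iso-8859-1").decode("utf-8")
```

Sending the request (for example with `urllib.request.urlopen(req)`) is left out on purpose, since it depends on a running server.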

Re: Searching combined English-Japanese index

2007-10-02 Thread Maximilian Hütter
Yonik Seeley wrote: On 10/1/07, Maximilian Hütter [EMAIL PROTECTED] wrote: Yonik Seeley wrote: On 10/1/07, Maximilian Hütter [EMAIL PROTECTED] wrote: When I search using an English term, I get results but the Japanese is not encoded correctly in the response. (although it is UTF-8

RE: Searching combined English-Japanese index

2007-10-02 Thread Lance Norskog
[mailto:[EMAIL PROTECTED] Sent: Tuesday, October 02, 2007 1:35 AM To: solr-user@lucene.apache.org Subject: Re: Searching combined English-Japanese index Yonik Seeley wrote: On 10/1/07, Maximilian Hütter [EMAIL PROTECTED] wrote: Yonik Seeley wrote: On 10/1/07, Maximilian Hütter [EMAIL

Re: Searching combined English-Japanese index

2007-10-02 Thread Yonik Seeley
On 10/2/07, Maximilian Hütter [EMAIL PROTECTED] wrote: Are you sure they are wrong in the index? It's not an issue with Jetty output encoding, since the python writer takes the string and converts it to ASCII before that. Since Solr does no charset encoding itself on output, that must mean that

Searching combined English-Japanese index

2007-10-01 Thread Maximilian Hütter
Hi, I know there has been quite some discussion about multilanguage searching already, but I am not quite sure this applies to my case. I have an index with fields that contain Japanese and English at the same time. Is this possible? Tokenizing is not the big problem here, the

Re: Searching combined English-Japanese index

2007-10-01 Thread Yonik Seeley
On 10/1/07, Maximilian Hütter [EMAIL PROTECTED] wrote: When I search using an English term, I get results but the Japanese is not encoded correctly in the response. (although it is UTF-8 encoded) One quick thing to try is the python writer (wt=python) to see the actual Unicode values of what

Re: Searching combined English-Japanese index

2007-10-01 Thread Yonik Seeley
On 10/1/07, Maximilian Hütter [EMAIL PROTECTED] wrote: Yonik Seeley wrote: On 10/1/07, Maximilian Hütter [EMAIL PROTECTED] wrote: When I search using an English term, I get results but the Japanese is not encoded correctly in the response. (although it is UTF-8 encoded) One quick

RE: Searching combined English-Japanese index

2007-10-01 Thread Lance Norskog
combined English-Japanese index On 10/1/07, Maximilian Hütter [EMAIL PROTECTED] wrote: Yonik Seeley wrote: On 10/1/07, Maximilian Hütter [EMAIL PROTECTED] wrote: When I search using an English term, I get results but the Japanese is not encoded correctly in the response. (although