Re: seach with non English characters in lenya

Emmanouil Batsis Tue, 20 Sep 2005 09:18:03 -0700

This may be related to a recently fixed bug named "Invalid UTF-8 error for 
non-ASCII meta data"; check it out in Bugzilla or on the dev list. I believe it 
is fixed. In any case, the request/response pair must satisfy the following:


* The request must be in a proper encoding, e.g. UTF-8 (usually the UI will 
use the encoding the previous response/page was served in).
* The container must be aware that the request is indeed in that encoding, 
so that it uses it to interpret parameters (this can be done by a servlet 
filter that calls request.setCharacterEncoding before any parameter is read).
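
To illustrate why the second condition matters, here is a minimal Java sketch
(my own illustration, not code from Lenya; the class and method names are
hypothetical). It assumes the container decoded the UTF-8 request bytes as
ISO-8859-1, which would explain both the garbled search term and why the
re-encoding workaround quoted below works:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Lenya or the servlet API.
class EncodingRoundTrip {

    // Simulates a container that receives UTF-8 bytes but, unaware of the
    // request encoding, decodes them as ISO-8859-1 (assumed behaviour).
    static String misdecode(String typed) {
        byte[] utf8 = typed.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // The workaround from the quoted message: recover the original bytes
    // and decode them with the encoding the browser actually used.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String typed = "\u03b5\u03bd\u03b1";   // three Greek letters
        String seen = misdecode(typed);         // six Latin-1 characters
        System.out.println(seen.length());      // prints 6
        System.out.println(repair(seen).equals(typed)); // prints true
    }
}
```

The round trip is lossless because ISO-8859-1 maps every byte value to exactly
one character. Telling the container the real encoding up front avoids the
need for such a repair step in the XSP.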

Manos

On Tuesday 20 September 2005 17:37, John Cherouvim wrote:
> After 6 hours of messing around with encodings I came up with a solution.
> Edit this file: pubs\{YOURPUB}\lenya\content\search\search-and-results.xsp
> Near line 180 you should see:
>   String query = <xsp-request:get-parameter name="query" default=""/>;
> After that, add:
>   query = new String(query.getBytes("ISO-8859-1"), "UTF-8");
>
> I fail to understand why, but it works.
> I cannot understand it because the encoding on the page is UTF-8 and not
> ISO-8859-1, so I thought that the contents of the forms would already be
> encoded in UTF-8. Someone told me that all HTML forms send their
> contents in ISO-8859-1.
>
> Anyway this fixes all my problems and makes lucene work with native chars.
> If anyone can enlighten us about this subject or find a better way to
> solve this problem, please do so :)
>
> Regards,
> Ioannis
>
> LORIN ronan yann wrote:
> >Hi,
> >
> >How can we change the encoding?
> >I have not been able to get anything other than UTF-8.
> >
> >Regards
> >-----Original Message-----
> >From: Felix Röthenbacher [mailto:[EMAIL PROTECTED]
> >Sent: Tuesday 20 September 2005 13:38
> >To: user@lenya.apache.org
> >Subject: Re: seach with non English characters in lenya
> >
> >
> >Hi Ioannis
> >
> >what is your encoding on the page with the search field?
> >
> >- Felix
> >
> >John Cherouvim wrote:
> >>Hello
> >>
> >>I cannot seem to fix the problem with searching in lenya. I
> >>know lenya uses lucene to index the documents and search them. I've
> >>created the indexes and also checked them with the program Luke (Lucene
> >>Index Toolbox). The documents are indexed correctly, Greek characters
> >>included, and these are searchable without problems through Luke. The
> >>problem is that when I enter a search term in my default publication's
> >>search field, all the non-English chars are transformed to
> >>weird symbols such as: ÎµÎ½Î±.
> >>
> >>Is there a known solution available?
> >>Maybe the UTF-8 encoding is not applied somewhere through the pipeline?
> >>
> >>Regards,
> >>Ioannis
> >>
> >>---------------------------------------------------------------------
> >>To unsubscribe, e-mail: [EMAIL PROTECTED]
> >>For additional commands, e-mail: [EMAIL PROTECTED]