[ 
https://issues.apache.org/jira/browse/SOLR-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544783#comment-13544783
 ] 

Uwe Schindler commented on SOLR-4265:
-------------------------------------

In Tomcat you can do this. Unfortunately, Tomcat's default is to always 
use ISO-8859-1 as the encoding for URL parameters and to use the submitted body 
encoding (if available) for the form data. Jetty does the same; see 
[http://grepcode.com/file/repo1.maven.org/maven2/org.eclipse.jetty/jetty-server/8.1.7.v20120910/org/eclipse/jetty/server/Request.java#Request.extractParameters%28%29].
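To illustrate the effect described above, here is a minimal sketch (not the code from the patch) of what happens when a client percent-encodes "même" as UTF-8 bytes and the server decodes them as ISO-8859-1. The class name UrlCharsetDemo and the helper decode are hypothetical; only java.net.URLDecoder from the JDK is used.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; it does not reproduce Tomcat's or Solr's actual code.
public class UrlCharsetDemo {

    // Percent-decodes a URL component using the given charset name.
    public static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // "même" with the e-circumflex sent as the two UTF-8 bytes C3 AA.
        String raw = "m%C3%AAme";

        // Decoding with the charset the client used restores the word.
        System.out.println(decode(raw, "UTF-8"));

        // Decoding as ISO-8859-1 maps each byte to its own character,
        // so the single accented letter becomes two unrelated characters.
        System.out.println(decode(raw, "ISO-8859-1"));
    }
}
```

The second call returns a five-character string instead of the four-character original, which is the kind of mismatch the issue reports.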

I will commit this patch later if nobody objects. We can open another issue to 
improve error handling on invalid encodings using commons-codec for URL 
decoding. This could also add the "ie=charset" request parameter (like Google, 
Bing, and others do) to specify the encoding of the URL request parameters.
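As a rough sketch of how such an "ie" parameter might work (this is an assumption about the proposal, not an existing Solr API), the raw query string could be scanned for "ie" first and the remaining parameters decoded with that charset. The class IeParamSketch, the method parse, and the UTF-8 default are all hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of an "ie=charset" request parameter; not Solr code.
public class IeParamSketch {

    // Decodes a raw (still percent-encoded) query string. If an "ie"
    // parameter is present, its value names the charset used for all
    // keys and values; otherwise UTF-8 is assumed.
    public static Map<String, String> parse(String rawQuery) {
        String[] pairs = rawQuery.split("&");
        String charset = "UTF-8"; // assumed default
        try {
            // First pass: look for the charset hint. Charset names are ASCII.
            for (String pair : pairs) {
                if (pair.startsWith("ie=")) {
                    charset = URLDecoder.decode(pair.substring(3), "US-ASCII");
                }
            }
            // Second pass: decode every key and value with that charset.
            Map<String, String> params = new LinkedHashMap<>();
            for (String pair : pairs) {
                if (pair.isEmpty()) {
                    continue;
                }
                int eq = pair.indexOf('=');
                String key = eq < 0 ? pair : pair.substring(0, eq);
                String value = eq < 0 ? "" : pair.substring(eq + 1);
                params.put(URLDecoder.decode(key, charset),
                           URLDecoder.decode(value, charset));
            }
            return params;
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }
}
```

With this sketch, "q=m%EAme&ie=ISO-8859-1" and "q=m%C3%AAme" both yield the same value for q. A real implementation would need to read the raw query string before the container decodes it, and to handle malformed escapes.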
                
> Fix decoding of GET/POST parameters for servlet containers with non-UTF-8 URL 
> parsing (Tomcat)
> ----------------------------------------------------------------------------------------------
>
>                 Key: SOLR-4265
>                 URL: https://issues.apache.org/jira/browse/SOLR-4265
>             Project: Solr
>          Issue Type: Bug
>          Components: web gui
>    Affects Versions: 4.0
>         Environment: Windows, but environment independent
>            Reporter: Alex Rocher
>            Assignee: Uwe Schindler
>         Attachments: CropperCapture[4].png, CropperCapture[5].png, 
> CropperCapture[6].png, SOLR-4265.patch, SOLR-4265.patch, SOLR-4265.patch, 
> SOLR-4265.patch, SOLR-4265.patch, SolrDispatchFilter.java.patch
>
>
> When you type an accented character (in French, for example) in the console query 
> tester, there's no charset conversion (servlet request charset conversion).
> E.g.: "même" is converted into its ISO-8859-1 representation ==> fail
> The reason: getCharacterEncoding from HTTPRequest is not tested. If it's 
> null, it will assume a UTF-8 encoding for the conversion.

