[ 
https://issues.apache.org/jira/browse/SOLR-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544643#comment-13544643
 ] 

Uwe Schindler commented on SOLR-4265:
-------------------------------------

Yonik: Internet Exploder is wrong; you are right. What we enforce here is that 
all incoming URLs are validly encoded UTF-8, which is also documented in the 
official Solr docs.
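
To illustrate what "enforcing valid UTF-8" can mean in practice, here is a minimal sketch of a strict percent-decoder. It is not the code from the patch: the class name StrictUtf8UrlDecoder and its decode method are hypothetical, and the only assumption is that malformed byte sequences should be reported as an error rather than silently replaced.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Solr.
public class StrictUtf8UrlDecoder {

    /** Percent-decodes s to raw bytes, then decodes those bytes as strict UTF-8. */
    public static String decode(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%') {
                if (i + 2 >= s.length()) {
                    throw new IllegalArgumentException("Truncated %-escape at index " + i);
                }
                int hi = Character.digit(s.charAt(i + 1), 16);
                int lo = Character.digit(s.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Invalid %-escape at index " + i);
                }
                bytes.write((hi << 4) | lo);
                i += 2; // skip the two hex digits
            } else if (c == '+') {
                bytes.write(' '); // form-style encoding of a space
            } else if (c < 0x80) {
                bytes.write(c);
            } else {
                throw new IllegalArgumentException("Unescaped non-ASCII character at index " + i);
            }
        }
        try {
            // REPORT makes the decoder throw instead of substituting U+FFFD.
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes.toByteArray()))
                    .toString();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("URL is not valid UTF-8", e);
        }
    }
}
```

With this sketch, "m%C3%AAme" (UTF-8) decodes to "même", while "m%EAme" (the ISO-8859-1 form a misconfigured browser would send) is rejected with an exception.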

The problem with IE is the standard one: Microsoft's sense of backwards 
compatibility. It should be easy to fix (although for some reason it did not 
work here): Internet Options -> Advanced (tab) -> International -> Send UTF-8-URLs

bq. The old behavior did not result in an HTTP error, but I actually think this 
new behavior is preferable!

I would also prefer the new behaviour: the user knows that he is wrong and will 
not start horrible discussions that end with Solr being the bad guy that gets 
Unicode wrong. The user gets a clear message that his URL was wrong, and the 
stack trace contains Jetty at the top!

I will make some small improvements to the patch soon and upload a new one. 
There is one inconsistency currently: when you have a POST request 
(form-encoded) but also add ?-parameters to the URL, the reported form-encoded 
charset (the one the client sends in the POST Content-Type) is used to decode 
everything, including the URL params. These should be decoded separately. I 
have to refactor some code to do this correctly [currently 
StandardRequestParser appends both the POST content and the URL query and 
decodes them in one pass].
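
As a rough sketch of the separate decoding described above (not the actual refactoring; the class SeparateParamDecoding and its methods are hypothetical names), the URL query could always be decoded as UTF-8, while the form body is decoded with the charset the client reported:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not Solr's StandardRequestParser.
public class SeparateParamDecoding {

    /** Splits raw on '&' and '=' and decodes each key and value with the given charset. */
    static void parseInto(Map<String, List<String>> out, String raw, String charset) {
        if (raw == null || raw.isEmpty()) {
            return;
        }
        for (String pair : raw.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String rawKey = eq < 0 ? pair : pair.substring(0, eq);
            String rawValue = eq < 0 ? "" : pair.substring(eq + 1);
            try {
                String key = URLDecoder.decode(rawKey, charset);
                String value = URLDecoder.decode(rawValue, charset);
                out.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
            } catch (UnsupportedEncodingException e) {
                throw new IllegalArgumentException("Unknown charset: " + charset, e);
            }
        }
    }

    /**
     * Decodes the URL query as UTF-8 and the form body with formCharset,
     * instead of concatenating both and decoding them in one pass.
     */
    public static Map<String, List<String>> parse(String queryString,
                                                  String formBody,
                                                  String formCharset) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        parseInto(params, queryString, "UTF-8");
        parseInto(params, formBody, formCharset);
        return params;
    }
}
```

For example, parse("q=m%C3%AAme", "fq=m%EAme", "ISO-8859-1") yields "même" for both q and fq, because each part is decoded with its own charset. Note that java.net.URLDecoder is lenient about malformed bytes, so this sketch shows only the separation, not strict validation.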
                
> Fix decoding of GET/POST parameters for servlet containers with non-UTF-8 URL 
> parsing (Tomcat)
> ----------------------------------------------------------------------------------------------
>
>                 Key: SOLR-4265
>                 URL: https://issues.apache.org/jira/browse/SOLR-4265
>             Project: Solr
>          Issue Type: Bug
>          Components: web gui
>    Affects Versions: 4.0
>         Environment: Windows, but environment independent
>            Reporter: Alex Rocher
>            Assignee: Uwe Schindler
>         Attachments: SOLR-4265.patch, SOLR-4265.patch, 
> SolrDispatchFilter.java.patch
>
>
> When you type an accented character (in French, for example) in the console 
> query tester, there's no charset conversion (servlet request charset conversion).
> E.g.: "même" is converted into its ISO-8859-1 representation ==> fail
> The reason: getCharacterEncoding from HTTPRequest is not tested. If it's 
> null, it will assume a UTF-8 encoding charset for the conversion.

