[
https://issues.apache.org/jira/browse/SOLR-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544767#comment-13544767
]
Uwe Schindler commented on SOLR-4265:
-------------------------------------
Dawid: If the URL were encoded in anything other than UTF-8, we would be back
at the problems we had before. So everything that comes in as part of the URL
has to be UTF-8 URL-encoded.
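To illustrate what "has to be UTF-8 URL-encoded" could mean in code, here is a minimal sketch of a strict decoder for one query-string component. The class and method names (StrictUrlDecoder, decodeUtf8) are hypothetical and this is not the code from the attached patches; it only assumes that malformed input should be rejected rather than silently repaired.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration only; not Solr's actual implementation. */
final class StrictUrlDecoder {

  /** Returns the value of an ASCII hex digit, or -1 if c is not one. */
  private static int hex(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
  }

  /** Decodes one percent-encoded query component as UTF-8, rejecting malformed input. */
  static String decodeUtf8(String encoded) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    for (int i = 0; i < encoded.length(); i++) {
      char c = encoded.charAt(i);
      if (c == '%') {
        if (i + 2 >= encoded.length()) {
          throw new IllegalArgumentException("Truncated % escape");
        }
        int hi = hex(encoded.charAt(i + 1));
        int lo = hex(encoded.charAt(i + 2));
        if (hi < 0 || lo < 0) {
          throw new IllegalArgumentException("Invalid % escape");
        }
        bytes.write((hi << 4) | lo);
        i += 2; // skip the two hex digits just consumed
      } else if (c == '+') {
        bytes.write(' '); // form-style encoding of a space
      } else if (c < 0x80) {
        bytes.write(c);
      } else {
        // Raw non-ASCII characters are not allowed in a valid URL.
        throw new IllegalArgumentException("Non-ASCII character must be percent-encoded");
      }
    }
    try {
      // REPORT makes the decoder throw instead of inserting U+FFFD.
      return StandardCharsets.UTF_8.newDecoder()
          .onMalformedInput(CodingErrorAction.REPORT)
          .onUnmappableCharacter(CodingErrorAction.REPORT)
          .decode(ByteBuffer.wrap(bytes.toByteArray()))
          .toString();
    } catch (CharacterCodingException e) {
      throw new IllegalArgumentException("URL is not valid UTF-8", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(decodeUtf8("m%C3%AAme")); // UTF-8 bytes for "m\u00EAme"
    try {
      decodeUtf8("m%EAme"); // ISO-8859-1 byte for the same character
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

A caller could map the IllegalArgumentException to an HTTP 400 response; whether that is the right reaction is a design choice, not something this sketch decides.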
The additional form data in a POST has to be handled the way the standard
requires: you have to respect the encoding given in the Content-Type. In
practice no browser actually sends it, so this also always defaults to UTF-8.
But *if* somebody sets the encoding, it must be respected.
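As a rough sketch of that rule, the charset for the POST body could be picked from the Content-Type header like this. The class and method names (ContentTypeCharset, charsetOf) are made up for illustration, and the parsing is deliberately simplified: it assumes a single header value with ";"-separated parameters.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Hypothetical helper for illustration only; not Solr's actual implementation. */
final class ContentTypeCharset {

  /**
   * Returns the charset named in a Content-Type header value, or UTF-8 if the
   * header is missing or carries no charset parameter.
   */
  static Charset charsetOf(String contentType) {
    if (contentType != null) {
      String[] parts = contentType.split(";");
      // parts[0] is the media type itself; parameters start at index 1.
      for (int i = 1; i < parts.length; i++) {
        String param = parts[i].trim();
        if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
          String name = param.substring("charset=".length()).trim();
          if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
            name = name.substring(1, name.length() - 1); // strip optional quotes
          }
          // Throws an IllegalArgumentException subclass for bad or unknown names.
          return Charset.forName(name);
        }
      }
    }
    return StandardCharsets.UTF_8; // nothing declared: default to UTF-8
  }

  public static void main(String[] args) {
    System.out.println(charsetOf("application/x-www-form-urlencoded"));
    System.out.println(charsetOf("application/x-www-form-urlencoded; charset=ISO-8859-1"));
  }
}
```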
For URLs you have no chance, as the encoding cannot be submitted with the HTTP
request, unless HTTP/1.2 defines a new header that defines encoding for the URL
:-)
The Polish example above is broken altogether, as the URL in question is not a
valid URL. It also has the same problem: it is not supported in the same way
across browsers. The user will get Jetty errors like those discussed above
(with IE) or otherwise broken behaviour. There is nothing we can do.
{quote}
- try to get character encoding from HTTP header; if not present, assume UTF-8
- decode the URI and the body (if POST) using the above encoding. If decoder
failures occur, return HTTP BAD_REQUEST.
{quote}
The body encoding says nothing about the URL encoding. Also, you cannot specify
it for GET requests. So enforcing that URLs are UTF-8 is consistent.
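To make the distinction concrete, here is a small sketch in which the same parameter value arrives once in the URL and once in a form body. The names (QueryVersusBody, decodeQueryValue, decodeBodyValue) are hypothetical. It uses java.net.URLDecoder for brevity, which, as far as I know, substitutes a replacement character for malformed bytes instead of failing, so a real implementation that wants to reject bad input would need a stricter decoder.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

/** Hypothetical helper for illustration only; not Solr's actual implementation. */
final class QueryVersusBody {

  private static String decode(String encoded, String charsetName) {
    try {
      return URLDecoder.decode(encoded, charsetName);
    } catch (UnsupportedEncodingException e) {
      throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
    }
  }

  /** URL parameters: always UTF-8, whatever the body declares. */
  static String decodeQueryValue(String encoded) {
    return decode(encoded, "UTF-8");
  }

  /** Form body parameters: the declared charset, or UTF-8 if none was declared. */
  static String decodeBodyValue(String encoded, String declaredCharset) {
    return decode(encoded, declaredCharset == null ? "UTF-8" : declaredCharset);
  }

  public static void main(String[] args) {
    // Same word in both places, but different bytes on the wire.
    System.out.println(decodeQueryValue("m%C3%AAme"));
    System.out.println(decodeBodyValue("m%EAme", "ISO-8859-1"));
  }
}
```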
Uwe
> Fix decoding of GET/POST parameters for servlet containers with non-UTF-8 URL
> parsing (Tomcat)
> ----------------------------------------------------------------------------------------------
>
> Key: SOLR-4265
> URL: https://issues.apache.org/jira/browse/SOLR-4265
> Project: Solr
> Issue Type: Bug
> Components: web gui
> Affects Versions: 4.0
> Environment: Windows but, environment independent
> Reporter: Alex Rocher
> Assignee: Uwe Schindler
> Attachments: CropperCapture[4].png, CropperCapture[5].png,
> CropperCapture[6].png, SOLR-4265.patch, SOLR-4265.patch, SOLR-4265.patch,
> SOLR-4265.patch, SolrDispatchFilter.java.patch
>
>
> When you type an accented character (in French, for example) in the console
> query tester, there's no charset conversion (servlet request charset conversion).
> E.g.: "même" is converted into its ISO-8859-1 representation ==> fail
> The reason: getCharacterEncoding from HTTPRequest is not tested. If it's
> null, it will assume UTF-8 as the charset to convert from.