[ 
https://issues.apache.org/jira/browse/SOLR-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544770#comment-13544770
 ] 

Dawid Weiss commented on SOLR-4265:
-----------------------------------

I disagree with you here. The example is fine because browsers will escape the 
action URI for you; they'll just use the encoding of the originating HTML page, 
whatever it was. So it's a perfectly valid HTTP request and a perfectly valid 
query. The fact that browsers don't send any information about which encoding 
they used to do so is of no relevance. You can try it; the form above would be 
sent (from a Windows-1250 encoded Web page) as:
{code}
$ nc -l 8081
POST /echo.jsp?abc=%B3%F3d%9F HTTP/1.1
Host: localhost:8081
Connection: keep-alive
Content-Length: 15
{code}
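
The escaped bytes in that request line can be reproduced outside a browser. The following is a minimal sketch, not part of the original report: the class name FormUrlEncodingDemo and its helper methods are made up for illustration, and it assumes Java 10 or newer (for the Charset overloads of URLEncoder/URLDecoder) and a JDK that ships the windows-1250 charset.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;

// Hypothetical demo class; the names are illustrative only.
public class FormUrlEncodingDemo {
    // The query value from the form, written with Unicode escapes so the
    // encoding of this source file does not matter.
    static final String VALUE = "\u0142\u00F3d\u017A";

    // Percent-encode the value using the given charset, roughly what a
    // browser does with the page encoding when it escapes the action URI.
    static String encode(String value, String charsetName) {
        return URLEncoder.encode(value, Charset.forName(charsetName));
    }

    // Percent-decode using the charset the receiver *assumes* was used.
    static String decode(String escaped, String charsetName) {
        return URLDecoder.decode(escaped, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        String escaped = encode(VALUE, "windows-1250");
        System.out.println(escaped); // %B3%F3d%9F, as in the request line

        // Decoding with the same charset restores the original text ...
        System.out.println(decode(escaped, "windows-1250").equals(VALUE));
        // ... while assuming UTF-8 on the receiving side does not.
        System.out.println(decode(escaped, "UTF-8").equals(VALUE));
    }
}
```

The request itself is well formed either way; only the receiver's guess about the charset decides whether the text survives.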

The URL is invalid in the HTML source but browsers will "fix" it for you 
(url-escape) and many people accept it as something ordinary.

bq. The body encoding says nothing about the URL encoding.

It does in the context of browsers. See my example above or try it yourself:
{code}
<%@ page language="java" contentType="text/html; charset=Cp1250" 
pageEncoding="UTF-8"%>
<html>
  <body>
    <form action="http://localhost:8081/echo.jsp?abc=łódź" method="post" 
          enctype="application/x-www-form-urlencoded">
      <input type="text" name="blah"><br>
      <input type="submit" value="Submit">
    </form>    
  </body>
</html>
{code}
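
Because the request carries no hint of which charset the browser used, a receiver can only look at the bytes. As a rough illustration (an assumption of this sketch, not what the attached patches do), a strict UTF-8 decoder can at least tell that the windows-1250 bytes from the example are not valid UTF-8. The class name StrictUtf8Check is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; illustrative only.
public class StrictUtf8Check {
    // Returns true only if the bytes form a well-formed UTF-8 sequence.
    // REPORT makes the decoder throw instead of substituting U+FFFD.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Bytes after unescaping %B3%F3d%9F (windows-1250 page).
        byte[] cp1250 = {(byte) 0xB3, (byte) 0xF3, 'd', (byte) 0x9F};
        // The same text as a UTF-8 page would have sent it.
        byte[] utf8 = {(byte) 0xC5, (byte) 0x82, (byte) 0xC3, (byte) 0xB3,
                       'd', (byte) 0xC5, (byte) 0xBA};

        System.out.println(isValidUtf8(cp1250)); // false
        System.out.println(isValidUtf8(utf8));   // true
    }
}
```

Such a check can only reject input that is not UTF-8; it cannot tell which single-byte codepage was actually used.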
                
> Fix decoding of GET/POST parameters for servlet containers with non-UTF-8 URL 
> parsing (Tomcat)
> ----------------------------------------------------------------------------------------------
>
>                 Key: SOLR-4265
>                 URL: https://issues.apache.org/jira/browse/SOLR-4265
>             Project: Solr
>          Issue Type: Bug
>          Components: web gui
>    Affects Versions: 4.0
>         Environment: Windows, but environment independent
>            Reporter: Alex Rocher
>            Assignee: Uwe Schindler
>         Attachments: CropperCapture[4].png, CropperCapture[5].png, 
> CropperCapture[6].png, SOLR-4265.patch, SOLR-4265.patch, SOLR-4265.patch, 
> SOLR-4265.patch, SOLR-4265.patch, SolrDispatchFilter.java.patch
>
>
> When you type an accented character (in French, for example) in the console 
> query tester, there's no charset conversion (servlet request charset conversion).
> E.g.: "même" is converted into its ISO-8859-1 representation ==> fail
> The reason: getCharacterEncoding from HTTPRequest is not tested. If it's 
> null, it will assume UTF-8 encoding for the conversion.

