[
https://issues.apache.org/jira/browse/SOLR-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544735#comment-13544735
]
Uwe Schindler commented on SOLR-4265:
-------------------------------------
I am currently investigating how to be stricter with parameter encoding:
Currently it is not an error if the %-encoded terms are not valid UTF-8
(the JDK's URLDecoder replaces all invalid chars with ?). To make this stricter
and fail correctly (instead of silently doing the wrong thing), we could use
commons-codec's URLCodec (we already use that library) to do a binary URL decoding:
http://commons.apache.org/codec/api-release/org/apache/commons/codec/net/URLCodec.html
URLCodec.decodeUrl takes a byte[] and returns a byte[]. Using this method we have
full flexibility in throwing encoding errors. In that case we can also pass the
byte[] contents from the POST stream directly! Should we do this or not? The
current approach is lenient, like webservers that also accept almost any wrongly
encoded %XX stuff.
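To illustrate the idea, here is a minimal sketch of strict decoding using only
the JDK. It is not the attached patch: the class and method names
(StrictUrlDecoder, percentDecode, decode) are hypothetical, and a small
hand-written percent-decoder stands in for URLCodec.decodeUrl so the example
has no dependencies. The point is the second step, where a CharsetDecoder
configured with CodingErrorAction.REPORT rejects byte sequences that are not
valid UTF-8 instead of silently replacing them.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only (not part of Solr).
public class StrictUrlDecoder {

    // Returns the value of an ASCII hex digit, or -1 if c is not one.
    private static int hex(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // Step 1: binary URL decoding. Turns %XX escapes and '+' into raw bytes
    // and fails on truncated or non-hex escapes and on non-ASCII input.
    static byte[] percentDecode(String encoded) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(encoded.length());
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c == '+') {
                out.write(' ');
            } else if (c == '%') {
                if (i + 2 >= encoded.length()) {
                    throw new IllegalArgumentException("Truncated %XX escape at index " + i);
                }
                int hi = hex(encoded.charAt(i + 1));
                int lo = hex(encoded.charAt(i + 2));
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Invalid %XX escape at index " + i);
                }
                out.write((hi << 4) | lo);
                i += 2; // skip the two hex digits just consumed
            } else if (c < 0x80) {
                out.write(c);
            } else {
                throw new IllegalArgumentException("Non-ASCII character at index " + i);
            }
        }
        return out.toByteArray();
    }

    // Step 2: strict UTF-8 decoding. REPORT makes the decoder throw on
    // malformed input instead of substituting a replacement character.
    static String decode(String encoded) {
        byte[] raw = percentDecode(encoded);
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(raw))
                    .toString();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("Parameter is not valid UTF-8: " + encoded, e);
        }
    }
}
```

With this sketch, decode("m%C3%AAme") yields "même", while decode("m%EAme")
(the ISO-8859-1 encoding of the same word) throws IllegalArgumentException
rather than returning a string with a replacement character. Because step 2
works on a byte[], the same strict decoder could be fed bytes read from a POST
body.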
> Fix decoding of GET/POST parameters for servlet containers with non-UTF-8 URL
> parsing (Tomcat)
> ----------------------------------------------------------------------------------------------
>
> Key: SOLR-4265
> URL: https://issues.apache.org/jira/browse/SOLR-4265
> Project: Solr
> Issue Type: Bug
> Components: web gui
> Affects Versions: 4.0
> Environment: Windows but, environment independent
> Reporter: Alex Rocher
> Assignee: Uwe Schindler
> Attachments: CropperCapture[4].png, CropperCapture[5].png,
> CropperCapture[6].png, SOLR-4265.patch, SOLR-4265.patch, SOLR-4265.patch,
> SolrDispatchFilter.java.patch
>
>
> When you type an accented character (in French, for example) in the console query
> tester, there's no charset conversion (servlet request charset conversion)
> E.g.: "même" is converted into its ISO-8859-1 representation ==> fail
> The reason: getCharacterEncoding from HTTPRequest is not tested. If it's
> null, it will assume a UTF-8 charset for the conversion.