[ https://issues.apache.org/jira/browse/SOLR-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Uwe Schindler resolved SOLR-4283.
---------------------------------

    Resolution: Fixed

Committed to trunk and 4.x.

A next step would be to make the encoding of the GET URLs configurable (using the de facto standard "&ie=charset" URL parameter, as used by most REST web services of major search engines).

> Improve URL decoding (followup of SOLR-4265)
> --------------------------------------------
>
>                 Key: SOLR-4283
>                 URL: https://issues.apache.org/jira/browse/SOLR-4283
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 4.0
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 4.1, 5.0
>
>         Attachments: index.jsp, request.http, SOLR-4283.patch,
> SOLR-4283.patch, SOLR-4283.patch, SOLR-4283.patch, SOLR-4283.patch
>
>
> Followup of SOLR-4265:
> SOLR-4265 has 2 problems:
> - It reads the whole InputStream into a String, and this one can be big. This
> wastes memory, especially when the query string from the POSTed form data is
> near the 2 megabyte limit. The String is then split and packed into a big Map.
> - It does not report corrupt UTF-8.
> The attached patch will do 2 things:
> - The decoding of the POSTed form data is done on the ServletInputStream,
> directly parsing the bytes (not chars). Key/value pairs are extracted and
> %-decoded to byte[] on the fly. URL parameters from getQueryString() are
> parsed with the same code, using a ByteArrayInputStream on the original String
> interpreted as UTF-8 (this is a hack, because the Servlet API does not give
> back the original bytes from the HTTP request). To conform to the standards,
> the query String should be interpreted as US-ASCII, but with this approach,
> UTF-8 from the HTTP request that is not fully escaped survives.
> - The byte[] key/value pairs are converted to Strings using CharsetDecoder.
> This will be memory efficient and will report incorrectly escaped form data,
> so people will no longer complain if searches hit no results or similar.
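For illustration only, the byte-level approach described in the issue could look roughly like the following Java sketch. It is not taken from the attached SOLR-4283.patch: the class name FormDataSketch and its methods are hypothetical, it assumes UTF-8 form data, and it leaves out the 2 megabyte limit and all servlet plumbing. It only shows the two ideas from the description: %-decoding key/value pairs to bytes while reading the stream, and converting those bytes with a CharsetDecoder that reports corrupt UTF-8 instead of silently replacing it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not the code from SOLR-4283.patch.
class FormDataSketch {

    /** Parses application/x-www-form-urlencoded data directly from the bytes of the stream. */
    static Map<String, List<String>> parse(InputStream in) throws IOException {
        // Strict decoder: malformed UTF-8 raises CharacterCodingException
        // instead of being replaced with U+FFFD.
        CharsetDecoder utf8 = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        Map<String, List<String>> params = new LinkedHashMap<>();
        ByteArrayOutputStream key = new ByteArrayOutputStream();
        ByteArrayOutputStream value = new ByteArrayOutputStream();
        ByteArrayOutputStream current = key; // buffer currently being filled
        while (true) {
            int b = in.read();
            if (b == '&' || b == -1) {
                // End of a pair: convert both byte buffers and store the result.
                if (key.size() > 0 || value.size() > 0) {
                    params.computeIfAbsent(decode(utf8, key), k -> new ArrayList<>())
                          .add(decode(utf8, value));
                }
                if (b == -1) {
                    return params;
                }
                key.reset();
                value.reset();
                current = key;
            } else if (b == '=' && current == key) {
                current = value; // the first '=' switches from key to value
            } else if (b == '+') {
                current.write(' '); // form encoding uses '+' for a space
            } else if (b == '%') {
                // Two hex digits follow; the high nibble is read first.
                current.write((hexDigit(in.read()) << 4) | hexDigit(in.read()));
            } else {
                current.write(b);
            }
        }
    }

    private static int hexDigit(int b) throws IOException {
        int d = (b == -1) ? -1 : Character.digit((char) b, 16);
        if (d < 0) {
            throw new IOException("Invalid %-escape in form data");
        }
        return d;
    }

    private static String decode(CharsetDecoder decoder, ByteArrayOutputStream bytes)
            throws CharacterCodingException {
        return decoder.decode(ByteBuffer.wrap(bytes.toByteArray())).toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] body = "q=caf%C3%A9&fq=a+b".getBytes(StandardCharsets.US_ASCII);
        System.out.println(parse(new ByteArrayInputStream(body)));
        try {
            // %C3 alone is a truncated two-byte UTF-8 sequence.
            parse(new ByteArrayInputStream("q=%C3".getBytes(StandardCharsets.US_ASCII)));
        } catch (CharacterCodingException e) {
            System.out.println("Rejected corrupt UTF-8: " + e);
        }
    }
}
```

The sketch reads one byte at a time, so a caller would probably want to wrap the request stream in a BufferedInputStream. A query string obtained from getQueryString() could be passed through the same method by wrapping its UTF-8 bytes in a ByteArrayInputStream, as the description suggests.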