: Other things might use POST for querying though.  Perhaps they can all
: set a charset while doing so.
well, i can think of a couple of scenarios...

 1) POST multipart/* to either /select or the new style URLs ... the
    browsers should put a content-type with a charset on each part; the
    ContentStream parsing code Ryan wrote should do the right thing, so we
    only have to rely on the Servlet Container to do the right thing for
    the parts containing servlet request params -- hopefully they use the
    charset properly.
 2) POST application/x-www-form-urlencoded to the new style URLs ... see
    below.
 3) POST anything else to the new style URLs ... parsed as a raw
    ContentStream, charset taken from the content-type -- should work
    fine.
 4) POST application/x-www-form-urlencoded to the current /select ... see
    below.
 5) POST */* to /update ... it currently ignores the content-type and
    assumes UTF-8 regardless of servlet container config ... we could
    theoretically make it look at the content-type only for the charset
    and still ignore the meat of the content-type.
 6) GET anything ... see below.

"see below" is a situation where i don't think we can glean anything from
the request itself -- we have to make an assumption based on config.

for #2 and #4 we could conceivably have a solrconfig.xml option indicating
what charset Solr should assume, and then we can (apparently) use
HttpServletRequest.setCharacterEncoding to specify that's the charset we
want the servlet container to use when parsing the input -- but i don't
think this helps case #6 -- i can't find any portable way to tell the
servlet container how to parse the URL. so if we have to rely on
documentation to instruct people on how to deal with that, we might as
well do the same thing for #2 and #4 (let it be in the servlet container
config instead of the solrconfig).

(we should of course test all of these scenarios ... i'm just guessing
that #1, #3 and #5 all work okay)

-Hoss
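
A rough sketch of the two charset decisions discussed above, using only the JDK. This is not Solr code: the class and method names (CharsetSketch, charsetOf, decodeParam) are made up for illustration, and the header parsing is deliberately naive (it would mishandle a quoted parameter value containing a ';'). charsetOf shows what case #5 might look like if /update read only the charset from the content-type and kept UTF-8 as the fallback; decodeParam shows why cases #2, #4 and #6 need an assumed charset, since the same percent-encoded bytes decode to different strings depending on it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class CharsetSketch {

    // Returns the charset named in a Content-Type header value,
    // or the fallback when the header is missing, has no charset
    // parameter, or names a charset this JVM does not know.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        // Parameters follow the media type, separated by ';'.
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String name = param.substring(0, eq).trim();
            if (!name.equalsIgnoreCase("charset")) {
                continue;
            }
            String value = param.substring(eq + 1).trim();
            // Strip optional surrounding double quotes.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            try {
                return Charset.forName(value);
            } catch (IllegalArgumentException e) {
                // Illegal or unsupported charset name: use the configured default.
                return fallback;
            }
        }
        return fallback;
    }

    // Decodes one urlencoded parameter value under an assumed charset.
    // Nothing in the encoded text itself says which charset is correct.
    static String decodeParam(String encoded, Charset assumed) {
        try {
            return URLDecoder.decode(encoded, assumed.name());
        } catch (UnsupportedEncodingException e) {
            // Cannot happen for a Charset the JVM already resolved.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Case #5: charset taken from the header, UTF-8 otherwise.
        System.out.println(charsetOf("text/xml; charset=ISO-8859-1", StandardCharsets.UTF_8));
        System.out.println(charsetOf("text/xml", StandardCharsets.UTF_8));

        // Cases #2, #4, #6: the same bytes, two different readings.
        String raw = "caf%C3%A9";
        String asUtf8 = decodeParam(raw, StandardCharsets.UTF_8);
        String asLatin1 = decodeParam(raw, StandardCharsets.ISO_8859_1);
        System.out.println(asUtf8.equals(asLatin1)); // false
    }
}
```

Inside a servlet container the decoding in decodeParam is done by the container itself, which is why the choice ends up in either HttpServletRequest.setCharacterEncoding (POST bodies) or the container's own configuration (URLs).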
