: > Content-type: application/x-www-form-urlencoded; charset=utf-8
: >
: > ...picking the charset based on the charset of the page containing the
: > form (i assume you tested and verified this isn't happening?)
:
: Yep, FireFox2.
: I'd serve the page, do a search, kill the solr server, run nc -l -p
: 8983, and run the search again. The body was encoded correctly, but
: just no charset info.

yeah ... the google cache of
"ppewww.physics.gla.ac.uk/~flavell/charset/form-i18n.html" (URL
currently 403) suggests that browsers don't do this because a lot of old
CGI parsing libraries can't handle it.

RFC2070 section 5.2 suggests that this is one method that can be used --
but says "The best solution is to use the "multipart/form-data" media
type" ... perhaps if we change the forms to use that explicitly things
would work.

actually ... all of the existing forms we have are GET -- so it's kind
of a moot issue isn't it? (i see there's a separate thread about Resin
and UTF-8 in URLs - multipart/form-data wouldn't really help in that
case.)

Did you see my other comments, quoting what seemed to be a Resin FAQ on
this, which mentioned "The character-encoding tag in the resin.conf"?
... it sounds like that's what we should recommend to people using Resin
... i suspect they wouldn't even *have* to use UTF-8 .. they just have
to set it to whatever encoding they want to use when POSTing queries.

if setting character-encoding in the <web-app> tag works for URL encoded
values, putting this in the resin.conf will probably work for that too.

-Hoss
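
For anyone following along, here is a rough sketch of why the missing charset parameter matters. It is not Solr or Resin code, just a Python illustration. The sample body, the helper name decode_form, and the ISO-8859-1 fallback are all assumptions made for the example; a real container may pick a different default.

```python
from urllib.parse import parse_qs

# Hypothetical raw POST body, as nc would show it: the UTF-8 bytes of
# "café", percent-encoded by the browser.
raw_body = "q=caf%C3%A9"


def decode_form(body, content_type):
    """Decode a urlencoded body using the charset named in Content-Type.

    Falls back to ISO-8859-1 when no charset parameter is present
    (an assumed default, chosen only to illustrate the failure mode).
    """
    charset = "iso-8859-1"
    # Everything after the first ';' is a header parameter, e.g. charset=utf-8
    for param in content_type.split(";")[1:]:
        name, _, value = param.strip().partition("=")
        if name.lower() == "charset" and value:
            charset = value.strip('"')
    return parse_qs(body, encoding=charset)


with_charset = decode_form(
    raw_body, "application/x-www-form-urlencoded; charset=utf-8")
without_charset = decode_form(
    raw_body, "application/x-www-form-urlencoded")

print(with_charset["q"][0])     # café
print(without_charset["q"][0])  # cafÃ©
```

The bytes on the wire are identical in both cases; only the receiver's guess about the encoding differs, which is why a server-side setting that fixes that guess would be enough.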
