On Mon, Mar 16, 2009 at 12:30 AM, Charles F. Munat <c...@munat.com> wrote:
>
> Everything is being served UTF-8... that means the header is UTF-8, the
> xml process directive is UTF-8, and there's a content type meta tag
> setting it to UTF-8, and I can confirm in Firefox via both the
> View > Character Encoding menu and Firebug that the page is being served as
> UTF-8, so even without adding a character-encoding attribute to the
> forms, the code should arrive at the server as UTF-8.
>
> Furthermore, I used TextMate to create a file with a c-cedilla in it,
> saved it, reopened it as UTF-8 *to be certain*, then copied and pasted
> the character into my form in Firefox.
>
> Somewhere between clicking on Save and that character showing up in the
> database, it gets interpreted as Latin1 (despite being UTF-8) and then
> converted into the wrong UTF-8 characters.
>
> Is anyone running Tomcat or some other servlet container and can test this?

Yeah, if you're running Tomcat and not forcing Tomcat to interpret form data as UTF-8, that's the problem.

>
> Chas.
>
> marius d. wrote:
> > just out of curiosity are you setting manually in the HTTP header
> >
> > Content-Type: text/html; charset=UTF-8
> >
> > and it's still broken?
> >
> > P.S.
> > Sometimes HTTP equiv from HTML header just doesn't do the trick.
> >
> > Br's,
> > Marius
> >
> > On Mar 15, 11:20 pm, Derek Chen-Becker <dchenbec...@gmail.com> wrote:
> >> Crapola:
> >>
> >> http://jira.codehaus.org/browse/JETTY-958
> >>
> >> I think I've confirmed that this is not lift. I added a non-lift input text
> >> element to an existing lift form:
> >>
> >> <input name="testthis" type="text" />
> >>
> >> Then I use the following code, which I believe should be getting direct
> >> access to Jetty's HttpServletRequest instance:
> >>
> >> Log.info("testthis = " + (S.request.map({r =>
> >> r.request.getParameter("testthis")}) openOr "not found!"))
> >>
> >> And when I put a cedilla in, I get:
> >>
> >> INFO - testthis = ç
> >>
> >> Can you confirm that you're using Jetty?
> >> I also tried the flags listed in the JIRA ticket:
> >>
> >> -Dorg.mortbay.util.URI.charset=utf-8 -Dfile.encoding=UTF-8
> >>
> >> But they didn't seem to do anything (it didn't crash, though). I'm not sure
> >> if I specified those correctly for use with the Maven jetty:run command
> >> line:
> >>
> >> mvn -Djetty.port=9090 -Dorg.mortbay.util.URI.charset=utf-8 -Dfile.encoding=UTF-8 jetty:run
> >>
> >> Anyways, this doesn't look to be Lift's fault. I know that's not a great
> >> answer. I'm trying to think of whether there's a clean, simple way to "undo"
> >> the bogus transform but I don't know enough about charset handling. One more
> >> interesting thing is that if I change my log code to:
> >>
> >> Log.info("testthis = " + (S.request.map({r => r.request.getCharacterEncoding
> >> + r.request.getParameter("testthis")}) openOr "not found!"))
> >>
> >> I get:
> >>
> >> INFO - testthis = nullç
> >>
> >> Which seems to indicate that the character encoding for the POST isn't being
> >> set. I tried overriding it:
> >>
> >> S.request.foreach{ r => r.request.setCharacterEncoding("UTF-8")}
> >>
> >> and that seems to have absolutely no effect (in fact, I get the same "null"
> >> log message).
> >>
> >> Derek
> >>
> >> On Sun, Mar 15, 2009 at 3:08 PM, Charles F. Munat <c...@munat.com> wrote:
> >>
> >>> Marc Boschma wrote:
> >>>>> When I use &ccedil; instead, the problem is that it is *not* converted
> >>>>> to ç as it goes into the database, and then on the way out the XML
> >>>>> interpreter does not recognize it as a character entity reference
> >>>>> and so converts the & to &amp;.
> >>>> I think this is due to using the standard Scala XML load functions
> >>>> rather than the lift XML parser. From memory I don't think the
> >>>> standard parser recognises that many named entities. ie. does &#x00E7;
> >>>> work instead of &ccedil; ? If so then that is probably what is
> >>>> happening on this issue.
> >>> &#x00E7; goes into the database unchanged, but comes back out as
> >>> &amp;#x00E7;. For that matter, &amp; in the DB comes out as &amp;amp; on
> >>> the page.
> >>> This is actually fine with me. It means that my users can just type &,
> >>> <, > etc. and they will appear on the page that way (rather than being
> >>> interpreted as HTML tags). It's safer, too. There is no way for them to
> >>> insert HTML, especially script tags.
> >>> So really, the only problem I have is that I need to be able to type a ç
> >>> and have it still a ç when it gets to the database.
> >>> Chas.

--
Lift, the simply functional web framework http://liftweb.net
Beginning Scala http://www.apress.com/book/view/1430219890
Follow me: http://twitter.com/dpp
Git some: http://github.com/dpp
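
P.S. For what it's worth, the "ç" in Derek's log can be reproduced outside any servlet container. Below is a minimal sketch, assuming the container decoded the UTF-8 form bytes as ISO-8859-1 (which matches the "interpreted as Latin1" report quoted above). `MojibakeDemo`, `misdecode` and `repair` are hypothetical names for illustration, not Lift or Jetty API. It uses only `java.nio.charset` from the JDK.

```scala
import java.nio.charset.StandardCharsets.{ISO_8859_1, UTF_8}

object MojibakeDemo {
  // Simulates the suspected bug: the browser sends UTF-8 bytes,
  // but they are decoded as ISO-8859-1 (Latin1).
  def misdecode(original: String): String =
    new String(original.getBytes(UTF_8), ISO_8859_1)

  // Hypothetical "undo": recover the raw bytes via ISO-8859-1 and
  // decode them as UTF-8. Only valid if the wrong decoding really
  // was ISO-8859-1 (an assumption, not confirmed in the thread).
  def repair(garbled: String): String =
    new String(garbled.getBytes(ISO_8859_1), UTF_8)

  def main(args: Array[String]): Unit = {
    val typed = "\u00E7"             // ç as typed into the form
    val received = misdecode(typed)  // two characters: U+00C3, U+00A7
    println(received.length)         // prints 2
    println(repair(received) == typed)
  }
}
```

This only repairs the string after the fact, and it would not round-trip if the wrong charset were something other than ISO-8859-1. Getting the container to decode form data as UTF-8 in the first place is the cleaner route.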