On Mon, Mar 16, 2009 at 12:30 AM, Charles F. Munat <c...@munat.com> wrote:

>
> Everything is being served UTF-8... that means the header is UTF-8, the
> XML declaration is UTF-8, and there's a content type meta tag
> setting it to UTF-8, and I can confirm in Firefox via both the View >
> Character Encoding menu and Firebug that the page is being served as
> UTF-8, so even without adding an accept-charset attribute to the
> forms, the form data should arrive at the server as UTF-8.
>
> Furthermore, I used TextMate to create a file with a c-cedilla in it,
> saved it, reopened it as UTF-8 *to be certain*, then copied and pasted
> the character into my form in Firefox.
>
> Somewhere between clicking on Save and that character showing up in the
> database, it gets interpreted as Latin1 (despite being UTF-8) and then
> converted into the wrong UTF-8 characters.
>
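
That symptom is what you'd expect when the UTF-8 bytes sent by the browser
are decoded as ISO-8859-1, which the servlet spec uses as the default when
the request declares no encoding. Here is a minimal sketch of the
mechanism, using only the JDK. The object and method names are made up
for illustration, and the repair step is a workaround that only holds if
the bad decode really was ISO-8859-1.

```scala
import java.nio.charset.StandardCharsets.{ISO_8859_1, UTF_8}

// Hypothetical demo object; nothing here comes from Lift, Jetty, or Tomcat.
object MojibakeDemo {
  // Simulates the container: the browser's UTF-8 bytes are decoded as
  // ISO-8859-1, so every byte becomes its own character.
  def misdecode(original: String): String =
    new String(original.getBytes(UTF_8), ISO_8859_1)

  // Reverses the bad decode: recover the raw bytes, then decode as UTF-8.
  def repair(garbled: String): String =
    new String(garbled.getBytes(ISO_8859_1), UTF_8)

  def main(args: Array[String]): Unit = {
    val cedilla = "\u00e7"            // c-cedilla, bytes C3 A7 in UTF-8
    val garbled = misdecode(cedilla)  // two chars: U+00C3 and U+00A7
    println(garbled.length)
    println(repair(garbled) == cedilla)
  }
}
```

Repairing after the fact is fragile, so it is better to make the container
decode correctly in the first place.
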
> Is anyone running Tomcat or some other servlet container and can test this?


Yeah, if you're running Tomcat and not forcing Tomcat to interpret form data
as UTF-8, that's the problem.
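
One way to force it, as a rough sketch rather than a tested fix: a servlet
filter that sets the request encoding before anything reads the
parameters. The Servlet API documents that setCharacterEncoding has no
effect once parameters have been read, which would explain why calling it
from inside the snippet did nothing. The class name below is hypothetical,
and the filter would have to be mapped in web.xml ahead of the Lift
filter.

```scala
import javax.servlet.{Filter, FilterChain, FilterConfig, ServletRequest, ServletResponse}

// Hypothetical filter; requires the servlet API on the classpath.
class Utf8RequestFilter extends Filter {
  override def init(config: FilterConfig): Unit = ()

  override def doFilter(req: ServletRequest, res: ServletResponse,
                        chain: FilterChain): Unit = {
    // Only supply a charset when the client did not declare one.
    if (req.getCharacterEncoding == null) req.setCharacterEncoding("UTF-8")
    chain.doFilter(req, res)
  }

  override def destroy(): Unit = ()
}
```

This covers POST bodies. As far as I know, query-string parameters on
Tomcat are decoded according to the connector's URIEncoding setting
instead, so GET forms may need that set as well.
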


>
>
> Chas.
>
> marius d. wrote:
> > just out of curiosity are you setting manually in the HTTP header
> >
> > Content-Type: text/html; charset=UTF-8
> >
> > and it's still broken?
> >
> > P.S.
> > Sometimes HTTP equiv from HTML header just doesn't do the trick.
> >
> > Br's,
> > Marius
> >
> > On Mar 15, 11:20 pm, Derek Chen-Becker <dchenbec...@gmail.com> wrote:
> >> Crapola:
> >>
> >> http://jira.codehaus.org/browse/JETTY-958
> >>
> >> I think I've confirmed that this is not Lift. I added a non-Lift input
> >> text element to an existing Lift form:
> >>
> >> <input name="testthis" type="text" />
> >>
> >> Then I use the following code, which I believe should be getting direct
> >> access to Jetty's HttpServletRequest instance:
> >>
> >> Log.info("testthis = " + (S.request.map({r =>
> >> r.request.getParameter("testthis")}) openOr "not found!"))
> >>
> >> And when I put a cedilla in, I get:
> >>
> >> INFO - testthis = ç
> >>
> >> Can you confirm that you're using Jetty? I also tried the flags listed
> >> in the JIRA ticket:
> >>
> >> -Dorg.mortbay.util.URI.charset=utf-8 -Dfile.encoding=UTF-8
> >>
> >> But they didn't seem to do anything (it didn't crash, though). I'm not
> >> sure if I specified those correctly for use with the Maven jetty:run
> >> command line:
> >>
> >> mvn -Djetty.port=9090 -Dorg.mortbay.util.URI.charset=utf-8
> >> -Dfile.encoding=UTF-8 jetty:run
> >>
> >> Anyways, this doesn't look to be Lift's fault. I know that's not a great
> >> answer. I'm trying to think of whether there's a clean, simple way to
> >> "undo" the bogus transform but I don't know enough about charset
> >> handling. One more interesting thing is that if I change my log code to:
> >>
> >> Log.info("testthis = " + (S.request.map({r =>
> >> r.request.getCharacterEncoding + r.request.getParameter("testthis")})
> >> openOr "not found!"))
> >>
> >> I get:
> >>
> >> INFO - testthis = nullç
> >>
> >> Which seems to indicate that the character encoding for the POST isn't
> >> being set. I tried overriding it:
> >>
> >> S.request.foreach{ r => r.request.setCharacterEncoding("UTF-8")}
> >>
> >> and that seems to have absolutely no effect (in fact, I get the same
> >> "null" log message).
> >>
> >> Derek
> >>
> >> On Sun, Mar 15, 2009 at 3:08 PM, Charles F. Munat <c...@munat.com> wrote:
> >>
> >>
> >>
> >>> Marc Boschma wrote:
> >>>>> When I use &ccedil; instead, the problem is that it is *not*
> >>>>> converted to ç as it goes into the database, and then on the way
> >>>>> out the XML interpreter does not recognize it as a character entity
> >>>>> reference and so converts the & to &amp;.
> >>>> I think this is due to using the standard Scala XML load functions
> >>>> rather than the lift XML parser. From memory I don't think the
> >>>> standard parser recognises that many named entities. ie. does &#x00E7;
> >>>> work instead of &ccedil; ? If so then that is probably what is
> >>>> happening on this issue.
> >>> &#x00E7; goes into the database unchanged, but comes back out as
> >>> &amp;#x00E7;. For that matter, &amp; in the DB comes out as &amp;amp; on
> >>> the page.
> >>> This is actually fine with me. It means that my users can just type &,
> >>> <, > etc. and they will appear on the page that way (rather than being
> >>> interpreted as HTML tags). It's safer, too. There is no way for them to
> >>> insert HTML, especially script tags.
> >>> So really, the only problem I have is that I need to be able to type a
> >>> ç and have it still a ç when it gets to the database.
> >>> Chas.


-- 
Lift, the simply functional web framework http://liftweb.net
Beginning Scala http://www.apress.com/book/view/1430219890
Follow me: http://twitter.com/dpp
Git some: http://github.com/dpp
