Hmm, I changed my approach and went for UTF-8 all the way. My JSPs now
begin with

<%@ page contentType="text/html;charset=utf-8" %>
...
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
...

Input is now OK, i.e. the user's input is correctly inserted into the db.
Static Scandinavian text also displays correctly. But db data is mangled:

The Action gets a bean populated from the db (Unicode). Logging the bean,
I can see that the fields contain UTF-8. Getters in the Action return
UTF-8. But when I put the bean into the session and display the data on
the page using the getters in the JSP, the Scandinavian characters are
all wrong. It seems like the JSP tries to convert from ANSI (ISO-8859-1)
to UTF-8!? Why doesn't it realize that the data is already UTF-8 and just
display it as such?

Thanks so far,
Christian

> -----Original Message-----
> From: Radu Preotiuc-Pietro [mailto:[EMAIL PROTECTED]
> Sent: 26 October 2005 0:03
> To: [email protected]
> Subject: RE: character sets
>
> Also, Christian, could you try using "ISO8859_1" rather than
> "ISO-8859-1" for the encoding string in your code? XmlBeans is using
> the Java names for the encodings, which seemed more consistent than
> using the IANA names (so you can get the encoding from other Java code
> and set it directly etc). Of course, the generated documents always use
> the IANA names.
>
> Let us know how that works,
> Radu
>
> PS Regarding the use of setSaveSubstituteCharacters(), as the name
> indicates, this is a save-time XmlOption, so it will not have any
> effect when passed as argument to newInstance(), but only when passed
> to xmlText() or other similar methods. The reason for this is that the
> infoset doesn't make any difference between a character being
> represented as entity or as literal value.
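
On the naming point above: as far as I know, the JDK treats the legacy Java name and the IANA name as aliases of the same charset, so either spelling can be checked from plain Java. A small sketch follows. The class name EncodingNames is made up for illustration, and it only shows how the JDK resolves the names, not what XmlBeans does internally.

```java
import java.nio.charset.Charset;

public class EncodingNames {
    /** Resolves any alias the JDK knows to the charset's canonical name. */
    public static String canonicalName(String alias) {
        return Charset.forName(alias).name();
    }

    public static void main(String[] args) {
        // Legacy Java name and IANA name should resolve to the same charset.
        System.out.println(canonicalName("ISO8859_1"));
        System.out.println(canonicalName("ISO-8859-1"));
        System.out.println(canonicalName("UTF8"));
    }
}
```

If a name is not recognized at all, Charset.forName throws UnsupportedCharsetException, which is a quick way to catch a misspelled encoding string before handing it to other code.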
>
> -----Original Message-----
> From: Steve Davis [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, October 19, 2005 7:12 AM
> To: [email protected]
> Subject: RE: character sets
>
> I am not an expert in this area, but the following code may help:
>
> /**
>  * Gets the formatted character-encoded string representation of an
>  * XmlTokenSource.
>  * @param xmlTokenSource - typically an XmlBean object
>  * @param encoding - the desired character encoding
>  * @return String ready for transmission
>  * @throws Exception
>  */
> public static String getEncodedXmlText(XmlTokenSource xmlTokenSource,
>         String encoding)
>     throws Exception
> {
>     // Setup various properties of the XML instance document
>     xmlTokenSource.documentProperties().setEncoding(encoding);
>     xmlTokenSource.documentProperties().setVersion("1.0");
>     XmlOptions xmlOptions = new XmlOptions();
>     xmlOptions.setCharacterEncoding(encoding);
>     xmlOptions.setUseDefaultNamespace();
>     xmlOptions.setSaveAggressiveNamespaces();
>
>     // Format to a buffer and read it back into a string
>     ByteArrayOutputStream bos = new ByteArrayOutputStream();
>     xmlTokenSource.save(bos, xmlOptions);
>     // Decode with the same encoding used for save(); the no-argument
>     // toString() would use the platform default charset instead
>     return bos.toString(encoding);
> }
>
> -----Original Message-----
> From: Wendell, Christian [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, October 19, 2005 4:31 AM
> To: [email protected]
> Cc: Kekkonen, Jari
> Subject: character sets
>
> Hi
>
> We have a Struts/Tomcat solution, the jsp files using the 8859-1
> character set, on top of an Oracle db, which uses UTF-8/Unicode. Of
> course, in Scandinavia, we have Scandinavian characters. The problem is
> putting stuff from the html input boxes into the db, and reading stuff
> from the db to the page, so that the umlaut chars work.
>
> The bean models we use are generated from schemas with XMLBean tools.
> In the action that reads stuff from the page and sends it to the db, we
> try to put the encoding into the XML header, but fail:
>
> XmlOptions opts= new XmlOptions();
> /* we'd also like to encode '>', using the newest devel version,
>    which also has no effect:
> XmlOptionCharEscapeMap escapes= new XmlOptionCharEscapeMap();
> escapes.addMapping('>', XmlOptionCharEscapeMap.PREDEF_ENTITY);
> opts.setSaveSubstituteCharacters(escapes);
> */
> opts.setCharacterEncoding("ISO-8859-1");
> Note addedNote= Note.Factory.newInstance(opts); //The bean
> addedNote.setNote(text); //text contains umlaut characters from the jsp
>
> Logging the xml structure, we can't see any difference in the generated
> xml whether we do setCharacterEncoding() or not.
>
> Is our strategy right but implementation wrong, or should we do this
> somehow else?
>
> Christian
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
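
For what it's worth, the symptom described at the top of this thread (data that looks fine in the log but wrong on the page) is what you would see if UTF-8 bytes were decoded as ISO-8859-1 somewhere between the db and the response. The sketch below only reproduces that effect in plain Java, so the garbled output can be compared with what the page shows. It is not a diagnosis of the Struts/Tomcat setup, and the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    /** Encodes text as UTF-8, then wrongly decodes those bytes as ISO-8859-1. */
    public static String utf8ReadAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Reverses the mistake above: re-encode as ISO-8859-1, decode as UTF-8. */
    public static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // a-umlaut, o-umlaut, a-ring, written as escapes so the source
        // file's own encoding cannot interfere with the demonstration
        String original = "\u00e4\u00f6\u00e5";
        String garbled = utf8ReadAsLatin1(original);
        System.out.println(garbled.length());                 // prints 6
        System.out.println(repair(garbled).equals(original)); // prints true
    }
}
```

If the page shows two characters for every expected Scandinavian letter (for example "Ã¤" where "ä" should be), that would be consistent with this kind of mis-decoding. The repair() step is only there to confirm the hypothesis, not as a fix to apply in the JSP.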

