Also, Christian, could you try using "ISO8859_1" rather than "ISO-8859-1" for the encoding string in your code? XmlBeans uses the Java names for encodings, which seemed more consistent than using the IANA names (for example, you can take the encoding from other Java code and set it directly). Of course, the generated documents always use the IANA names.
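To illustrate the naming point above: a minimal sketch, using only the JDK, of how the historical Java name and the IANA name relate. The class and method names (EncodingNames, canonicalName) are hypothetical and not part of XmlBeans; the sketch shows only how the JVM itself treats the two spellings, not how XmlBeans maps them internally.

```java
import java.nio.charset.Charset;

public class EncodingNames {
    /** Returns the canonical charset name for any alias the JVM recognises. */
    static String canonicalName(String name) {
        return Charset.forName(name).name();
    }

    public static void main(String[] args) {
        // "ISO8859_1" is the historical java.io/java.lang name,
        // "ISO-8859-1" is the IANA name; the JVM accepts both.
        System.out.println(canonicalName("ISO8859_1"));
        System.out.println(canonicalName("ISO-8859-1"));

        // Both spellings resolve to the same Charset object.
        Charset javaName = Charset.forName("ISO8859_1");
        Charset ianaName = Charset.forName("ISO-8859-1");
        System.out.println(javaName.equals(ianaName));
    }
}
```

Since the JVM treats the two spellings as aliases, an encoding name obtained from other Java code can usually be normalised this way before comparing it with the name written into a generated document.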
Let us know how that works,
Radu

PS Regarding the use of setSaveSubstituteCharacters(): as the name indicates, this is a save-time XmlOption, so it has no effect when passed as an argument to newInstance(); it takes effect only when passed to xmlText() or other similar save methods. The reason is that the infoset makes no distinction between a character represented as an entity and the same character represented as a literal value.

-----Original Message-----
From: Steve Davis [mailto:[EMAIL PROTECTED]
Sent: Wednesday, October 19, 2005 7:12 AM
To: [email protected]
Subject: RE: character sets

I am not an expert in this area, but the following code may help:

/**
 * Gets the formatted character-encoded string representation of an XmlTokenSource.
 * @param xmlTokenSource - typically an XmlBean object
 * @param encoding - the desired character encoding
 * @return String ready for transmission
 * @throws Exception
 */
public static String getEncodedXmlText(XmlTokenSource xmlTokenSource, String encoding) throws Exception {
    // Setup various properties of the XML instance document
    xmlTokenSource.documentProperties().setEncoding(encoding);
    xmlTokenSource.documentProperties().setVersion("1.0");

    XmlOptions xmlOptions = new XmlOptions();
    xmlOptions.setCharacterEncoding(encoding);
    xmlOptions.setUseDefaultNamespace();
    xmlOptions.setSaveAggressiveNamespaces();

    // Format to a buffer and read it back into a string
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    xmlTokenSource.save(bos, xmlOptions);
    // Decode with the same encoding used for saving;
    // the no-argument toString() would use the platform default charset
    return bos.toString(encoding);
}

-----Original Message-----
From: Wendell, Christian [mailto:[EMAIL PROTECTED]
Sent: Wednesday, October 19, 2005 4:31 AM
To: [email protected]
Cc: Kekkonen, Jari
Subject: character sets

Hi

We have a Struts/Tomcat solution, the JSP files using the 8859-1 character set, on top of an Oracle db, which uses UTF-8/Unicode. Of course, in Scandinavia, we have Scandinavian characters.
The problem is putting stuff from the html input boxes into the db, and reading stuff from the db to the page, so that the umlaut chars work. The bean models we use are generated from schemas with XMLBean tools. In the action that reads stuff from the page and sends it to the db, we try to put the encoding into the XML header, but fail:

XmlOptions opts = new XmlOptions();
/* we'd also like to encode '>', using the newest devel version, which also has no effect:
XmlOptionCharEscapeMap escapes = new XmlOptionCharEscapeMap();
escapes.addMapping('>', XmlOptionCharEscapeMap.PREDEF_ENTITY);
opts.setSaveSubstituteCharacters(escapes);
*/
opts.setCharacterEncoding("ISO-8859-1");
Note addedNote = Note.Factory.newInstance(opts); // The bean
addedNote.setNote(text); // text contains umlaut characters from the jsp

Logging the xml structure, we can't see any difference in the generated xml whether we do setCharacterEncoding() or not. Is our strategy right but implementation wrong, or should we do this somehow else?

Christian

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
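
A note on the underlying issue in this thread: whenever XML is saved to bytes in one encoding and those bytes are later turned back into a String (or read by a page or database) under a different encoding, umlaut characters are the first to break. The following is a minimal, JDK-only sketch of that effect. The class and method names (EncodingRoundTrip, roundTrip) are hypothetical, and plain getBytes() stands in for an XmlBeans save() call, so it illustrates the charset mismatch only, not XmlBeans behaviour.

```java
import java.io.ByteArrayOutputStream;

public class EncodingRoundTrip {
    /** Writes text as bytes in writeEncoding, then decodes the buffer with readEncoding. */
    static String roundTrip(String text, String writeEncoding, String readEncoding) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        bos.write(text.getBytes(writeEncoding));
        // toString(String) decodes with the named charset;
        // the no-argument toString() would use the platform default instead.
        return bos.toString(readEncoding);
    }

    public static void main(String[] args) throws Exception {
        String text = "\u00e4\u00f6\u00e5"; // a-umlaut, o-umlaut, a-ring

        // Same encoding on both sides: the text survives.
        System.out.println(text.equals(roundTrip(text, "ISO-8859-1", "ISO-8859-1")));

        // ISO-8859-1 bytes decoded as UTF-8: the characters are damaged.
        System.out.println(text.equals(roundTrip(text, "ISO-8859-1", "UTF-8")));
    }
}
```

The same reasoning suggests passing the encoding explicitly at every byte/String boundary (the save options, the buffer decode, the JSP page encoding and the database connection) rather than relying on defaults.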

