Hi Bill, I got curious why sometimes UTF8 and sometimes UTF-8 is used to define the encoding. The code I posted above is right from the stream example.
I just found some interesting information on charsets:

* Every charset has a canonical name and may also have one or more aliases.
* Some charsets have an historical name that is defined for compatibility
  with previous versions of the Java platform.

In case of UTF-8 this would be

canonical name  : UTF-8
aliases         : [UTF8, unicode-1-1-utf-8]
historical name : UTF8

More information on charsets:
http://java.sun.com/j2se/1.5.0/docs/api/java/nio/charset/Charset.html

Cheers,
Philipp

On Fri, May 21, 2010 at 12:58 AM, try zigc <[email protected]> wrote:
> I believe it is "UTF-8" instead of "UTF8".
>
> Cheers
>
> Bill
>
> On Thu, May 20, 2010 at 3:30 AM, Philipp Erlacher
> <[email protected]> wrote:
>>
>> I just marshalled an object to an UTF-8 encoded document. The object I
>> marshalled has a UTF-8 encoded attribute.
>>
>> This is how I did it:
>>
>> FileOutputStream fos = new FileOutputStream("testUTF8.xml");
>> Writer osr = new OutputStreamWriter(fos, "UTF8");
>> marshaller.setWriter(osr);
>> marshaller.marshal(person);
>>
>> Afterwards I checked the document with the favorite text editor of my
>> choice and saw that it worked :)
>>
>> Maybe you also want to have a look at
>> http://java.sun.com/docs/books/tutorial/i18n/text/stream.html to get
>> more information on unicode and non-unicode text.
>>
>> Cheers,
>> Philipp
>>
>>
>> On Wed, May 19, 2010 at 7:23 PM, pablo fernandez
>> <[email protected]> wrote:
>> > Philipp,
>> > Thanks, that actually did something different. Now I'm seeing
>> > \350\227\215
>> > instead of the characters (not the question marks).
>> > I think I'm pretty close, any idea how to transform these into the
>> > actual characters?
>> > Thanks a lot!
>> >
>> > 2010/5/19 Philipp Erlacher <[email protected]>
>> >>
>> >> Hi Pablo,
>> >> you could simply use
>> >>
>> >> StringWriter writer = new StringWriter();
>> >> ...
>> >> return writer.toString();
>> >>
>> >> instead of
>> >>
>> >> ByteArrayOutputStream baos = new ByteArrayOutputStream();
>> >> PrintWriter writer = new PrintWriter(new PrintStream(baos, true,
>> >> "UTF-8"));
>> >> ...
>> >> return baos.toString();
>> >>
>> >> 2010/5/19 pablo fernandez <[email protected]>
>> >> >
>> >> > I'm sorry but I just don't see how that helps me with the Marshaller
>> >> > problem.
>> >> > Thanks a lot for taking the time to answer anyway! :)
>> >> >
>> >> >
>> >> > 2010/5/19 Brian Sanders <[email protected]>
>> >> >>
>> >> >> I was able to produce similar results in bsh and then I started to
>> >> >> get things working, so perhaps this will provide some insight...
>> >> >> bsh % baos = new ByteArrayOutputStream();
>> >> >> bsh % baos.write(new byte[]
>> >> >> {0xCE,0xA4,0xCE,0xAD,0xCF,0x82,0xCF,0x84},
>> >> >> 0, 8);
>> >> >> bsh % print(baos.toString("UTF-8"));
>> >> >> Τέςτ
>> >> >>
>> >> >> I picked some characters from charmap, pasted them into Notepad++,
>> >> >> then switched to hex mode to get the codes.
>> >> >> ________________________________
>> >> >> From: [email protected]
>> >> >> Date: Tue, 18 May 2010 18:14:38 -0700
>> >> >> To: [email protected]
>> >> >> Subject: [castor-user] Marshaller not properly encoding to UTF-8
>> >> >>
>> >> >> Hi,
>> >> >> Im having a problem with my castor Marshaller. Check this example:
>> >> >>
>> >> >> //Just For testing, this is not production code
>> >> >> Person p = (Person) object;
>> >> >> PrintStream ps = new PrintStream(System.out, true, "UTF-8");
>> >> >> //This line prints the UTF-8 characters correctly :)
>> >> >> ps.println(p.getFullName());
>> >> >>
>> >> >> //The actual code that has the problem
>> >> >> //PrintWriter that wraps the SAME Stream used above
>> >> >> ByteArrayOutputStream baos = new ByteArrayOutputStream();
>> >> >> PrintWriter writer = new PrintWriter(new PrintStream(baos, true,
>> >> >> "UTF-8"));
>> >> >> marshaller.setSuppressXSIType(true);
>> >> >> marshaller.marshal(object,writer);
>> >> >> //This line prints (?) instead of the UTF-8 characters :(
>> >> >> return baos.toString();
>> >> >> Any idea what I'm doing wrong? Please any help will be appreciated.
>> >> >> --Pablo
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe from this list, please visit:
>> >>
>> >> http://xircles.codehaus.org/manage_email
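
For anyone who wants to check the points from this thread on their own JVM, here is a minimal sketch. It assumes nothing about Castor: the class name CharsetDemo and the helper writeSample are hypothetical, and writeSample only stands in for marshaller.marshal(object, writer). It shows that "UTF8" and "UTF-8" resolve to the same charset, and that text written as UTF-8 bytes comes back intact when it is decoded with an explicitly named charset or collected in a StringWriter.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical demo class, not part of Castor or of the original posts.
final class CharsetDemo {

    // Resolves a charset name (canonical name or alias) to its canonical name.
    static String canonicalName(String name) {
        return Charset.forName(name).name();
    }

    // Stand-in for marshaller.marshal(object, writer): writes text to a Writer.
    static void writeSample(Writer out, String text) {
        try {
            out.write(text);
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Encodes to bytes with the given charset, then decodes with the SAME charset.
    static String viaBytes(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        writeSample(new OutputStreamWriter(baos, cs), text);
        // Naming the charset here avoids decoding with the platform default.
        return new String(baos.toByteArray(), cs);
    }

    // Skips the byte step entirely, as suggested earlier in the thread.
    static String viaStringWriter(String text) {
        StringWriter writer = new StringWriter();
        writeSample(writer, text);
        return writer.toString();
    }

    public static void main(String[] args) {
        // The four Greek characters from Brian's bsh example.
        String sample = "\u03A4\u03AD\u03C2\u03C4";
        System.out.println(canonicalName("UTF8"));
        System.out.println(Charset.forName("UTF8").aliases());
        System.out.println(sample.equals(viaBytes(sample, "UTF8")));
        System.out.println(sample.equals(viaStringWriter(sample)));
    }
}
```

ByteArrayOutputStream.toString() without an argument decodes with the platform default charset, which is a likely explanation for the question marks in the original post when that default is not UTF-8. Passing the charset explicitly, or using a StringWriter so that no bytes are involved, avoids depending on the default.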

