It turns out I wasn't actually encoding the content of my XML output as
UTF-16. Setting the encoding on the OutputFormat class alone did not change
the bytes being written, but I was fooled into thinking it did: the Windows
Cp1252 (WinLatin1) charset maps the byte in question (the 0x80 from the
exception quoted below) to the euro symbol, so when I opened the file in
Windows using XMLSpy, the euro symbol was displayed correctly.
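
For anyone who wants to see the mismatch for themselves, here is a minimal sketch using only the plain JDK (no dom4j). The class name EuroEncodingDemo and its helper methods are made up for illustration; it just shows which bytes the euro symbol becomes under each charset, and that the single Cp1252 byte is not well-formed UTF-8, which I assume is what the parser was complaining about.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of dom4j.
class EuroEncodingDemo {

    // Encode text through an OutputStreamWriter, the same kind of Writer
    // that is handed to XMLWriter in the solution.
    static byte[] encode(String text, String charsetName) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, Charset.forName(charsetName))) {
            out.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Render bytes as space-separated upper-case hex, e.g. "E2 82 AC".
    static String hex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    // True if the bytes are well-formed in the given charset.
    // A fresh CharsetDecoder reports malformed input by default.
    static boolean decodesCleanly(byte[] data, Charset charset) {
        try {
            charset.newDecoder().decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String euro = "\u20AC";
        byte[] cp1252 = encode(euro, "windows-1252");

        System.out.println("windows-1252: " + hex(cp1252));               // 80
        System.out.println("UTF-8:        " + hex(encode(euro, "UTF-8")));  // E2 82 AC
        System.out.println("UTF-16:       " + hex(encode(euro, "UTF-16"))); // FE FF 20 AC

        // The lone Cp1252 byte is rejected by a strict UTF-8 decoder.
        System.out.println("0x80 valid as UTF-8? "
                + decodesCleanly(cp1252, StandardCharsets.UTF_8));        // false
    }
}
```

So a file can look right in a Cp1252-aware viewer and still be rejected by a parser that was told to expect UTF-8. The encoding of the Writer and the encoding named in the XML declaration have to agree.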

The complete and correct solution is:

        OutputStreamWriter out = new OutputStreamWriter(
                new FileOutputStream(new File(filename)), "UTF-16");
        XMLWriter writer = new XMLWriter(out,
                new OutputFormat("   ", true, "UTF-16"));

Thanks!
Portia



"James Strachan" <james_strachan@yahoo.co.uk>
19/02/2003 08:03

To:       <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>
cc:
Subject:  Re: [dom4j-user] Reading the euro symbol in a file using the SAXReader




From: <[EMAIL PROTECTED]>
> I get the following exception when I try to read a file containing the euro
> symbol:
>
> org.dom4j.DocumentException: invalid byte 1 of 1-byte UTF-8 sequence (0x80)
> Nested exception: invalid byte 1 of 1-byte UTF-8 sequence (0x80)
>       at org.dom4j.io.SAXReader.read(SAXReader.java:342)
>       at org.dom4j.io.SAXReader.read(SAXReader.java:218)
>       at org.dom4j.io.SAXReader.read(SAXReader.java:207)
>
> I know that the symbol has been encoded properly because when I use Java's
> BufferedReader to read in the file, the symbol is displayed correctly.
> Any ideas?

What does your XML document look like? It's probably to do with the encoding
of the XML document. Do you include an encoding declaration, i.e. a
<?xml version="1.0" encoding="..."?> section?

James
-------
http://radio.weblogs.com/0112098/
