Shenxue Zhou wrote:
Try to construct an InputStreamReader obj from your InputStream obj:
InputStreamReader reader = new InputStreamReader(is, encoding);
Thanks for responding, Shenxue.
I'm not having a problem with encoding characters, I'm having a problem
where XmlBeans is creating an incorrect prologue - or failing to persist
at all. Perhaps the subject was misleading - I was talking about the
character encoding definition in the XML prologue.
And construct an OutputStreamWriter from your ByteArrayOutputStream:
OutputStreamWriter writer = new OutputStreamWriter(baos, encoding);
Again, this isn't a character encoding issue - it's strictly a prologue
issue.
Then do your reading, writing stuff
If you end up creating a new XmlObject for the desired encoding, try to
set the encoding property on the new XmlObject:
targetDoc.documentProperties().setEncoding(encoding);
I had already done this (shown in the previous email pasted below).
The bug (as I understand it) is that XmlBeans is only doing a one-way
mapping of IANA character encodings. For example, I receive an XmlBean
XML Document that starts like this:
<?xml version="1.0" encoding="UNICODEBIG"?>
...
XmlBeans handles this properly - it converts the IANA 'UNICODEBIG'
character encoding into the corresponding UTF-16BE and then decodes the
data into a Java String/XmlBean.
All is good incoming.
I MUST respond with the same character encoding as what the client sent
me. This is impossible because XmlBeans is not capable of setting the
prologue character encoding to UNICODEBIG.
At this stage, when I serialize the response Document, I set its
encoding with responseDoc.documentProperties().setEncoding("UNICODEBIG");
I do this because this is what the client sent me. I can only assume
clients will send me IANA character encodings, so I must respond in
kind - a C# client would be confused if it saw a Java-specific character
encoding name...
The code, text, and exception in the previous email (below) all show
that this final step is broken - XmlBeans during serialization needs to
do this:
0. create a prologue with the given IANA charset. The challenge here is
for XmlBeans to provide an API that accepts an IANA charset name or a
Java charset name (and then convert the Java charset name to an IANA
name in the prologue)
1. translate the IANA charset encoding name to the Java equivalent and
encode the document using this charset.
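The two steps above could be sketched roughly as follows, using only the
JDK (Java 8 or later assumed). PrologueSketch, resolve(), serialize() and
the one-entry alias table are hypothetical names for illustration and are
not XmlBeans API. The UNICODEBIG to UTF-16BE mapping is taken from the
description earlier in this message, and the document body is assumed to
be available as a String already.

```java
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of XmlBeans.
class PrologueSketch {
    // Illustrative alias table: name declared in the prologue -> Java charset name.
    // The single entry is an assumption based on the mapping described in this thread.
    private static final Map<String, String> DECLARED_TO_JAVA = new HashMap<>();
    static {
        DECLARED_TO_JAVA.put("UNICODEBIG", "UTF-16BE");
    }

    // Step 1: translate the declared name to a charset the JVM can encode with.
    // Names without a table entry are passed to Charset.forName unchanged.
    static Charset resolve(String declaredName) {
        String javaName = DECLARED_TO_JAVA.getOrDefault(
                declaredName.toUpperCase(Locale.ROOT), declaredName);
        return Charset.forName(javaName);
    }

    // Step 0 and step 1 together: the prologue carries the declared name
    // verbatim, while the bytes are produced with the resolved Java charset.
    static byte[] serialize(String xmlBody, String declaredName) {
        String prologue =
                "<?xml version=\"1.0\" encoding=\"" + declaredName + "\"?>\n";
        return (prologue + xmlBody).getBytes(resolve(declaredName));
    }
}
```

With this sketch, PrologueSketch.serialize("<a/>", "UNICODEBIG") produces
UTF-16BE bytes whose prologue still reads encoding="UNICODEBIG". The point
is only that the name written into the prologue and the name handed to the
encoder are tracked separately.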
The bug is that the XmlBean.documentProperties() character encoding is
simply handled incorrectly during serialization. Perhaps the unit tests
only test the UTF-8 case?
It would be great if someone responded with the next steps. I may have
some time on Tuesday to look into a fix for this. It may change the
default behaviour for folks not using UTF-8.
Thank you for reading.
Thoughts?
-----Original Message-----
From: Mark Swanson [mailto:[EMAIL PROTECTED]
Sent: Thursday, September 21, 2006 3:58 PM
To: dev@xmlbeans.apache.org
Subject: Bug: XmlBeans does not support character encodings other than
UTF-8
Hello,
I'm trying to get XmlBeans to persist using a specific character set:
UNICODEBIG.
XmlDocumentProperties xmlDocumentProperties = doc.documentProperties();
String encoding = xmlDocumentProperties.getEncoding();
Logger.info("response encoding:" + encoding);
(encoding is UNICODEBIG)
XmlOptions xmlOptions = new XmlOptions();
xmlOptions.setSavePrettyPrint();
xmlOptions.setSavePrettyPrintIndent(4);
xmlOptions.setSaveOuter();
ByteArrayOutputStream baos = new ByteArrayOutputStream(2048);
InputStream is = doc.newInputStream(xmlOptions);
byte[] buffer = new byte[2048];
int length = 0;
while (true) {
    length = is.read(buffer, 0, 2048);
    if (length < 0)
        break;
    baos.write(buffer, 0, length);
}
I need the prologue to state the document's encoding, but XmlBeans is
printing a prologue with the wrong character set: UTF-8.
<?xml version="1.0" encoding="UTF-8"?>
...
The document has the UNICODEBIG encoding, yet it prints out as UTF-8.
This contradicts the Javadoc, which says newInputStream() takes the
encoding into account.
I tried xmlOptions.setCharacterEncoding() but that fails with:
java.lang.RuntimeException: java.io.UnsupportedEncodingException: UNICODEBIG
    at org.apache.xmlbeans.impl.store.Saver$InputStreamSaver.<init>(Saver.java:1785)
    at org.apache.xmlbeans.impl.store.Cursor._newInputStream(Cursor.java:552)
    at org.apache.xmlbeans.impl.store.Cursor.newInputStream(Cursor.java:2442)
    at org.apache.xmlbeans.impl.values.XmlObjectBase.newInputStream(XmlObjectBase.java:156)
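An UnsupportedEncodingException like the one above normally means the
encoding name reached the JDK and was not recognised there. A small check
along the following lines can show how a given JVM treats a name.
CharsetCheck and canonicalNameOrNull() are hypothetical names for this
sketch; the result for UNICODEBIG depends on the JVM in use, so no
particular outcome is assumed here.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical diagnostic helper, not part of XmlBeans.
class CharsetCheck {
    // Returns the canonical Java name for the given label, or null when
    // this JVM does not recognise the label at all.
    static String canonicalNameOrNull(String label) {
        try {
            return Charset.isSupported(label)
                    ? Charset.forName(label).name()
                    : null;
        } catch (IllegalCharsetNameException e) {
            // The label contains characters that are not legal in a charset name.
            return null;
        }
    }
}
```

Calling CharsetCheck.canonicalNameOrNull("UNICODEBIG") on the JVM that
runs the service would show whether the name is accepted as given and,
if so, which canonical Java charset it resolves to.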
I'm using XmlBeans 2.1.0. Does anyone have any ideas as to why the
Document's encoding properties are being ignored?
Thank you.