The problem is that you're not handling character encoding correctly.
String's no-argument getBytes() method uses the platform's default
encoding, so use it only when you know what you're doing: the whole point
of character encodings is that the same string can be represented by
different sequences of bytes.
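
As a small illustration of that point, here is a minimal sketch showing the same one-character string encoded two ways. The class and method names (EncodingDemo, encode, decode) are made up for this example, and StandardCharsets assumes Java 7 or later; on older JVMs you would pass the charset name as a String instead.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original code in this thread.
public class EncodingDemo {

    // Encode with an explicit charset instead of the platform default.
    static byte[] encode(String s, Charset charset) {
        return s.getBytes(charset);
    }

    // Decode with an explicit charset; it must match the one used to encode.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String s = "\u00fc"; // the single character u-umlaut

        byte[] utf8 = encode(s, StandardCharsets.UTF_8);
        byte[] latin1 = encode(s, StandardCharsets.ISO_8859_1);

        System.out.println(utf8.length);   // 2 bytes: 0xC3 0xBC
        System.out.println(latin1.length); // 1 byte: 0xFC

        // Decoding with the same charset restores the string.
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(s));      // true
        // Decoding with a different charset silently yields other characters.
        System.out.println(decode(utf8, StandardCharsets.ISO_8859_1).equals(s)); // false
    }
}
```

Calling getBytes() with no argument is the same idea with the charset picked by the machine the code happens to run on, which would explain why something can work in one country and fail in another.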

I'd suggest doing more research on encoding in general. Here's one popular
piece, although it's not Java-centric:
The Absolute Minimum Every Software Developer Absolutely, Positively Must
Know About Unicode and Character Sets (No Excuses!)
<http://www.joelonsoftware.com/articles/Unicode.html>

From there, you may want to review the APIs for java.io.Reader and
java.io.Writer, which are specifically designed to help smooth over the
issues involved in serializing Java strings to bytes.
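
For the code quoted below, one way to apply this is to stay in characters for the whole round trip: write the DOM to a StringWriter and parse it back through a StringReader wrapped in an InputSource, so no byte encoding is ever chosen. This is only a sketch under that assumption; the class name DomRoundTrip and its serialize and parse helpers are invented for the example, and error handling is left out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original code in this thread.
public class DomRoundTrip {

    // Serialize the DOM to characters using a Writer.
    static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    // Parse from characters using a Reader, so no bytes are decoded here.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Build a tiny document containing a non-ASCII character.
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.newDocument();
        Element root = doc.createElement("name");
        root.setTextContent("M\u00fcller");
        doc.appendChild(root);

        Document copy = parse(serialize(doc));
        System.out.println("M\u00fcller".equals(copy.getDocumentElement().getTextContent()));
    }
}
```

If you really do need bytes (to write a file, say), pick the encoding explicitly on both sides and make it match the XML declaration.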

This is drifting too far off topic to be discussed much further on the
Struts list.

Best,
 Joe


On 4/16/07, Ashish Kulkarni <[EMAIL PROTECTED]> wrote:

Hi,
Here is the code where I read the DOM tree and convert it to a String,
then convert this String into a byte array, and then use
DocumentBuilder.parse() to parse it.

I get an error in factory.newDocumentBuilder().parse(byteArray);


TransformerFactory tFactory = TransformerFactory.newInstance();
Transformer transformer = tFactory.newTransformer();
StringWriter writer = new StringWriter();
DOMSource source = new DOMSource(doc);
transformer.transform(source, new StreamResult(writer));
String obj = writer.toString();
ByteArrayInputStream byteArray = new ByteArrayInputStream(obj.getBytes());
Document doc = factory.newDocumentBuilder().parse(byteArray);


Ashish
On 4/16/07, Joe Germuska <[EMAIL PROTECTED]> wrote:
>
> On 4/16/07, Christopher Schultz <[EMAIL PROTECTED]> wrote:
> >
> > Ashish,
> >
> > Ashish Kulkarni wrote:
> > > I have a Java class which creates an XML file from an SQL result set.
> > > It works fine in the USA, but I am having issues when this process runs
> > > in Germany, where they have non UTF characters in their database like ü
> > > or á.
> >
> > I think you mean non-lower-ASCII. These characters are certainly covered
> > by UTF-8.
> >
> > > How do we handle this kind of situation in an XML file? I set the XML
> > > file to be of UTF-8 type.
> >
> > How do you set the file "type" to UTF-8?
>
>
> I'm assuming Ashish is talking about the "encoding" attribute of the XML
> declaration in the first line of the file.
>
> Chris is correct that the real magic happens when you serialize the DOM to
> a file, but you should be sure to use the same encoding with the writer
> that actually creates the file as you do in the XML declaration.  If your
> characters aren't UTF-8 then don't use UTF-8.  Any decent XML reading
> software will recognize the encoding when the file is read.
>
> Joe
>
> --
> Joe Germuska
> [EMAIL PROTECTED] * http://blog.germuska.com
>
> "The truth is that we learned from João forever to be out of tune."
> -- Caetano Veloso
>




