No, all the parser sees is a stream of bytes, and it is up to the parser to interpret those bytes properly. With no XML declaration (or one that gives no encoding) and no byte order mark, the parser must assume UTF-8. Read as UTF-8, your document is not well-formed XML, because it contains a byte sequence that is not a valid UTF-8 character.
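To make that concrete, here is a minimal sketch, not a definitive test case. It assumes the JAXP parser bundled with the JDK rather than a standalone Xerces-J build, and the names EncodingDemo, sample and parseText are made up for illustration. It feeds the parser the same bytes twice: once with no declaration, so the lone byte 0xF6 (decimal 246) is read as UTF-8, and once with the ISO-8859-1 declaration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; the names here are illustrative only.
final class EncodingDemo {

    // Builds prolog + "<doc>" + the single byte 0xF6 + "</doc>".
    // 0xF6 is o-umlaut in ISO-8859-1 but is not valid UTF-8 on its own.
    static byte[] sample(String prolog) {
        byte[] head = (prolog + "<doc>").getBytes(StandardCharsets.US_ASCII);
        byte[] tail = "</doc>".getBytes(StandardCharsets.US_ASCII);
        byte[] out = new byte[head.length + 1 + tail.length];
        System.arraycopy(head, 0, out, 0, head.length);
        out[head.length] = (byte) 0xF6;
        System.arraycopy(tail, 0, out, head.length + 1, tail.length);
        return out;
    }

    // Returns the text of the root element, or null if parsing fails.
    static String parseText(byte[] xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new ByteArrayInputStream(xml));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // No declaration: the parser falls back to UTF-8 and rejects the byte.
        System.out.println("no declaration: " + parseText(sample("")));
        // With the declaration the same byte decodes to U+00F6.
        String decl = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>";
        System.out.println("ISO-8859-1: " + parseText(sample(decl)));
    }
}
```

The parser never gets a chance to "convert to Unicode first" in any encoding-neutral way: picking the decoder is the first thing it does, and the only inputs to that choice are the byte order mark and the declaration.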
Dave

Joseph Shraibman <jks@selectacast.net>
11/05/2001 10:08 PM
Please respond to general

To: [EMAIL PROTECTED]
cc: (bcc: David N Bertoni/CAM/Lotus)
Subject: Re: accented characters and xerces j

How can that be? Isn't unicode conversion done before any of the contents are looked at?

[EMAIL PROTECTED] wrote:
> This is not the best list for Xerces questions. There is a Xerces-J list
> that you should subscribe to.
>
> The problem is that your document is encoded incorrectly. There is no
> ASCII character 246, since ASCII only defines characters up to 127.
> However, there _is_ a character defined in ISO-8859-1 with such a value.
> Your document does not contain an XML declaration, so you need to add one
> and specify the correct encoding:
>
> <?xml version="1.0" encoding="ISO-8859-1"?>
>
> Dave
>

--
Joseph Shraibman
[EMAIL PROTECTED]
Increase signal to noise ratio. http://www.targabot.com

---------------------------------------------------------------------
In case of troubles, e-mail: [EMAIL PROTECTED]
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]