Re: accented characters and xerces j

David_N_Bertoni Mon, 05 Nov 2001 22:08:22 -0800


No, all the parser sees is a stream of bytes.  It's up to the parser to
interpret the bytes properly.  If there is no byte order mark, and either no
XML declaration or a declaration that doesn't specify an encoding, the parser
assumes UTF-8.  In that case, your document is not well-formed XML, because
byte 246 on its own is not a valid UTF-8 sequence.
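
A minimal sketch of this behaviour, assuming the JAXP parser that ships with
the JDK rather than a standalone Xerces-J build.  The class name EncodingDemo,
the helper methods, and the sample <name> element are made up for
illustration; only the byte value 246 and the ISO-8859-1 declaration come from
this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Xerces or of the original question.
class EncodingDemo {

    // Builds a small document whose text contains the single raw byte 0xF6
    // (decimal 246). That byte is "o with diaeresis" in ISO-8859-1, but it is
    // not a valid UTF-8 sequence by itself.
    public static byte[] document(String prolog) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] head = (prolog + "<name>K").getBytes(StandardCharsets.US_ASCII);
        byte[] tail = "ln</name>".getBytes(StandardCharsets.US_ASCII);
        out.write(head, 0, head.length);
        out.write(0xF6);
        out.write(tail, 0, tail.length);
        return out.toByteArray();
    }

    // Hands the raw bytes to the parser, so the parser alone decides how to
    // decode them. Returns the root element's text, or null if the parser
    // rejects the input.
    public static String parse(byte[] bytes) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new ByteArrayInputStream(bytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String decl = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>";
        // No declaration and no byte order mark: the bytes are read as UTF-8.
        System.out.println("without declaration: " + parse(document("")));
        // With the declaration, byte 0xF6 is decoded as an ISO-8859-1 character.
        System.out.println("with declaration:    " + parse(document(decl)));
    }
}
```

The point of passing an InputStream rather than a Reader or String is that
no Unicode conversion happens before the parser gets the data.  A conforming
parser should reject the first document and accept the second.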


Dave



                                                                                       
        
From:     Joseph Shraibman <jks@selectacast.net>
Date:     11/05/2001 10:08 PM
To:       [EMAIL PROTECTED]
cc:       (bcc: David N Bertoni/CAM/Lotus)
Subject:  Re: accented characters and xerces j
Please respond to general



How can that be?  Isn't Unicode conversion done before any of the contents
are looked at?

[EMAIL PROTECTED] wrote:

> This is not the best list for Xerces questions.  There is a Xerces-J list
> that you should subscribe to.
>
> The problem is that your document is encoded incorrectly.  There is no
> ASCII character 246, since ASCII only defines characters up to 127.
> However, there _is_ a character defined in ISO-8859-1 with such a value.
> Your document does not contain an XML declaration, so you need to add one
> and specify the correct encoding:
>
>    <?xml version="1.0" encoding="ISO-8859-1"?>
>
> Dave
>



--
Joseph Shraibman
[EMAIL PROTECTED]
Increase signal to noise ratio.  http://www.targabot.com


---------------------------------------------------------------------
In case of troubles, e-mail:     [EMAIL PROTECTED]
To unsubscribe, e-mail:          [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





