This is not the best list for Xerces questions.  There is a Xerces-J list
that you should subscribe to.

The problem is that your document is encoded incorrectly.  There is no
ASCII character 246, since ASCII only defines characters up to 127.
However, there _is_ a character defined in ISO-8859-1 with that value.
Your document does not contain an XML declaration, so the parser assumes
UTF-8, in which byte 246 is not a valid character on its own.  You need to
add a declaration that specifies the correct encoding:

   <?xml version="1.0" encoding="ISO-8859-1"?>
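If you want to see the difference for yourself, here is a minimal sketch
using the JAXP API (javax.xml.parsers) rather than Xerces 1.3.1 directly.
The class name EncodingDemo and the one-element test document are made up
for illustration; they are not taken from your files.  It parses the same
bytes twice, once with the declaration above and once without.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical demo class, not part of Xerces.
class EncodingDemo {
    // <a>X</a> where X is the single byte 0xF6 (246), o-umlaut in ISO-8859-1.
    static final byte[] BODY = {'<', 'a', '>', (byte) 0xF6, '<', '/', 'a', '>'};

    // The same bytes, prefixed with an XML declaration naming the encoding.
    static byte[] withDeclaration() {
        byte[] decl = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] out = new byte[decl.length + BODY.length];
        System.arraycopy(decl, 0, out, 0, decl.length);
        System.arraycopy(BODY, 0, out, decl.length, BODY.length);
        return out;
    }

    // Returns the root element's text, or null if the parser rejects the input.
    static String parseText(byte[] xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml))
                    .getDocumentElement()
                    .getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String declared = parseText(withDeclaration());
        String undeclared = parseText(BODY);
        System.out.println("with declaration, parsed:    " + (declared != null));
        System.out.println("without declaration, parsed: " + (undeclared != null));
    }
}
```

With the parser bundled in a current JDK I would expect the first parse to
succeed and the second to be rejected, because a lone byte 0xF6 followed by
'<' is not a well-formed UTF-8 sequence.  An older parser may report the
problem differently, as the error you quoted suggests.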

Dave



                                                                                       
        
From:     Joseph Shraibman <jks@selectacast.net>
To:       [EMAIL PROTECTED]
cc:       (bcc: David N Bertoni/CAM/Lotus)
Subject:  accented characters and xerces j
Date:     11/05/2001 08:16 PM
Please respond to general



I'm using Xerces 1.3.1.

I have a file that contains 'ö', ASCII 246.


When I try to parse the file using Xerces I get:
: 151, 6: An invalid XML character (Unicode: 0x1b6803) was found in the
element content of the document.

Presumably when Java reads the file, before it gets to Xerces, it converts
246 to that Unicode value, but why?  I'm using the default (US) locale.

You can get the files involved from:
http://www.selectacast.net/~jks/xml/pr2.xml
http://www.selectacast.net/~jks/xml/pr2.txt is the original text file.


--
Joseph Shraibman
[EMAIL PROTECTED]
Increase signal to noise ratio.  http://www.targabot.com


---------------------------------------------------------------------
In case of troubles, e-mail:     [EMAIL PROTECTED]
To unsubscribe, e-mail:          [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





