Re: [xml] '&#x10;' question

2006-09-06 Thread Tim Van Holder
Marchese Stefano wrote:
 ... hi all,
 
 just a question about the '&#x10;' character.
 
 My application parses some xml files using the xmlParseFile() API.
 This API gives an error if the file has the following content:
 <content>Asl&#x10;URP</content>
 
 What I have to do to parse files like that?

The XML standard defines a character as

 Char ::= #x9 | #xA | #xD |
  [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]

(http://www.w3.org/TR/xml/#charsets)

As such, the character reference &#x10; (code point 0x10) is not valid:
0x10 is below #x20 and is not one of #x9, #xA or #xD. A conforming
parser will therefore reject a document that contains it.
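
The Char production can be written down as a small predicate, which
makes it easy to see why 0x10 is rejected. This is only a sketch:
xml10_is_char and first_invalid_ascii are hypothetical helper names (they
are not libxml2 functions), and the second one assumes the input is plain
ASCII text so that each byte is one code point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if cp matches the XML 1.0 Char production. */
bool xml10_is_char(unsigned long cp)
{
    return cp == 0x9 || cp == 0xA || cp == 0xD
        || (cp >= 0x20 && cp <= 0xD7FF)
        || (cp >= 0xE000 && cp <= 0xFFFD)
        || (cp >= 0x10000 && cp <= 0x10FFFF);
}

/* Hypothetical helper, assuming s is NUL-terminated ASCII: returns a
 * pointer to the first byte that is not a legal XML 1.0 character, or
 * NULL if every byte is legal. */
const char *first_invalid_ascii(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (!xml10_is_char((unsigned char)*s))
            return s;
    }
    return NULL;
}
```

Running first_invalid_ascii over a buffer before handing it to the parser
is one way to locate the offending byte. A character reference such as
&#x10; is subject to the same rule, because the character it refers to
must also match Char.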

So it seems the content is binary, in which case it should either be
encoded in some way (base64 for example), or not be in XML at all (XML
is not a binary transport).
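
If the content really is binary, the base64 route could look roughly like
the sketch below. base64_encode is a hypothetical helper written here for
illustration (it is not a libxml2 function); it uses the standard base64
alphabet with '=' padding and expects the caller to supply a large enough
output buffer.

```c
#include <assert.h>
#include <stddef.h>

static const char B64_ALPHABET[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical helper: base64-encode len bytes from in into out as a
 * NUL-terminated string. Returns 0 on success, -1 if out_cap is too
 * small. */
int base64_encode(const unsigned char *in, size_t len,
                  char *out, size_t out_cap)
{
    /* Four output characters per started 3-byte group, plus the NUL. */
    size_t needed = 4 * ((len + 2) / 3) + 1;
    size_t i = 0, o = 0;

    assert(out != NULL && (in != NULL || len == 0));
    if (out_cap < needed)
        return -1;

    /* Each full 3-byte group becomes four 6-bit symbols. */
    for (; len - i >= 3; i += 3) {
        unsigned long v = ((unsigned long)in[i] << 16)
                        | ((unsigned long)in[i + 1] << 8)
                        | (unsigned long)in[i + 2];
        out[o++] = B64_ALPHABET[(v >> 18) & 0x3F];
        out[o++] = B64_ALPHABET[(v >> 12) & 0x3F];
        out[o++] = B64_ALPHABET[(v >> 6) & 0x3F];
        out[o++] = B64_ALPHABET[v & 0x3F];
    }

    /* One or two leftover bytes are padded with '='. */
    if (len - i > 0) {
        unsigned long v = (unsigned long)in[i] << 16;
        if (len - i == 2)
            v |= (unsigned long)in[i + 1] << 8;
        out[o++] = B64_ALPHABET[(v >> 18) & 0x3F];
        out[o++] = B64_ALPHABET[(v >> 12) & 0x3F];
        out[o++] = (len - i == 2) ? B64_ALPHABET[(v >> 6) & 0x3F] : '=';
        out[o++] = '=';
    }

    out[o] = '\0';
    return 0;
}
```

Encoding the seven bytes of the example content (A, s, l, 0x10, U, R, P)
this way gives QXNsEFVSUA==, which contains only characters that are
legal in XML. The reader of the document then has to decode it again.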

___
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
http://mail.gnome.org/mailman/listinfo/xml


Re: [xml] '&#x10;' question

2006-09-06 Thread Liam R E Quin
On Wed, 2006-09-06 at 15:50 +0200, Marchese Stefano wrote:

 My application parses some xml files using the xmlParseFile() API.
 This API gives an error if the file has the following content:
 <content>Asl&#x10;URP</content>

As indeed it should: character 0x10 (hexadecimal, i.e. decimal 16,
which is ASCII DLE, Data Link Escape, control-P) is not legal in XML 1.0
documents.

You can use XML 1.1 if your tools support it, but it's more likely
an error in the data.  Maybe it's intended to be a newline, which
would be &#10; instead, or in hexadecimal &#xa;, or maybe you have
a character set problem and it's supposed to be some accented
character, in which case you need to convert to UTF-8 (for example)
*before* escaping non-ASCII characters.
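
The decimal/hexadecimal mix-up is easy to check mechanically. The sketch
below parses a single numeric character reference and returns its code
point; parse_char_ref and digit_value are hypothetical helper names, not
part of any library, and the code only handles the two reference forms
(decimal, and hexadecimal with a lowercase x).

```c
#include <assert.h>
#include <stddef.h>

/* Value of c as a digit in the given base (10 or 16), or -1 if c is
 * not a digit in that base. */
static int digit_value(char c, int base)
{
    int v;
    if (c >= '0' && c <= '9')
        v = c - '0';
    else if (c >= 'a' && c <= 'f')
        v = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F')
        v = c - 'A' + 10;
    else
        return -1;
    return v < base ? v : -1;
}

/* Hypothetical helper: parse a reference of the form "&#10;" or
 * "&#x10;". Returns 0 and stores the code point in *cp on success,
 * -1 if the text is malformed or the value exceeds 0x10FFFF. */
int parse_char_ref(const char *s, unsigned long *cp)
{
    unsigned long value = 0;
    int base = 10, d, ndigits = 0;

    assert(s != NULL && cp != NULL);
    if (s[0] != '&' || s[1] != '#')
        return -1;
    s += 2;
    if (*s == 'x') {            /* hexadecimal form */
        base = 16;
        s++;
    }

    while ((d = digit_value(*s, base)) >= 0) {
        value = value * (unsigned long)base + (unsigned long)d;
        if (value > 0x10FFFF)   /* beyond the Unicode range */
            return -1;
        ndigits++;
        s++;
    }
    if (ndigits == 0 || *s != ';')
        return -1;

    *cp = value;
    return 0;
}
```

With this, "&#x10;" parses to 16 (DLE), while "&#10;" and "&#xa;" both
parse to 10 (newline), which is the difference described above.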

Liam

-- 
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/
Liam on the Web: http://www.holoweb.net/~liam/
