XmlLayout allows output of invalid control characters 
------------------------------------------------------

         Key: LOG4NET-22
         URL: http://issues.apache.org/jira/browse/LOG4NET-22
     Project: Log4net
        Type: Bug
  Components: Appenders  
    Versions: 1.2.9    
    Reporter: Nicko Cadell


XmlLayout allows output of invalid control characters.

Reported by Mike Blake-Knox with additional comments from Curt Arnold.


The XmlLayout encodes the character 0x1e as &#x1E; using the standard XML 
numeric character reference.

This character code is in a range that is not allowed to appear in XML 1.0, 
either as an unencoded value or as a numeric character reference.
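
The restriction can be observed with any conforming XML 1.0 parser. The 
following is a minimal sketch in Python (not log4net code), assuming the 
expat-based parser behind the standard xml.etree.ElementTree module; the 
helper name is_well_formed and the sample documents are hypothetical and 
exist only for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the standard-library XML parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# 0x1e written directly into the character data.
raw = "<message>a\x1eb</message>"
# 0x1e written as a numeric character reference.
reference = "<message>a&#x1E;b</message>"
# A tab (0x9) is in the allowed set, so its reference is fine.
tab_reference = "<message>a&#x9;b</message>"

for name, doc in (("raw", raw), ("reference", reference), ("tab", tab_reference)):
    print(name, is_well_formed(doc))
```

Both forms of 0x1e should be rejected, while the tab reference should parse.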

The valid character ranges are defined here in the XML recommendation:
http://www.w3.org/TR/REC-xml/#charsets

They are:

#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]

Numeric character references cannot express characters outside these 
ranges.
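
As an illustration of how a layout could test characters against these 
ranges before writing them, here is a small Python sketch. It is not the 
log4net implementation: the names XML10_RANGES, is_xml10_char and 
mask_invalid_xml10 are hypothetical, and replacing invalid characters with 
"?" is just one possible policy chosen for the example.

```python
# Inclusive code point ranges of the XML 1.0 Char production.
XML10_RANGES = (
    (0x9, 0x9),
    (0xA, 0xA),
    (0xD, 0xD),
    (0x20, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
)


def is_xml10_char(code_point: int) -> bool:
    """Return True if the code point may appear in an XML 1.0 document."""
    return any(low <= code_point <= high for low, high in XML10_RANGES)


def mask_invalid_xml10(text: str, replacement: str = "?") -> str:
    """Replace every character that XML 1.0 does not allow (assumed policy)."""
    return "".join(
        ch if is_xml10_char(ord(ch)) else replacement for ch in text
    )


print(is_xml10_char(0x1E))
print(mask_invalid_xml10("before\x1eafter"))
```

Other policies, such as dropping the character or rejecting the whole 
message, would fit the same check.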

The System.Xml.XmlTextWriter does not verify whether a Unicode character is 
valid in XML, but it does encode it as a numeric character reference if it 
cannot be expressed in the output encoding.

To complicate matters further, XML 1.1 allows additional, so-called 
restricted characters to be included in the output, provided they are 
encoded as numeric character references. These ranges are:

[#x1-#x8] | [#xB-#xC] | [#xE-#x1F] | [#x7F-#x84] | [#x86-#x9F]

See http://www.w3.org/TR/2004/REC-xml11-20040204/#charsets for details.
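
For comparison, the XML 1.1 rule can be sketched the same way in Python. 
This is only an illustration under stated assumptions: the names 
XML11_RESTRICTED, is_xml11_restricted and escape_restricted_xml11 are 
hypothetical, and the sketch handles only the restricted characters, not 
the usual escaping of markup characters such as "<" and "&".

```python
# Inclusive code point ranges of the XML 1.1 RestrictedChar production.
XML11_RESTRICTED = (
    (0x1, 0x8),
    (0xB, 0xC),
    (0xE, 0x1F),
    (0x7F, 0x84),
    (0x86, 0x9F),
)


def is_xml11_restricted(code_point: int) -> bool:
    """Return True if XML 1.1 allows the code point only as a reference."""
    return any(low <= code_point <= high for low, high in XML11_RESTRICTED)


def escape_restricted_xml11(text: str) -> str:
    """Write restricted characters as hexadecimal character references."""
    return "".join(
        f"&#x{ord(ch):X};" if is_xml11_restricted(ord(ch)) else ch
        for ch in text
    )


print(escape_restricted_xml11("before\x1eafter"))
```

Such output is only acceptable to a reader if the document declares 
version 1.1 and the consumer supports it.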

