According to section 3.3.1 of the XML 1.1 Recommendation
<http://www.w3.org/TR/2006/REC-xml11-20060816/#sec-attribute-types>,
"XML attribute types are of three kinds: a string type, a set of
tokenized types, and enumerated types," and string types are CDATA. In
addition, section 3.3.3 says, "All attributes for which no declaration
has been read SHOULD be treated by a non-validating processor as if
declared CDATA."
 
If you have declared your attribute as one of the tokenized types or as
an enumerated type, normalization should collapse whitespace, and the
behavior you describe sounds like a bug. Otherwise the attribute is
treated as CDATA, and normalization should not collapse whitespace.
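
To see the difference outside of the canonicalizer, here is a minimal sketch that parses the same element twice with the JDK's built-in JAXP parser: once with no declaration for attr, and once with an internal DTD subset declaring it NMTOKENS. The class name, the constants, and the DTD are my own illustration, not anything from the XMLdsig library, and I am assuming the default non-validating parser reads the internal subset, as the XML Recommendation requires.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of Apache XML Security.
public class AttrNormalizationDemo {

    // The attribute value from the original question, whitespace intact.
    public static final String RAW_VALUE = "   abc               abc   ";

    // No declaration for attr: the parser should treat it as CDATA.
    public static final String UNDECLARED =
        "<a attr=\"" + RAW_VALUE + "\"/>";

    // Same element, but attr is declared as a tokenized type (NMTOKENS)
    // in an internal DTD subset invented for this example.
    public static final String TOKENIZED =
        "<!DOCTYPE a [<!ELEMENT a EMPTY>"
        + "<!ATTLIST a attr NMTOKENS #IMPLIED>]>"
        + "<a attr=\"" + RAW_VALUE + "\"/>";

    // Parses the document without validation and returns the value of
    // attr on the root element, as reported by the DOM.
    public static String parseAttr(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getAttribute("attr");
    }

    public static void main(String[] args) throws Exception {
        // Brackets make leading and trailing spaces visible.
        System.out.println("undeclared: [" + parseAttr(UNDECLARED) + "]");
        System.out.println("NMTOKENS:   [" + parseAttr(TOKENIZED) + "]");
    }
}
```

If the parser follows section 3.3.3, the undeclared case should keep every space and the NMTOKENS case should come back as a single-spaced, trimmed value. Canonicalization works on what the parser hands it, so an attribute that was never declared should keep its spaces in the canonical output too.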

________________________________

From: Ling Xiaohan [mailto:[EMAIL PROTECTED] 
Sent: Thursday, October 16, 2008 1:35 AM
To: security-dev@xml.apache.org
Subject: Attribute normalization !!


Hi,
 
    I am using Apache XMLdsig (1.4.2) to canonicalize an XML file.
 
    The W3C Recommendation "Canonical XML" says that "Attribute
values are normalized, as if by a validating processor". 
 
    And section 3.3.3, "Attribute-Value Normalization", of the XML 1.1
Recommendation says that "If the attribute type is not CDATA, 
then the XML processor MUST further process the normalized 
attribute value by discarding any leading and trailing space (#x20) 
characters, and by replacing sequences of space (#x20) characters 
by a single space (#x20) character". 
 
    When I input an XML fragment containing an ordinary attribute
(not declared as CDATA), like
    ...
    <a attr="   abc               abc   "> 
    ...
the canonicalized output is still
    ... 
    <a attr="   abc               abc   "> 
    ...
where the leading and trailing spaces have not been removed and the
sequence of spaces between the two "abc" values has not been replaced
with a single space.
 
Could anyone tell me why? 
Thank you very much.
 
________________________________________________________________________
nolen
 