Note that this behavior is required by the XML specification.  See 
http://www.w3.org/TR/2008/REC-xml-20081126/#AVNormalize.  It's dense, but in 
summary, when an attribute value is loaded, each literal tab, carriage return, 
and linefeed is converted to a space.  If the attribute is declared with a type 
other than CDATA, leading and trailing spaces are then discarded and each 
sequence of spaces is collapsed to a single space.

The trimming and collapsing apply only when a DTD or schema declares the 
attribute with a type other than CDATA, but the safest thing for a serializer 
to do is assume that the value might not be CDATA (or might not be recognized 
as such by whatever processor loads the document) and that whitespace should 
be preserved.  Even a CDATA value has its literal tabs and line breaks turned 
into spaces, so the only way to guarantee that is to write those characters as 
character references such as &#x9; and &#xA;.
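
To make the rule concrete, here is a minimal sketch of attribute-value 
normalization as I read section 3.3.3 of the spec.  It is not Xerces code: 
normalizeAttributeValue and its declaredCDATA flag are hypothetical names, and 
it assumes the value contains only literal characters (no entity or character 
references) and that line breaks were already normalized by the parser.

```cpp
#include <string>

// Hypothetical helper, not part of Xerces-C++: applies the XML 1.0
// attribute-value normalization rule to a value made of literal characters.
std::string normalizeAttributeValue(const std::string& raw, bool declaredCDATA)
{
    std::string out;
    out.reserve(raw.size());

    // Step 1 (all attribute types): each literal space, tab, CR, or LF
    // becomes exactly one space character.
    for (char c : raw) {
        const bool ws = (c == ' ' || c == '\t' || c == '\r' || c == '\n');
        out.push_back(ws ? ' ' : c);
    }
    if (declaredCDATA)
        return out;

    // Step 2 (non-CDATA types only): drop leading and trailing spaces and
    // collapse each run of spaces to a single space.
    std::string collapsed;
    for (char c : out) {
        if (c == ' ' && (collapsed.empty() || collapsed.back() == ' '))
            continue;  // skip leading spaces and repeated spaces
        collapsed.push_back(c);
    }
    if (!collapsed.empty() && collapsed.back() == ' ')
        collapsed.pop_back();  // at most one trailing space can remain
    return collapsed;
}
```

Either way, the literal newline and tabs in "\n Hello \t\t testing" do not 
survive a reload: the CDATA case keeps only spaces, and the non-CDATA case 
reduces the value to "Hello testing".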

-----Original Message-----
From: Alberto Massari [mailto:[email protected]]
Sent: Thu 7/30/2009 11:37 AM
To: [email protected]
Subject: Re: DOMLSSerializer converts white space characters in attributes to 
xml entities
 
No, if the serialized attribute value has newlines/tab, they are 
converted upon loading into spaces. If you want to really store such 
characters in an attribute, they have to be encoded into entities.

Alberto
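
The encoding described above can be sketched as follows.  This is an 
illustration only, not the code DOMLSSerializer actually runs: 
escapeAttributeValue is a hypothetical helper that writes tab, linefeed, and 
carriage return as character references and escapes the markup characters 
that cannot appear raw in a double-quoted attribute value.

```cpp
#include <string>

// Hypothetical helper, not the Xerces-C++ implementation: prepares a string
// for output inside a double-quoted XML attribute value.
std::string escapeAttributeValue(const std::string& value)
{
    std::string out;
    for (char c : value) {
        switch (c) {
            // Whitespace that a parser would otherwise turn into a space.
            case '\t': out += "&#x9;";  break;
            case '\n': out += "&#xA;";  break;
            case '\r': out += "&#xD;";  break;
            // Markup characters that must not appear literally here.
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '"':  out += "&quot;"; break;
            default:   out.push_back(c);
        }
    }
    return out;
}
```

For the value in the question, "\n Hello \t\t testing", this produces 
&#xA; Hello &#x9;&#x9; testing, and a parser turns the references back into 
the original newline and tabs on load.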

mini thomas wrote:
> Hi,
>  
> I am using xerces 3.0.1 and doing the following
>  
>
> 1) Parse a string
>  
> 2)Set an attribute "newattr" on the root node. The attribute value is 
> char *temp = "\n Hello \t\t testing"
>  
> 3) converting the parsed data back to xml
>  
> static const XMLCh gLS[] = { chLatin_L,  chLatin_S,  chNull };
> DOMImplementation *impl = 
> DOMImplementationRegistry::getDOMImplementation(gLS);
> DOMLSSerializer*  myWriter = (impl)->createLSSerializer();
> DOMConfiguration* dc = myWriter->getDomConfig();
> dc->setParameter( XMLUni::fgDOMWRTDiscardDefaultContent,true);
> // serialize the DOMNode to a UTF-16 string
> XMLCh* theXMLString_Unicode = 
> myWriter->writeToString(toWrite.GetDOMNodePtr());
>
> 4) Convert theXMLString_Unicode  to char* and print using cout.
>  
>  I got the attribute printed this way.
> newattr="&#xA; Hello &#x9;&#x9; testing"
>  
>  
> Is there any way to get the attribute printed as newattr="
>  Hello  testing"
>  
>  
> Thanks,
> Mini
>
>
>       
>   

