oh right - I keep forgetting about that.

thanks - dave


----- Original Message ----- 
From: "David D. Lucas" <[EMAIL PROTECTED]>
To: "David Thielen" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Sunday, September 21, 2003 7:48 PM
Subject: Re: [dom4j-user] Want \r\n in node attribute values


> I believe the spec describes why best.  Unfortunately, I think the
> behavior you do not want is actually correct.
>
> This came from http://www.w3.org/TR/REC-xml
>
>
> ======================================================================
> 3.3.3 Attribute-Value Normalization
>
> Before the value of an attribute is passed to the application or checked
> for validity, the XML processor must normalize the attribute value by
> applying the algorithm below, or by using some other method such that
> the value passed to the application is the same as that produced by the
> algorithm.
>
>     1. All line breaks must have been normalized on input to #xA as
> described in 2.11 End-of-Line Handling, so the rest of this algorithm
> operates on text normalized in this way.
>     2. Begin with a normalized value consisting of the empty string.
>     3. For each character, entity reference, or character reference in
> the unnormalized attribute value, beginning with the first and
> continuing to the last, do the following:
>        * For a character reference, append the referenced character to
> the normalized value.
>        * For an entity reference, recursively apply step 3 of this
> algorithm to the replacement text of the entity.
>        * For a white space character (#x20, #xD, #xA, #x9), append a
> space character (#x20) to the normalized value.
>        * For another character, append the character to the normalized
> value.
>
> If the attribute type is not CDATA, then the XML processor must further
> process the normalized attribute value by discarding any leading and
> trailing space (#x20) characters, and by replacing sequences of space
> (#x20) characters by a single space (#x20) character.
>
> Note that if the unnormalized attribute value contains a character
> reference to a white space character other than space (#x20), the
> normalized value contains the referenced character itself (#xD, #xA or
> #x9). This contrasts with the case where the unnormalized value contains
> a white space character (not a reference), which is replaced with a
> space character (#x20) in the normalized value and also contrasts with
> the case where the unnormalized value contains an entity reference whose
> replacement text contains a white space character; being recursively
> processed, the white space character is replaced with a space character
> (#x20) in the normalized value.
>
> All attributes for which no declaration has been read should be treated
> by a non-validating processor as if declared CDATA.
>
>
> ======================================================================
>
> Hope that helps.
>
> Dave
>
>
> David Thielen wrote:
> > Hi;
> >
> > If I have:
> > <node>line 1
> > line 2</node>
> >
> > Then Node.valueOf("/node") will return "line 1\r\nline 2" which is what
> > I want.
> >
> > But if I have:
> > <node atr = "line 1
> > line 2"></node>
> >
> > Then Node.valueOf("/node/@atr") will return "line 1 line 2", i.e. no
> > \r\n. Is there any way to get the \r\n?
> >
> > thanks - dave
>
>
> -- 
>
> +------------------------------------------------------------+
> | David Lucas                        mailto:[EMAIL PROTECTED]  |
> | Lucas Software Engineering, Inc.   (740) 964-6248 Voice    |
> | Unix,Java,C++,CORBA,XML,EJB        (614) 668-4020 Mobile   |
> | Middleware,Frameworks              (888) 866-4728 Fax/Msg  |
> +------------------------------------------------------------+
> | GPS Location:  40.0150 deg Lat,  -82.6378 deg Long         |
> | IMHC: "Jesus Christ is the way, the truth, and the life."  |
> | IMHC: "I know where I am; I know where I'm going."    <><  |
> +------------------------------------------------------------+
>
> Notes: PGP Key Block=http://www.lse.com/~ddlucas/pgpblock.txt
> IMHO="in my humble opinion" IMHC="in my humble conviction"
> All trademarks above are those of their respective owners.
>
>
>
>
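
The note in 3.3.3 quoted above suggests a way to keep the line break: write
it in the attribute as character references (&#13;&#10;) instead of literal
characters. Below is a minimal sketch of that, assuming the JDK's built-in
JAXP DOM parser rather than dom4j; the class name AttrNormalizationDemo and
the helper atrValue are made up for illustration. Whether a given writer
emits those references when serializing is not covered here and would need
checking separately.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of dom4j or of the original thread.
class AttrNormalizationDemo {

    // Parse the XML text and return the root element's "atr" attribute value.
    static String atrValue(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttribute("atr");
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Make CR and LF visible when printing.
    static String visible(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // Literal CR LF inside the attribute: normalized to a single space.
        String literal = "<node atr=\"line 1\r\nline 2\"/>";
        // Character references: the referenced characters are kept as-is.
        String escaped = "<node atr=\"line 1&#13;&#10;line 2\"/>";

        System.out.println(visible(atrValue(literal)));
        System.out.println(visible(atrValue(escaped)));
    }
}
```

With a conforming parser the first value should come back with a space where
the line break was, and the second with the \r\n intact.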
