Document.asXML() seems to depend on
outputFormat = new OutputFormat( " ", false );
OutputFormat.java:42 has
private String lineSeparator = "\r\n";
which surprised me. From section 2.11 of the XML spec,
http://www.w3.org/TR/2000/REC-xml-20001006#sec-line-ends :
>>>
To simplify the tasks of applications, the characters passed to an application by the XML processor must be as if the XML processor normalized all line breaks in external parsed entities (including the document entity) on input, before parsing, by translating both the two-character sequence #xD #xA and any #xD that is not followed by #xA to a single #xA character.
<<<
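As a rough illustration of the rule quoted above, here is a minimal sketch of that normalization in plain Java. The class and method names (LineEndNormalizer, normalize) are made up for this example and are not part of dom4j or any parser; a real XML processor does this internally on input.

```java
// Hypothetical helper, for illustration only; not dom4j code.
final class LineEndNormalizer {

    /** Translates each CRLF pair, and each CR not followed by LF, to a single LF. */
    static String normalize(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\r') {
                out.append('\n');
                // Skip the LF of a CRLF pair so it is not emitted twice.
                if (i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    i++;
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String mixed = "<a>\r\n<b/>\r</a>\n";
        // After normalization only LF line breaks remain.
        System.out.println(normalize(mixed).equals("<a>\n<b/>\n</a>\n")); // prints "true"
    }
}
```

Note that the quoted rule describes what a processor does when reading a document, so CRLF written on output would come back as LF after a parse.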
I'm not sure whether this applies to asXML(), but I'd thought the rule of thumb was "XML uses LF".
If people do not want to change this, could a call to
outputFormat.setLineSeparator(System.getProperty("line.separator"));
be triggered by one of the OutputFormat constructors?
BTW, I think line 27 in OutputFormat.java
private String encoding = "UTF8";
should perhaps be "UTF-8", which is the name the XML spec uses - see section 4.3.3 at http://www.w3.org/TR/2000/REC-xml-20001006#charencoding
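To show the difference between the two names, here is a small sketch using only the JDK. "UTF8" is an alias that Java's Charset lookup accepts, while "UTF-8" is the canonical name it reports. The class EncodingNames and its methods are invented for this example and have nothing to do with how OutputFormat is actually implemented.

```java
import java.nio.charset.Charset;

// Hypothetical helper, for illustration only; not dom4j code.
final class EncodingNames {

    /** Resolves a Java charset label (possibly an alias) to its canonical name. */
    static String canonicalName(String label) {
        return Charset.forName(label).name();
    }

    /** Builds an XML declaration that carries the canonical encoding name. */
    static String xmlDeclaration(String label) {
        return "<?xml version=\"1.0\" encoding=\"" + canonicalName(label) + "\"?>";
    }

    public static void main(String[] args) {
        System.out.println(canonicalName("UTF8"));  // prints "UTF-8"
        System.out.println(xmlDeclaration("UTF8")); // declaration with encoding="UTF-8"
    }
}
```

Java resolves the alias without complaint, but another parser reading encoding="UTF8" from a declaration may not recognise it, which is why the canonical name seems the safer default.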
Thanks again,
Thomas.