Document.asXML() seems to depend on
outputFormat = new OutputFormat( "  ", false );

OutputFormat.java:42 has
 private String lineSeparator = "\r\n";
which surprised me - from http://www.w3.org/TR/2000/REC-xml-20001006#sec-line-ends :
>>>
To simplify the tasks of applications, the characters passed to an application by the XML processor must be as if the XML processor normalized all line breaks in external parsed entities (including the document entity) on input, before parsing, by translating both the two-character sequence #xD #xA and any #xD that is not followed by #xA to a single #xA character.
<<<
That rule is about what the parser hands to the application on input, so I'm not sure whether it applies to the output of asXML(), but I'd thought the rule of thumb was "XML uses LF".
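
To make the quoted rule concrete, here is a minimal sketch of that normalization in plain Java. The class name LineEndings and the method normalize are made up for illustration and are not part of dom4j; this only shows what a parser is expected to do on input, not what asXML() does.

```java
// Hypothetical helper, not dom4j code: illustrates the XML 1.0 line-end rule.
final class LineEndings {
    private LineEndings() {}

    /** Translates CRLF pairs, then any remaining lone CR, to a single LF. */
    static String normalize(String text) {
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        // Mixed input: one CRLF, one lone CR, one plain LF.
        String mixed = "<a>\r\n  <b/>\r</a>\n";
        System.out.println(normalize(mixed).equals("<a>\n  <b/>\n</a>\n"));
    }
}
```

So whatever separator the writer emits, a conforming parser reading the result back should report LF only.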

If people do not want to change this, could a call to
outputFormat.setLineSeparator(System.getProperty("line.separator"));
be triggered by one of the OutputFormat constructors?
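
Roughly what I have in mind, as a stand-alone sketch. SketchOutputFormat is a hypothetical stand-in that I wrote for this mail, not the real org.dom4j.io.OutputFormat, and the three-argument constructor is my own assumption about how the default could be overridden.

```java
// Hypothetical stand-in for OutputFormat; only the line-separator handling is shown.
final class SketchOutputFormat {
    private final String indent;
    private final boolean newlines;
    private String lineSeparator;

    /** Defaults to the platform separator instead of a hard-coded "\r\n". */
    SketchOutputFormat(String indent, boolean newlines) {
        this(indent, newlines, System.getProperty("line.separator"));
    }

    /** Assumed extra constructor for callers who want a fixed separator such as "\n". */
    SketchOutputFormat(String indent, boolean newlines, String lineSeparator) {
        this.indent = indent;
        this.newlines = newlines;
        this.lineSeparator = lineSeparator;
    }

    String getIndent() { return indent; }

    boolean isNewlines() { return newlines; }

    String getLineSeparator() { return lineSeparator; }

    void setLineSeparator(String lineSeparator) { this.lineSeparator = lineSeparator; }
}
```

With this shape, new SketchOutputFormat("  ", false) would follow the platform, and anyone who wants LF everywhere could pass "\n" explicitly.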

BTW, I think line 27 in OutputFormat.java
    private String encoding = "UTF8";
should perhaps be "UTF-8" - see http://www.w3.org/TR/2000/REC-xml-20001006#charencoding sec 4.3.3
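
For what it's worth, Java itself treats "UTF8" as an alias and reports "UTF-8" as the canonical name, which is the spelling the XML spec uses in encoding declarations. A small check using only java.nio.charset (the class name EncodingNames is just for this example):

```java
import java.nio.charset.Charset;

// Hypothetical helper: maps a Java charset label to its canonical name.
final class EncodingNames {
    private EncodingNames() {}

    /** Returns the canonical charset name for a label or alias that Java recognises. */
    static String canonicalName(String label) {
        return Charset.forName(label).name();
    }

    public static void main(String[] args) {
        System.out.println(canonicalName("UTF8"));
    }
}
```

So the "UTF8" label works inside the JVM, but if the same string ends up in the XML declaration, other parsers may not recognise it.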

Thanks again,
Thomas.
_______________________________________________ dom4j-dev mailing list [EMAIL PROTECTED] http://lists.sourceforge.net/lists/listinfo/dom4j-dev
