Re: [dom4j-user] Normalised text nodes.

Edwin Dankert Thu, 23 Dec 2004 02:31:25 -0800

> Section 2.10 says that the XML processor must pass all characters that
> are not markup to the application. Does that include my text element?


Yes.

> Section 3.3.1 talks about normalising attributes. A text element isn't
> a node attribute, right?

No, that section applies only to attribute values (for instance,
newlines in attribute values are normalized to spaces).

I think the problem is that your document has an internal DOCTYPE
declaration. This declaration makes your parser (xerces/crimson)
believe that any whitespace between elements is ignorable, so the
parser passes these characters to the ignorableWhitespace() method
instead of characters(). See:
http://www.cafeconleche.org/books/xmljava/chapters/ch06s10.html
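
To see the effect without dom4j, here is a minimal sketch that uses only
the JDK's built-in SAX parser. The class name, the count() helper and the
sample document are made up for illustration. I switch validation on for
the DTD case only so that the behaviour is guaranteed; a non-validating
parser may do the same once it has read the DTD.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of dom4j.
class IgnorableWhitespaceDemo {

    // Internal DTD subset declaring that <root> has element-only content.
    static final String DTD =
        "<!DOCTYPE root [\n"
        + "<!ELEMENT root (child)>\n"
        + "<!ELEMENT child (#PCDATA)>\n"
        + "]>\n";

    static final String BODY = "<root>\n  <child>text</child>\n</root>";

    /** Returns { chars passed to ignorableWhitespace(), chars passed to characters() }. */
    static int[] count(String xml, boolean validating) throws Exception {
        final int[] counts = new int[2];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void ignorableWhitespace(char[] ch, int start, int length) {
                counts[0] += length;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                counts[1] += length;
            }
        };
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(validating);
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return counts;
    }

    public static void main(String[] args) throws Exception {
        int[] withDtd = count(DTD + BODY, true);
        int[] noDtd = count(BODY, false);
        System.out.println("with DTD:    ignorable=" + withDtd[0] + " characters=" + withDtd[1]);
        System.out.println("without DTD: ignorable=" + noDtd[0] + " characters=" + noDtd[1]);
    }
}
```

With the DOCTYPE present, the whitespace inside <root> should show up in
the ignorable count; without it, the same whitespace arrives through
characters().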

The SAXContentHandler in dom4j currently does not implement this
method, so all of this whitespace is discarded.

This should be easy to solve using the following code:
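import org.dom4j.io.SAXContentHandler;
import org.xml.sax.SAXException;

class MyContentHandler extends SAXContentHandler {
  // Treat ignorable whitespace like ordinary character data.
  public void ignorableWhitespace(char[] chars, int start, int length)
      throws SAXException {
    characters(chars, start, length);
  }
}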

Code not tested.
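
The forwarding idea itself can be checked with the JDK alone. In the
sketch below, TextCollector is only a stand-in I made up for
SAXContentHandler (it just appends whatever characters() receives), and
ForwardingCollector adds the same override as the snippet above. All
names here are hypothetical, and validation is switched on so that the
parser is certain to report the ignorable whitespace.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; TextCollector stands in for dom4j's SAXContentHandler.
class ForwardingDemo {

    static final String XML =
        "<!DOCTYPE root [\n"
        + "<!ELEMENT root (child)>\n"
        + "<!ELEMENT child (#PCDATA)>\n"
        + "]>\n"
        + "<root>\n  <child>text</child>\n</root>";

    /** Collects only what the parser reports through characters(). */
    static class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            text.append(ch, start, length);
        }
    }

    /** Adds the fix: ignorable whitespace is forwarded to characters(). */
    static class ForwardingCollector extends TextCollector {
        @Override
        public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
            characters(ch, start, length);
        }
    }

    static String collect(String xml, TextCollector handler) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return handler.text.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("[" + collect(XML, new TextCollector()) + "]");
        System.out.println("[" + collect(XML, new ForwardingCollector()) + "]");
    }
}
```

The plain collector loses the whitespace around <child>, while the
forwarding one keeps it.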

Regards,
Edwin

