Hi Chris
When you have XML like this:
<pizza>
<topping>cheese</topping>
<topping>ham</topping>
</pizza>
Then when this gets parsed you'll have
Element(pizza)
Text("\n ")
Element(topping)
Text("\n ")
Element(topping)
Text("\n")
So in other words, all the whitespace between elements is preserved in the
XML object structure. This is important in document-centric XML applications such as
editing hand-formatted XML documents and so forth. For data-centric applications
this whitespace is usually irrelevant - indeed it's often useful to trim
it.
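If you want to see those whitespace nodes for yourself, here is a minimal sketch. Note that it uses the DOM parser that ships with the JDK rather than dom4j, so it runs with no extra jars; I'm assuming the two report whitespace between elements the same way by default. The class and method names are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: lists the direct children of the root element.
class PizzaWhitespaceDemo {

    // Returns one label per child node of the document's root element.
    static List<String> describeChildren(String xml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        List<String> labels = new ArrayList<>();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                labels.add("Element(" + child.getNodeName() + ")");
            } else if (child.getNodeType() == Node.TEXT_NODE) {
                boolean blank = child.getNodeValue().trim().isEmpty();
                labels.add(blank ? "Text(whitespace)" : "Text");
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        String xml = "<pizza>\n  <topping>cheese</topping>\n  <topping>ham</topping>\n</pizza>";
        // prints [Text(whitespace), Element(topping), Text(whitespace), Element(topping), Text(whitespace)]
        System.out.println(describeChildren(xml));
    }
}
```

The three whitespace entries are the line breaks and indentation around the two topping elements.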
So there's a SAXReader option that lets you trim this whitespace:
SAXReader reader = new SAXReader();
reader.setStripWhitespaceText(true);
Also, you may find that because of the way the parser buffers its input, a block
of text can sometimes be split across several text nodes. To solve this you can
ensure that adjacent text nodes are merged like this:
reader.setMergeAdjacentText(true);
All of the above may help you get an XML tree that matches your mental
model of the document.
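To make it concrete what those two clean-ups do to a tree, here is a rough hand-written equivalent. It is only an illustration against the JDK's DOM API, not how dom4j implements the SAXReader options: normalize() is the standard DOM call that merges adjacent text nodes, and the loop drops text nodes that contain nothing but whitespace. The class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch: merge adjacent text, then strip whitespace-only text.
class TreeCleanupSketch {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    static void clean(Node node) {
        node.normalize();          // merges adjacent Text nodes in the whole subtree
        stripWhitespaceText(node); // then removes whitespace-only Text nodes
    }

    private static void stripWhitespaceText(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before any removal
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else {
                stripWhitespaceText(child);
            }
            child = next;
        }
    }

    public static void main(String[] args) {
        Document doc = parse(
                "<pizza>\n  <topping>cheese</topping>\n  <topping>ham</topping>\n</pizza>");
        Node root = doc.getDocumentElement();
        clean(root);
        // Only the two topping elements should remain as children of pizza.
        System.out.println(root.getChildNodes().getLength());
    }
}
```

Text that has real content, such as "cheese", is left untouched; only the nodes made up entirely of whitespace go away.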
James
