Hi Ken

From: "Ken Sheppardson" <[EMAIL PROTECTED]>
> Hi folks,
>
> I've just started looking at dom4j and have a couple questions I
> hope someone can help me with...
>
> I'm trying to use the package to manipulate HTML/XHTML-Transitional
> documents that have mixed data elements, e.g.
>
>   <body> Before <div>Inside</div> After</body>
>
> I'd like to be able to swap out "Before" for some new text or element,
> swap "Before" and "After", and otherwise make a general mess of
> the document programmatically.
>
> Now when I parse this document in dom4j I end up with an Element
> (body) whose text is "Before After" and another element (div)
> whose text is "Inside".
>
> In fact, the "body" Element seems to have three child nodes:
>
>   Text (NodeType 3) : "Before"
>   Element (NodeType 1) : "Inside"
>   Text (NodeType 3) : "After"
>
> My questions:
>
> (1) What's the simplest way to grab a list of the children
>     of a given node? Build an XPath expression and use
>     selectNode? Cast it as a Branch and use node() and/or
>     nodeIterator()?

Firstly, before I go into further detail, the Text interface has a
setText(String) method allowing you to change the text. So to just
change the text you could do

    Text before = (Text) element.node(0);
    Text after = (Text) element.node(2);
    before.setText( "foo" );
    after.setText( "bar" );

I've just checked in a patch (*) to fix the above code ;-)

I'd originally made the shareable but immutable flyweight
implementation the default for Text, which I think was a mistake.
I've made the mutable non-flyweight Text implementation the default
now, so the above code should work if you download a new daily
snapshot or take the latest CVS. (Plus a new release will happen
before JavaOne).

Now a brief overview of mutating Element or Document content...
Both the Document and Element interfaces extend the Branch interface.
On either of these you can get a backed List of the contents of the
branch, which allows you to manipulate the contents via the standard
List API. e.g.

    List list = element.content();

    // let's clear the list
    list.clear();

    // or let's remove some items
    list.remove(3);

    // or remove a sublist
    list.subList(2,7).clear();

So you can use the List API to add and remove nodes at specific
points and so on.

The added complication with adding brand new nodes at specific points
in the list (rather than at the end) is the use of factories. In dom4j
we've tried to keep everything interface based and hide the
implementation details. So we recommend the use of DocumentFactory
when creating new content nodes. This all happens under the covers
when you use the Element API for adding content. e.g.

    Element foo = ...;
    Element bar = foo.addElement( "bar" );

which will create a new Element implementation and add it to the end
of the content node list. What could have happened under the covers is
that some special schema aware BarElement implementation just got
created. Or the DefaultElement class could be used. Or a persistent
element, a lazy fetch or indexed element implementation or whatever.

However, if you wanted to add some content at an earlier point in the
list you could do this instead...

    Element foo = ...;
    Element bar = DocumentHelper.createElement( "bar" );
    List list = foo.content();
    list.add( 2, bar );

which would add the bar node at index 2 of the node content list,
i.e. as the third item, since List indices are zero-based. The bar
Element would be created using the default singleton DocumentFactory
instance. (BTW this can be configured by the org.dom4j.factory system
property).

An alternative approach is to add the content using the normal API and
then move it around. e.g.

    Element foo = ...;
    Element bar = foo.addElement( "bar" );
    List list = foo.content();
    list.remove( bar );
    list.add( 2, bar );

> (2) How do I get the parent node of a Text Node?
>     It appears as though Text Nodes don't implement getParent().
>     Am I missing something?

Text, Element, Attribute, Document and all the other node interfaces
extend the Node interface, which has a getParent() method in it. The
core "Interface Hierarchy" is here in the javadoc (below the "Class
Hierarchy")

    http://dom4j.org/apidocs/org/dom4j/package-tree.html

A picture or class diagram of this hierarchy would be nice one
day... ;-)

So all Nodes have the getParent() method. The quick answer is, if you
use the code with the new patch (*) I mention above, getParent() will
work for you on all Text nodes. The longer answer is...

<longAnswer>
To support the flyweight pattern, such that nodes (or fragments) could
be shared across documents for performance reasons (e.g. enumeration
attributes align="CENTER|RIGHT|LEFT" could share 3 flyweight instances
across all documents), I made getParent() an optional method - an
implementation may not support it. There's a method supportsParent()
which lets a user know whether the parent relationship is supported
or not.

The XPath engine is capable of navigating a tree and generating a
result set which supports the parent relationship even if the
originating document is full of flyweight objects.

The aforementioned patch avoids using the flyweight Text node by
default, so the getParent() method should now work fine on Text nodes.
</longAnswer>

Hope that all helps - even if it's a little verbose ;-)

James

_______________________________________________
dom4j-dev mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/lists/listinfo/dom4j-dev
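
A footnote on the List examples earlier in this message: since content()
is described there as a backed List driven through the standard
java.util.List API, the index behaviour can be tried out without dom4j
at all. What follows is only a rough sketch under that assumption. It
uses a plain ArrayList of Strings as a stand-in for a branch's node
list; the class name ListSemanticsSketch, its helper methods and the
node labels are all made up for illustration and are not part of dom4j.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in: Strings play the role of dom4j nodes here.
class ListSemanticsSketch {

    // Inserts 'node' at zero-based 'index'; later entries shift right.
    static List<String> insertAt(List<String> content, int index, String node) {
        content.add(index, node);
        return content;
    }

    // Moves an entry already in the list to 'index' (remove, then re-add).
    static List<String> moveTo(List<String> content, String node, int index) {
        content.remove(node);
        content.add(index, node);
        return content;
    }

    // Removes entries [from, to) by clearing a subList view of 'content'.
    static List<String> removeRange(List<String> content, int from, int to) {
        content.subList(from, to).clear();
        return content;
    }

    public static void main(String[] args) {
        List<String> content =
            new ArrayList<>(Arrays.asList("n0", "n1", "n2", "n3"));

        // Index 2 is the third slot, so "bar" lands between n1 and n2.
        System.out.println(insertAt(content, 2, "bar"));
        // prints [n0, n1, bar, n2, n3]

        // Append at the end first, then move to the front.
        content.add("baz");
        System.out.println(moveTo(content, "baz", 0));
        // prints [baz, n0, n1, bar, n2, n3]

        // Clearing the view removes n0 and n1 from the parent list too.
        System.out.println(removeRange(content, 1, 3));
        // prints [baz, bar, n2, n3]
    }
}
```

The remove-then-add step in moveTo mirrors the "add with the normal API
then move" approach above; whether a real dom4j content list behaves
identically in every edge case is something to check against the
version you are running.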