[dom4j-dev] Re: [dom4j-user] Accessing Child/Parent Nodes?

James Strachan Wed, 23 May 2001 04:58:29 -0700
Hi Ken

From: "Ken Sheppardson" <[EMAIL PROTECTED]>
> Hi folks,
>
> I've just started looking at dom4j and have a couple questions I
> hope someone can help me with...
>
> I'm trying to use the package to manipulate HTML/XHTML-Transitional
> documents that have mixed data elements, e.g.
>
>    <body> Before <div>Inside</div> After</body>
>
> I'd like to be able to swap out "Before" for some new text or element,
> swap "Before" and "After", and otherwise make a general mess of
> the document programmatically.
>
> Now when I parse this document in dom4j I end up with an Element
> (body) whose text is "Before After" and another element (div)
> whose text is "Inside".
>
> In fact, the "body" Element seems to have three child nodes:
>
>    Text    (NodeType 3)   : "Before"
>    Element (NodeType 1)   : "Inside"
>    Text    (NodeType 3)   : "After"
>
>
> My questions:
>
> (1) What's the simplest way to grab a list of the children
>     of a given node? Build an XPath expression and use
>     selectNode? Cast it as a Branch and use node() and/or
>     nodeIterator()?

Before I go into further detail: the Text interface has a setText(String)
method that lets you change the text.  So to change just the text you
could do

    Text before = (Text) element.node(0);
    Text after = (Text) element.node(2);
    before.setText( "foo" );
    after.setText( "bar" );

I've just checked in a patch (*) to make the above code work ;-)

Until now the default implementation of Text was the shareable but
immutable flyweight implementation, which I think was a mistake. I've now
made the mutable, non-flyweight Text implementation the default, so the
above code should work if you download a new daily snapshot or take the
latest CVS. (Plus a new release will happen before JavaOne).

Now a brief overview of mutating Element or Document content...


Both the Document and Element interfaces extend the Branch interface. On
either of them you can get a List of the branch's contents that is backed by
the branch, so changes made through the standard List API are applied to the
branch itself. e.g.

    List list = element.content();

    // let's clear the list
    list.clear();

    // or let's remove a single item (the one at index 3)
    list.remove(3);

    // or remove a sublist (indexes 2 to 6 inclusive)
    list.subList(2,7).clear();

So you can use the List API to add and remove nodes at specific points and
so on.
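
To show what "backed" means in practice, here is a minimal sketch that uses
only java.util, with plain strings standing in for dom4j nodes (the class
and method names are made up for illustration, this is not dom4j code). It
relies on the fact that List.subList() returns a view backed by the original
list, which is the same idea as the content list being backed by the branch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BackedListSketch {
    // Hypothetical helper: strings stand in for dom4j nodes.
    static List<String> removeRange(List<String> content, int from, int to) {
        // subList() returns a view backed by 'content', so clearing the
        // view removes the same range from 'content' itself.
        content.subList(from, to).clear();
        return content;
    }

    public static void main(String[] args) {
        List<String> content = new ArrayList<>(
            Arrays.asList("n0", "n1", "n2", "n3", "n4", "n5", "n6", "n7"));
        removeRange(content, 2, 7);
        System.out.println(content); // prints [n0, n1, n7]
    }
}
```

Note that the upper bound of subList() is exclusive, so subList(2,7) covers
indexes 2 to 6.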

The added complication with adding brand new nodes at specific points in the
list (rather than at the end) is the use of factories. In dom4j we've tried
to keep everything interface based and hide the implementation details. So
we recommend the use of DocumentFactory when creating new content nodes.
This all happens under the covers when you use the Element API for adding
content. e.g.

    Element foo = ..;
    Element bar = foo.addElement( "bar" );

This will create a new Element implementation and add it to the end of the
content node list. Under the covers, some special schema-aware BarElement
implementation could have been created. Or the DefaultElement class could be
used. Or a persistent element, a lazy-fetch or indexed element
implementation, or whatever.
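
The factory indirection described above can be sketched roughly as follows.
This is a simplified stand-in, not the dom4j source: the Element,
BarElement, ElementFactory, DefaultFactory and SchemaAwareFactory types
below are all hypothetical and only show how addElement() can hand back
different implementations without the caller knowing.

```java
import java.util.ArrayList;
import java.util.List;

class FactorySketch {
    // Hypothetical factory interface, standing in for a document factory.
    interface ElementFactory {
        Element create(String name);
    }

    // Hypothetical element; it remembers the factory that created it.
    static class Element {
        final String name;
        final List<Element> children = new ArrayList<>();
        private final ElementFactory factory;

        Element(String name, ElementFactory factory) {
            this.name = name;
            this.factory = factory;
        }

        // The caller never names a concrete class; the factory decides.
        Element addElement(String childName) {
            Element child = factory.create(childName);
            children.add(child); // appended at the end of the content
            return child;
        }
    }

    // Hypothetical specialised implementation for <bar> elements.
    static class BarElement extends Element {
        BarElement(String name, ElementFactory factory) {
            super(name, factory);
        }
    }

    static class DefaultFactory implements ElementFactory {
        public Element create(String name) {
            return new Element(name, this);
        }
    }

    static class SchemaAwareFactory implements ElementFactory {
        public Element create(String name) {
            return "bar".equals(name)
                ? new BarElement(name, this)
                : new Element(name, this);
        }
    }

    public static void main(String[] args) {
        Element foo = new SchemaAwareFactory().create("foo");
        Element bar = foo.addElement("bar");
        System.out.println(bar instanceof BarElement); // prints true
    }
}
```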

However if you wanted to add some content at an earlier point in the list
you could do this instead...

    Element foo = ..;
    Element bar = DocumentHelper.createElement( "bar" );
    List list = foo.content();
    list.add( 2, bar );

which would insert the bar node at index 2, i.e. as the third item in the
node content list (List indexes are zero based). The bar Element would be
created using the default singleton DocumentFactory instance. (BTW this can
be configured by the org.dom4j.factory system property).

An alternative approach is to add the content using the normal API and then
move it around. e.g.

    Element foo = ...;
    Element bar = foo.addElement( "bar" );
    List list = foo.content();
    list.remove( bar );
    list.add( 2, bar );
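
The remove-then-add move can be sketched with plain java.util lists; strings
stand in for nodes here, and the moveTo() helper is made up for
illustration, it is not part of dom4j.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MoveNodeSketch {
    // Hypothetical helper: take 'node' out of the list, then re-insert it
    // at the given zero-based index. Does nothing if 'node' is absent.
    static <T> void moveTo(List<T> content, T node, int index) {
        if (content.remove(node)) {
            content.add(index, node);
        }
    }

    public static void main(String[] args) {
        List<String> content = new ArrayList<>(
            Arrays.asList("Before", "div", "After"));
        content.add("bar");          // appended at the end, like addElement
        moveTo(content, "bar", 2);   // index 2 is the third position
        System.out.println(content); // prints [Before, div, bar, After]
    }
}
```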



> (2) How do I get the parent node of a Text Node? It appears as
>     though Text Nodes don't implement getParent(). Am I missing
>     something?

The Text, Element, Attribute, Document and all the other node interfaces
extend the Node interface, which has a getParent() method.

The core "Interface Hierarchy" is here in the javadoc (below the "Class
Hierarchy")

http://dom4j.org/apidocs/org/dom4j/package-tree.html

A picture or class diagram of this hierarchy would be nice one day... ;-)

So all Nodes have the getParent() method.

The quick answer is, if you use the code with the new patch (*) I mention
above, the getParent() will work for you on all Text nodes.

The longer answer is...

<longAnswer>
The flyweight pattern lets nodes (or fragments) be shared across documents
for performance reasons (e.g. enumeration attributes
align="CENTER|RIGHT|LEFT" could share 3 flyweight instances across all
documents). A shared node cannot belong to a single parent, so to support
this I made getParent() an optional method - an implementation may not
support it. There's a method supportsParent() which lets a user know whether
parent is supported or not.

The XPath engine is capable of navigating a tree and generating a result set
which supports the parent relationship even if the originating document is
full of flyweight objects.

The aforementioned patch avoids using the flyweight Text node by default, so
the getParent() method should now work fine on Text nodes.
</longAnswer>
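
For anyone curious, the trade-off in the long answer looks roughly like this
in code. It is a simplified sketch with made-up types (TextNode, SharedText,
OwnedText), not the dom4j implementation; only the getParent() and
supportsParent() method names are taken from the text above.

```java
import java.util.HashMap;
import java.util.Map;

class FlyweightSketch {
    // Hypothetical node interface with an optional parent relationship.
    interface TextNode {
        String getText();
        boolean supportsParent();
        Object getParent();
    }

    // Immutable and parent-less, so one instance can be shared everywhere.
    static final class SharedText implements TextNode {
        private final String text;
        SharedText(String text) { this.text = text; }
        public String getText() { return text; }
        public boolean supportsParent() { return false; }
        public Object getParent() { return null; }
    }

    // Mutable and owned by exactly one parent, so it cannot be shared.
    static final class OwnedText implements TextNode {
        private String text;
        private final Object parent;
        OwnedText(String text, Object parent) {
            this.text = text;
            this.parent = parent;
        }
        public String getText() { return text; }
        public void setText(String text) { this.text = text; }
        public boolean supportsParent() { return true; }
        public Object getParent() { return parent; }
    }

    private static final Map<String, SharedText> CACHE = new HashMap<>();

    // Returns the same SharedText instance for equal strings.
    static SharedText shared(String text) {
        return CACHE.computeIfAbsent(text, SharedText::new);
    }

    public static void main(String[] args) {
        System.out.println(shared("CENTER") == shared("CENTER")); // prints true
        System.out.println(shared("CENTER").supportsParent());    // prints false
    }
}
```

Callers that need the parent would check supportsParent() first rather than
assume getParent() returns something useful.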


Hope that all helps - even if it's a little verbose ;-)

James





