Maybe not exactly what you need but how about:
Element.getText()
"Returns the text value of this element without recursing through child elements. This method iterates through all Text, CDATA and Entity nodes that this element contains and appends the text values together."
That won't work because it does not recurse into the element's descendants.
or ...
Element.getStringValue()
"Returns the XPath string-value of this node. The behaviour of this method is defined in the XPath specification. This method returns the string-value of all the contained Text, CDATA, Entity and Element nodes all appended together."
This does work, but it concatenates the strings with no separator. For example, if we have:
<h1>The Title</h1> <p>Paragraph 1.</p> <p>Paragraph 2.</p>
You get:
The TitleParagraph 1.Paragraph 2.
This, of course, throws off results for a search query and makes the summary text in search results look a little odd. I wonder if this method would be more useful in general if spaces (or some other separator) were inserted between text nodes, although then it would no longer be an exact representation of the text. I can't see a use for it any other way. What do you think? How are people using Element.getStringValue()?
I did finally try this after posting, but ended up looping through the list of Text nodes so that I could insert a space between them.
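
For anyone reading along, here is a rough sketch of the "loop through the text nodes and insert a space" idea. To keep it runnable without extra jars it uses the JDK's built-in org.w3c.dom API rather than dom4j, so it is not the code from this thread; the class name SpacedText and its methods are made up for illustration. The same walk should carry over to dom4j by iterating Element.content() and checking each node's type.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of dom4j or of the original post.
public class SpacedText {

    // Returns the text of all Text/CDATA descendants, joined by single spaces.
    public static String spacedText(Node node) {
        StringBuilder out = new StringBuilder();
        collect(node, out);
        return out.toString();
    }

    // Depth-first walk: append text nodes, recurse into everything else.
    private static void collect(Node node, StringBuilder out) {
        short type = node.getNodeType();
        if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) {
                if (out.length() > 0) {
                    out.append(' ');
                }
                out.append(text);
            }
            return;
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), out);
        }
    }

    // Parses an XML string with the default JDK parser.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<body><h1>The Title</h1><p>Paragraph 1.</p><p>Paragraph 2.</p></body>";
        Document doc = parse(xml);
        // Plain concatenation, comparable to the string-value behaviour above:
        System.out.println(doc.getDocumentElement().getTextContent());
        // prints: The TitleParagraph 1.Paragraph 2.
        // With a space inserted between text nodes:
        System.out.println(spacedText(doc.getDocumentElement()));
        // prints: The Title Paragraph 1. Paragraph 2.
    }
}
```

One caveat with this approach: a space is also inserted around inline markup, so something like <p>bo<b>ld</b></p> comes out as "bo ld". That is probably fine for search indexing and summaries, but it is not an exact representation of the text.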
It works great so far. It is much easier than SAX for weeding out unnecessary content nodes. In addition, you can create special search fields on your Lucene documents so that a search can target specific content (for example, searching only the glossary entries in your content).
Thanks, -Rob
Regards, Edwin
_______________________________________________
dom4j-user mailing list
dom4j-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dom4j-user