I've come across one vexing and one minor issue while trying to debug
some castor XML code. Attached is a patch against 0.9.9 that I believe
fixes both issues. In patchfile order:
1. AnyNode.getStringValue for ELEMENT nodes acts strangely and at odds
with the documentation. In particular, consider the following xml:
<a>some<b>text</b>is here</a>
With the current code, getStringValue() of the node associated with the
<b> block will be "is heretext" (clearly unexpectedly). The current
code constructs a very large buffer, then appends all the getStringValue
of all later siblings, and only then appends the getStringValue of the
children. I cannot imagine ever wanting this behavior, but...
2. I've rewritten the code that merges adjoining TEXT elements for
slightly better efficiency (eliminating redundant checks, sizing the
buffer correctly at the beginning). Also, I used buf.substring(0)
rather than buf.toString() to avoid sharing structure between the
original stringbuffer and the new String. This is likely to be a small
memory win but a small performance loss, but YMMV.
cheers
--
*Michael Thome*
BBN Technologies 10 Moulton St, Cambridge MA 02138 USA
phone: +1 617 873 1853
--- AnyNode.java.orig 2005-10-20 16:39:31.000000000 -0400
+++ AnyNode.java 2005-10-20 16:42:30.000000000 -0400
@@ -486,14 +486,7 @@
return _value;
case ELEMENT:
StringBuffer result = new StringBuffer(4096);
- AnyNode tempNode = this.getNextSibling();
- while (tempNode != null &&
- tempNode.getNodeType() == TEXT)
- {
- result.append(tempNode.getStringValue());
- tempNode = tempNode.getNextSibling();
- }
- tempNode = this.getFirstChild();
+ AnyNode tempNode = this.getFirstChild();
while (tempNode != null) {
result.append(tempNode.getStringValue());
tempNode = tempNode.getNextSibling();
@@ -653,15 +646,25 @@
* @param node1 the AnyNode that receives the text value
* @param node2 the AnyNode that needs to be merges with node1.
*/
- private void mergeTextNode(AnyNode node1, AnyNode node2) {
- if (node1.getNodeType() != node2.getNodeType())
- return;
- if (node1.getNodeType() != AnyNode.TEXT)
- return;
- StringBuffer temp = new StringBuffer(node1.getStringValue());
- temp.append(node2.getStringValue());
- node1._value = temp.toString();
- node2 = null;
+ private void mergeTextNode(AnyNode node1, AnyNode node2) {
+ // these type checks are wasteful, since this is a private method used only
+ // once, immediately after such checks. If you want to retain the checks,
+ // IMHO an exception ought to be thrown rather than NOOP.
+ //if (node1.getNodeType() != node2.getNodeType())
+ // return;
+ //if (node1.getNodeType() != AnyNode.TEXT)
+ // return;
+
+ String v1 = node1.getStringValue();
+ String v2 = node2.getStringValue();
+ // size the buffer to exactly the right size
+ StringBuffer temp = new StringBuffer(v1.length()+v2.length());
+ temp.append(v1);
+ temp.append(v2);
+ // toString() will maintain a reference to the buffer, preventing GC of the buffer
+ // this construct avoids internal structure sharing.
+ node1._value = temp.substring(0);
+ // setting node2=null is useless.
}
}
-------------------------------------------------
If you wish to unsubscribe from this list, please
send an empty message to the following address:
[EMAIL PROTECTED]
-------------------------------------------------