I use dom4j for almost all my XML work because its API is so intuitive and I can build apps faster with it. However, I do not like the way dom4j uses reference equality for objects, which is done for "performance reasons" according to the FAQ. I'll make a case for this to be changed.
There is already an operator defined for reference equality: ==. There is clearly another kind of equality between XML fragments, based on their value, and I think the purpose of the equals() method is to let that kind of equality be evaluated. Although the performance cost of calling equals() on a large branch node is high, I believe this is the user's choice; calling equals() on an attribute, for example, is not expensive, and yet useful.
The solution described in the FAQ (use a NodeComparator) only works when the comparison is explicit. The problem I have is that I like to use the Collections APIs, and they rely heavily on equals() being defined. On top of that, I'm working with multiple XML docs with the same tags in them, so I cannot rely on reference equality as a "cheap" substitute, as you might in a single document with unique nodes.
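To make the Collections problem concrete, here is a minimal sketch in plain Java. It does not use dom4j at all: IdentityAttr is a hypothetical stand-in for a node class that inherits Object's identity-based equals(), and AttrKey is a hypothetical wrapper I would otherwise have to write by hand to get value-based lookups.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

final class EqualsDemo {
    // Looks up an attribute from "document B" in a list built from "document A".
    static boolean foundByIdentity() {
        List<IdentityAttr> fromDocA = new ArrayList<>();
        fromDocA.add(new IdentityAttr("id", "42"));
        // Same name and value, but a different object: contains() calls equals().
        return fromDocA.contains(new IdentityAttr("id", "42"));
    }

    // The same lookup, with each attribute wrapped in a value-based key.
    static boolean foundByValue() {
        List<AttrKey> fromDocA = new ArrayList<>();
        fromDocA.add(new AttrKey(new IdentityAttr("id", "42")));
        return fromDocA.contains(new AttrKey(new IdentityAttr("id", "42")));
    }

    public static void main(String[] args) {
        System.out.println(foundByIdentity()); // false
        System.out.println(foundByValue());    // true
    }
}

// Hypothetical stand-in for an XML attribute. It does not override equals(),
// so it is compared by reference, as described above for dom4j nodes.
final class IdentityAttr {
    final String name;
    final String value;

    IdentityAttr(String name, String value) {
        this.name = name;
        this.value = value;
    }
}

// Hypothetical wrapper that supplies value-based equals() and hashCode().
final class AttrKey {
    final IdentityAttr attr;

    AttrKey(IdentityAttr attr) {
        this.attr = attr;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof AttrKey)) return false;
        AttrKey other = (AttrKey) o;
        return attr.name.equals(other.attr.name)
            && attr.value.equals(other.attr.value);
    }

    @Override
    public int hashCode() {
        return Objects.hash(attr.name, attr.value);
    }
}
```

Wrapping every node like this works, but it is exactly the boilerplate a value-based equals() would make unnecessary.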
Perhaps there could be some global or document-wide switch specified at creation time that controlled the equals() method? Such a switch would have minimal performance cost, as branches that are usually taken one way hardly slow down modern CPUs.
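Roughly, I imagine something like the sketch below. Everything in it is hypothetical: SketchAttr and its valueEquality flag are names I made up for illustration, not anything that exists in dom4j, and a real version would presumably take the flag from the document or factory that created the node.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class SwitchDemo {
    public static void main(String[] args) {
        // Two attributes with the same content, as if read from two documents.
        Set<SketchAttr> byValue = new HashSet<>();
        byValue.add(new SketchAttr("id", "42", true));
        byValue.add(new SketchAttr("id", "42", true));
        System.out.println(byValue.size()); // 1

        Set<SketchAttr> byReference = new HashSet<>();
        byReference.add(new SketchAttr("id", "42", false));
        byReference.add(new SketchAttr("id", "42", false));
        System.out.println(byReference.size()); // 2
    }
}

// Hypothetical node class whose equality mode is fixed at creation time.
final class SketchAttr {
    private final String name;
    private final String value;
    private final boolean valueEquality; // the proposed switch

    SketchAttr(String name, String value, boolean valueEquality) {
        this.name = name;
        this.value = value;
        this.valueEquality = valueEquality;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;            // cheap path, always valid
        if (!valueEquality) return false;      // switch off: reference equality only
        if (!(o instanceof SketchAttr)) return false;
        SketchAttr other = (SketchAttr) o;
        // Require the switch on both sides so equals() stays symmetric.
        return other.valueEquality
            && name.equals(other.name)
            && value.equals(other.value);
    }

    @Override
    public int hashCode() {
        // Must agree with equals() in both modes.
        return valueEquality
            ? Objects.hash(name, value)
            : System.identityHashCode(this);
    }
}
```

With the switch off, the only extra work compared with plain reference equality is one boolean test per call.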
Regards,
Ben Hutchison