Noah, I think what you are trying to do is derive a canonical form of each XML document for comparison. This turns out to be more involved than simply addressing ignorable whitespace. A canonicalization scheme also has to address the handling of empty tags (e.g. <MyTag></MyTag> versus <MyTag/>), attribute ordering, character encoding, comment preservation, the XML declaration, and so on. As you mentioned, this is a very useful concept, especially when dealing with XML digital signatures.
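
To illustrate why whitespace is only one part of the comparison problem, here is a minimal sketch that uses only the JDK's built-in DOM API (not dom4j, and not the Canonical XML algorithm itself). The class name XmlEquivalenceSketch and the method sameElementTree are made up for this example. It relies on the DOM Level 3 method Node.isEqualNode, which compares element trees structurally, so empty-tag syntax and attribute order do not matter, while whitespace-only text nodes still do.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper for illustration only; not part of dom4j or XML Security.
class XmlEquivalenceSketch {

    // Parse an XML string into a namespace-aware DOM document.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            doc.normalize(); // merge adjacent text nodes before comparing
            return doc;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML: " + xml, e);
        }
    }

    // Structural comparison of the two root element trees.
    // Attribute order and <x></x> versus <x/> are ignored by the DOM model;
    // whitespace-only text nodes are NOT ignored.
    static boolean sameElementTree(String xmlA, String xmlB) {
        return parse(xmlA).getDocumentElement()
                .isEqualNode(parse(xmlB).getDocumentElement());
    }

    public static void main(String[] args) {
        // Differs only in empty-tag syntax and attribute order.
        System.out.println(sameElementTree(
                "<MyTag a=\"1\" b=\"2\"></MyTag>", "<MyTag b=\"2\" a=\"1\"/>"));
        // Differs only in whitespace between elements.
        System.out.println(sameElementTree("<a> <b/> </a>", "<a><b/></a>"));
    }
}
```

In this sketch the first pair compares as equal and the second does not, so a whitespace-stripping pass covers only one of the differences listed above. A byte-for-byte comparison, as needed for digital signatures, requires a real canonicalizer.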
W3C has developed a Recommendation called Canonical XML for this purpose (see http://www.w3.org/TR/xml-c14n). The XML Security project (http://xml.apache.org/security/) has an XML Canonicalizer class that implements this W3C Recommendation.

-Mark

-----Original Message-----
From: Noah Davis [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 18, 2006 6:50 PM
To: Edelson, Justin
Cc: dom4j-user@lists.sourceforge.net
Subject: Re: [dom4j-user] Ignorable white space and confusion

So I've ended up writing a little piece of code to remove whitespace text nodes:

    import org.dom4j.Branch;
    import org.dom4j.Element;
    import org.dom4j.Node;

    public static void removeWhitespaceNodes(Branch a_branch) {
        // Walk backwards: detach() removes the node from a_branch, and a
        // forward loop would then skip the node that slides into its slot.
        for (int i = a_branch.nodeCount() - 1; i >= 0; i--) {
            Node checkNode = a_branch.node(i);
            if (checkNode.getNodeType() == Node.TEXT_NODE) {
                // Drop text nodes that contain nothing but whitespace.
                if (checkNode.getText().trim().equals("")) {
                    checkNode.detach();
                }
            } else if (checkNode.getNodeType() == Node.ELEMENT_NODE) {
                // Recurse into child elements.
                removeWhitespaceNodes((Element) checkNode);
            }
        }
    }

_______________________________________________
dom4j-user mailing list
dom4j-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dom4j-user