Hi Paul,

These default attribute values are specified in the XHTML DTD. When parsing your file, dom4j uses a SAX parser. It is the SAX parser that passes the attribute values to dom4j, including attributes whose values are only defaults declared in the DTD. So this is hard to solve at the dom4j level. I see two solutions/workarounds:
1. dom4j could check whether the SAX parser supports the org.xml.sax.ext.Attributes2 interface, which also indicates whether an attribute was actually specified in the XML file. If the SAX parser supports this interface, dom4j could ignore the defaulted attributes (or mark them and skip them when writing the document out again). The problem with this approach: it's not yet implemented in dom4j, and not many SAX parsers support the Attributes2 interface yet.
2. This is more of a 'hack' solution: you could try to remove the DOCTYPE declaration from your XHTML file before parsing it.
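
To make the two options a bit more concrete, here is a rough sketch using only the JDK's SAX classes (no dom4j involved). The class DefaultAttributeDemo and its methods stripDoctype and attributes are hypothetical names made up for this illustration, the DOCTYPE regular expression is a simplification, and the filtering only works if the parser in use really hands over an Attributes2 instance:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.Attributes2;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class for illustration; not part of dom4j.
final class DefaultAttributeDemo {

    // Simplified DOCTYPE matcher (option 2). Assumption: no '>' or '['
    // appears inside the public/system identifiers.
    private static final Pattern DOCTYPE =
            Pattern.compile("<!DOCTYPE[^>\\[]*(\\[[^\\]]*\\])?\\s*>");

    /** Option 2: drop the DOCTYPE so the parser never sees the DTD defaults. */
    static String stripDoctype(String xml) {
        return DOCTYPE.matcher(xml).replaceFirst("");
    }

    /**
     * Parses the document and returns "element@attribute" for each attribute
     * the parser reports. With specifiedOnly set (option 1), attributes that
     * the parser filled in from the DTD are skipped.
     */
    static List<String> attributes(String xml, boolean specifiedOnly) {
        List<String> seen = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                for (int i = 0; i < atts.getLength(); i++) {
                    // Without Attributes2 we cannot tell, so keep everything.
                    boolean defaulted = atts instanceof Attributes2
                            && !((Attributes2) atts).isSpecified(i);
                    if (!(specifiedOnly && defaulted)) {
                        seen.add(qName + "@" + atts.getQName(i));
                    }
                }
            }
        };
        try {
            XMLReader reader =
                    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return seen;
    }
}
```

With a parser that supports Attributes2, attributes(xml, true) should report only the attributes that were literally written in the file, and attributes(stripDoctype(xml), false) should give the same list. One caveat with option 2: entities that are declared only in the DTD (for example &nbsp; in XHTML) are no longer known to the parser once the DOCTYPE is gone, so a document that uses them will fail to parse.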


regards,
Maarten

Kaiser, Paul wrote:

I'm parsing a text file containing an XHTML document into a Document
object, finding a fragment via XPath, then doing a regular expression
search on the fragment as textual XML. I'm finding that the XML text
rendered from the XMLWriter via asXML() is not consistent with the input
document.


For example, when a table cell tag that came in as <td> is rendered, it
comes out as <td rowspan="1" colspan="1">. Since the regular expressions
are developed while looking at the input XHTML document, having the
fragment rendered differently is inconvenient at the least and wrong in
the worst case.

Are these extra attributes defaults found in the document type? Is there
a way to get a faithful rendering of a document fragment with respect to
the input?

Thanks,
Paul Kaiser



-------------------------------------------------------
This SF.Net email is sponsored by: GNOME Foundation
Hackers Unite! GUADEC: The world's #1 Open Source Desktop Event.
GNOME Users and Developers European Conference, 28-30th June in Norway
http://2004/guadec.org
_______________________________________________
dom4j-user mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/dom4j-user



