After some odd data corruption errors in my app, I tracked the problem down
to dom4j and reduced it to a short test case that is reproducible with
1.6.1. Once an element has enough attributes whose values contain an
escaped > or < (&gt; or &lt;), dom4j starts to mess up and duplicates
parts of the data from one attribute into the value of another. This input:
<?xml version="1.0" encoding="UTF-8"?>
<Test test1="11111 &lt;" test2="22222 &lt;" test3="33333 &lt;"
      test4="44444 &lt;" test5="55555 &lt;" test6="66666 &lt;"
      test7="77777 &lt;" test8="88888 &lt;" test9="99999 &lt;"
      test10="101010101010 &lt;" />
becomes:
<?xml version="1.0" encoding="UTF-8"?>
<Test test1="11111 &lt;" test2="22222 &lt;" test3="33333 &lt;"
      test4="44444 &lt;" test5="55555 &lt;" test6="66666 &lt;"
      test7="88888 &lt;" test8="88888 &lt;" test9="1010101"
      test10="101010101010 &lt;"/>
when passed through dom4j. Notice that test7 and test9 are corrupted with
part of the values of test8 and test10 respectively. I've never dug
into the dom4j code before; could someone give me a tip on where to
look to troubleshoot this very weird and destructive error?
Test program:
import org.dom4j.*;

public class ParseTest
{
    // The attribute values must keep the escaped form (&lt;); a raw '<'
    // inside an attribute value is not well-formed XML.
    public static String xmlString =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<Test test1=\"11111 &lt;\" test2=\"22222 &lt;\" "
        + "test3=\"33333 &lt;\" test4=\"44444 &lt;\" test5=\"55555 &lt;\" "
        + "test6=\"66666 &lt;\" test7=\"77777 &lt;\" test8=\"88888 &lt;\" "
        + "test9=\"99999 &lt;\" test10=\"101010101010 &lt;\" />";

    public static void main(String[] args) throws Exception
    {
        // Print the input, then the document as dom4j re-serializes it.
        System.out.println(xmlString);
        Document xml = DocumentHelper.parseText(xmlString);
        System.out.println(xml.asXML());
    }
}
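
One way I could narrow this down is to run the same string through the JDK's built-in SAX parser with no dom4j involved and print each attribute value as the parser reports it. This is only a rough sketch: the class name SaxAttributeCheck and the helper parseAttributes are made up for this check, and it assumes the attribute values use the escaped form &lt;. If the values are already wrong here, the problem would seem to sit in the parser underneath rather than in dom4j itself; if they are right, that points back at dom4j or at whichever parser it picks up on my classpath.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of dom4j.
class SaxAttributeCheck
{
    // Same document as in ParseTest, with the entities kept escaped.
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<Test test1=\"11111 &lt;\" test2=\"22222 &lt;\" "
        + "test3=\"33333 &lt;\" test4=\"44444 &lt;\" test5=\"55555 &lt;\" "
        + "test6=\"66666 &lt;\" test7=\"77777 &lt;\" test8=\"88888 &lt;\" "
        + "test9=\"99999 &lt;\" test10=\"101010101010 &lt;\" />";

    // Collects attribute name -> value pairs exactly as SAX reports them.
    static Map<String, String> parseAttributes(String xml) throws Exception
    {
        final Map<String, String> values = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler()
        {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts)
            {
                for (int i = 0; i < atts.getLength(); i++)
                {
                    values.put(atts.getQName(i), atts.getValue(i));
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return values;
    }

    public static void main(String[] args) throws Exception
    {
        // Print one attribute per line so a corrupted value is easy to spot.
        for (Map.Entry<String, String> e : parseAttributes(XML).entrySet())
        {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Comparing this listing with what Element.attributeValue returns for the same names in dom4j should show at which layer the values first go wrong.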
_______________________________________________
dom4j-dev mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dom4j-dev