[
https://issues.apache.org/jira/browse/DIGESTER-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579049#action_12579049
]
Simon Kitching commented on DIGESTER-120:
-----------------------------------------
Ok, using entities I was able to duplicate this pretty quickly. That also makes
sense; the parser has the entity already cached as a string, so of course it
makes a separate call to the characters method for it.
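For anyone who wants to see how their own parser chunks the text, here is a
minimal sketch that records every characters() call. The class name
CharactersChunks and the use of &#32; as the entity are assumptions made for
illustration, and the code needs Java 7 or later. SAX allows a parser to split
text however it likes, so the number of calls reported may differ between
parsers.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Digester.
class CharactersChunks {

    // Returns the text passed to each characters() callback, in call order.
    static List<String> chunksOf(String xml) {
        final List<String> chunks = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                chunks.add(new String(ch, start, length));
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return chunks;
    }

    public static void main(String[] args) {
        // &#32; is a character reference for a space, used here as an
        // assumed stand-in for the entity in the reported document.
        List<String> chunks = chunksOf("<top><body>A&#32;A</body></top>");
        System.out.println(chunks.size() + " call(s): " + chunks);
    }
}
```

Whatever the chunking turns out to be, joining the chunks should always give
back the full text "A A"; only the boundaries between calls vary.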
I have committed a patch to the trunk, and deployed a new 1.8.1-SNAPSHOT
version to the apache maven snapshot repository. Could you please try it out
and confirm it fixes the problem for you?
Thanks, Simon
> digesting xml content with NodeCreateRule swallows spaces.
> ----------------------------------------------------------
>
> Key: DIGESTER-120
> URL: https://issues.apache.org/jira/browse/DIGESTER-120
> Project: Commons Digester
> Issue Type: Bug
> Affects Versions: 1.8
> Environment: jdk 1.4.2_08, digester 1.8
> Reporter: Nguyen Thanh Son Daniel
> Attachments: digester-patch.txt, simple.xml
>
>
> I need to process an XML file that contains entities, e.g.:
> <?xml version="1.0" encoding="UTF-8"?>
> <top>
> <body>A A</body>
> </top>
> I'm using Digester as follows:
> Digester digester = new Digester ();
> digester.addRule ("top", new ObjectCreateRule (MyContent.class));
> digester.addRule ("top/body", new NodeCreateRule ());
> digester.addSetNext ("top/body", "setBody");
> then
> ...
> digester.parse (file);
> MyContent class transforms the node into text as follows:
> public class MyContent
> {
>     public void setBody (Element node)
>     {
>         String content = serializeNode (node);
>         System.out.println (content);
>     }
>     ...
> }
> The content displayed in this case is: <body>AA</body>
> If the body were encoded in the XML file as <top><body>A A</body></top>, the
> content would be correctly displayed as:
> <body>A A</body>
> Looking at the NodeCreateRule.NodeBuilder.characters () implementation, the
> following code causes the problem:
> String str = new String(ch, start, length);
> if (str.trim().length() > 0) {
>     top.appendChild(doc.createTextNode(str));
> }
> When entities are used, the characters () method is called separately for
> 'A', ' ' and 'A' in the first case, so the lone ' ' trims to an empty string
> and is dropped. In the second case, it is called once with 'A A'.
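The three-call case described above can be reproduced without a parser. The
following is a minimal sketch, assuming a hypothetical class TextChunkDemo
that stands in for the builder's text handling. perChunkTrim mirrors the
quoted check. bufferThenCheck is one possible alternative, which joins
adjacent chunks before testing for blank text; it is not necessarily what the
committed patch does.

```java
// Hypothetical stand-in for the builder's text handling; not Digester code.
class TextChunkDemo {

    // Mirrors the quoted check: a chunk is kept only if it is non-blank on its own.
    static String perChunkTrim(String[] chunks) {
        StringBuilder out = new StringBuilder();
        for (String chunk : chunks) {
            if (chunk.trim().length() > 0) {
                out.append(chunk);
            }
        }
        return out.toString();
    }

    // Alternative: concatenate adjacent chunks first, then apply the blank check once.
    static String bufferThenCheck(String[] chunks) {
        StringBuilder buffer = new StringBuilder();
        for (String chunk : chunks) {
            buffer.append(chunk);
        }
        String text = buffer.toString();
        return text.trim().length() > 0 ? text : "";
    }

    public static void main(String[] args) {
        String[] split = {"A", " ", "A"}; // three characters() calls
        String[] whole = {"A A"};         // one characters() call

        System.out.println("[" + perChunkTrim(split) + "]");    // prints [AA]
        System.out.println("[" + perChunkTrim(whole) + "]");    // prints [A A]
        System.out.println("[" + bufferThenCheck(split) + "]"); // prints [A A]
    }
}
```

Buffering still discards text that is blank as a whole, such as the
indentation between elements, while keeping a space that sits between two
non-blank chunks.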