digesting xml content with NodeCreateRule swallows spaces.
----------------------------------------------------------

                 Key: DIGESTER-120
                 URL: https://issues.apache.org/jira/browse/DIGESTER-120
             Project: Commons Digester
          Issue Type: Bug
    Affects Versions: 1.8
         Environment: jdk 1.4.2_08, digester 1.8
            Reporter: Nguyen Thanh Son Daniel


I need to process an XML file that contains character entities, e.g.:

<?xml version="1.0" encoding="UTF-8"?>
<top>
<body>&#65; &#65;</body>
</top>

I'm using Digester as follows:

Digester digester = new Digester ();
digester.addRule ("top", new ObjectCreateRule (MyContent.class));
digester.addRule ("top/body", new NodeCreateRule ());
digester.addSetNext ("top/body", "setBody");

Then:
...
digester.parse (file);

The MyContent class transforms the node into text as follows:

public class MyContent
{
 public void setBody (Element node)
 {
  String content = serializeNode (node);
  System.out.println (content);
 }
 ...
}
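
The serializeNode helper is not shown in this report. As a rough sketch only, assuming it uses the standard JAXP identity Transformer (the class name SerializeNodeSketch and the demo method are hypothetical, introduced purely for illustration), it could look something like this:

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical stand-in for the reporter's serializeNode(); not the actual code.
class SerializeNodeSketch {

    // Serializes a DOM node to a string without the <?xml ...?> declaration.
    static String serializeNode(Node node) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds a <body> element holding the given text and serializes it.
    static String demo(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element body = doc.createElement("body");
            body.appendChild(doc.createTextNode(text));
            return serializeNode(body);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("A A"));
    }
}
```

With a helper of this kind, whatever text nodes NodeCreateRule attached to the element are written out unchanged, so any missing space has already been lost while the DOM was built.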

The content displayed in this case is: <body>AA</body>

If the body were encoded in the XML file as <top><body>A A</body></top>, the
content would be displayed correctly as:
<body>A A</body>

Looking at the NodeCreateRule.NodeBuilder.characters() implementation, the
following code causes the problem:
String str = new String(ch, start, length);
if (str.trim().length() > 0) {
 top.appendChild(doc.createTextNode(str));
}

When entities are used, the characters() method is called three times in the
first case: for 'A', ' ' and 'A'. The middle chunk contains only whitespace, so
the trim() check discards it. In the second case, it is called once with 'A A'.
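
To make the effect concrete, here is a minimal sketch that feeds the same three chunks to two SAX handlers by hand, with no Digester involved. PerChunkHandler mirrors the per-call trim() check quoted above. BufferingHandler is a hypothetical alternative, not a proposed patch: it assumes it is acceptable to collect consecutive chunks and apply the whitespace test only at an element boundary. All class and method names here are made up for illustration.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class CharactersChunkDemo {

    // Mirrors the quoted check: each characters() chunk is tested on its own.
    static class PerChunkHandler extends DefaultHandler {
        final StringBuilder kept = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            String str = new String(ch, start, length);
            if (str.trim().length() > 0) {
                kept.append(str);
            }
        }
    }

    // Hypothetical alternative: buffer chunks, test the whole run at a boundary.
    static class BufferingHandler extends DefaultHandler {
        final StringBuilder kept = new StringBuilder();
        private final StringBuilder pending = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            pending.append(ch, start, length);
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) {
            flush();
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            flush();
        }

        // Keeps the buffered run unless it consists of whitespace only.
        private void flush() {
            if (pending.toString().trim().length() > 0) {
                kept.append(pending);
            }
            pending.setLength(0);
        }
    }

    // Simulates a parser delivering the text of <body> in the given chunks.
    static String perChunk(String... chunks) {
        PerChunkHandler h = new PerChunkHandler();
        for (String c : chunks) {
            h.characters(c.toCharArray(), 0, c.length());
        }
        return h.kept.toString();
    }

    static String buffered(String... chunks) {
        BufferingHandler h = new BufferingHandler();
        h.startElement("", "body", "body", null);
        for (String c : chunks) {
            h.characters(c.toCharArray(), 0, c.length());
        }
        h.endElement("", "body", "body");
        return h.kept.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + perChunk("A", " ", "A") + "]"); // [AA]
        System.out.println("[" + buffered("A", " ", "A") + "]"); // [A A]
    }
}
```

The buffering variant still drops runs that are entirely whitespace, such as indentation between elements, which appears to be what the original check was meant to do. How a real parser splits text into characters() calls is up to the parser, so the three-chunk split here is simulated rather than relied upon.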


