[ http://issues.apache.org/jira/browse/XMLBEANS-274?page=all ]

Peter Rodgers updated XMLBEANS-274:
-----------------------------------

        Summary: Over zealous whitespace cropping after parsing entity like &amp;
  (was: Over zelous whitespace cropping after parsing entity like &amp;)
    Description: 
When whitespace stripping is enabled, the parser does not detect XML entities 
such as &amp; and strips the whitespace following each entity.

For example

<root>dog &amp; cat</root>

is parsed as

<root>dog &amp;cat</root>

The cause of the problem is the stripLeft() method in the 
org.apache.xmlbeans.impl.store.CharUtil class.

Below is a fixed version of the method. It checks for the ';' character that 
ends an entity, which indicates that the whitespace after it is significant 
and must be preserved. Note that this code does not fix the third branch, 
where the characters are iterated in a for loop.

public Object stripLeft ( Object src, int off, int cch )
    {
        assert isValid( src, off, cch );

        if (cch > 0)
        {
            if (src instanceof char[])
            {
                char[] chars = (char[]) src;

                // Fix for &amp; etc
                while ( cch > 0 && isWhiteSpace( chars[ off ] ) && chars[off - 1]!=';' )
                    { cch--; off++; }
            }
            else if (src instanceof String)
            {
                String s = (String) src;

                // Fix for &amp; etc
                while ( cch > 0 && isWhiteSpace( s.charAt( off ) ) && s.charAt(off - 1)!=';' )
                    { cch--; off++; }
            }
            else
            {
                int count = 0;
                
                for ( _charIter.init( src, off, cch ) ; _charIter.hasNext() ; count++ )
                    if (!isWhiteSpace( _charIter.next() ))
                        break;
                
                _charIter.release();

                off += count;
            }
        }

        if (cch == 0)
        {
            _offSrc = 0;
            _cchSrc = 0;
            
            return null;
        }

        _offSrc = off;
        _cchSrc = cch;

        return src;
    }
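
One caveat with the patched loops: they read chars[off - 1] and 
s.charAt(off - 1), which throws when off is 0. The following is a minimal, 
self-contained sketch of the same check for the String case with that boundary 
guarded. It is not the XMLBeans code: the class StripLeftSketch, the method 
stripLeftOffset and the local isWhiteSpace helper are hypothetical names used 
only for illustration. The ';' test is done once up front, because after the 
first stripped character the preceding character is always whitespace.

```java
public class StripLeftSketch {

    /** Hypothetical stand-in for CharUtil.isWhiteSpace: the four XML whitespace characters. */
    public static boolean isWhiteSpace(char ch) {
        return ch == ' ' || ch == '\t' || ch == '\n' || ch == '\r';
    }

    /**
     * Returns the offset of the first character that survives left-stripping
     * of the range s[off, off + cch). Leading whitespace is kept when the
     * character immediately before the range is ';'.
     */
    public static int stripLeftOffset(String s, int off, int cch) {
        // Guard: at off == 0 there is no preceding character to inspect.
        if (off > 0 && s.charAt(off - 1) == ';') {
            return off;
        }
        while (cch > 0 && isWhiteSpace(s.charAt(off))) {
            cch--;
            off++;
        }
        return off;
    }

    public static void main(String[] args) {
        String text = "dog &amp; cat";
        int start = text.indexOf(';') + 1;   // range begins right after the entity
        int kept = stripLeftOffset(text, start, text.length() - start);
        System.out.println("[" + text.substring(kept) + "]");   // prints "[ cat]"

        // A range starting at offset 0 is stripped without reading index -1.
        String lead = "  dog";
        System.out.println("[" + lead.substring(stripLeftOffset(lead, 0, lead.length())) + "]");   // prints "[dog]"
    }
}
```

The same guard (off > 0) would apply to the char[] branch as well.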

  was:
When white space stripping is specified the parser does not detect XML entities 
such as &amp; and strips the whitespace following each entity.

For example

<root>dog &amp; cat</root>

is parsed as

<root>doc &amp;cat</root>

The cause of the problem is the stripLeft() method in the 
org.apache.xmlbeans.impl.store.CharUtil

Below is a fixed version of the method that detects the ';' character after an 
entity which indicates that whitespace is significant and must be preserved.  
Note this code does not fix the case where the iteration is a for loop.

public Object stripLeft ( Object src, int off, int cch )
    {
        assert isValid( src, off, cch );

        if (cch > 0)
        {
            if (src instanceof char[])
            {
                char[] chars = (char[]) src;

                // Fix for &amp; etc
                while ( cch > 0 && isWhiteSpace( chars[ off ] ) && chars[off - 1]!=';' )
                    { cch--; off++; }
            }
            else if (src instanceof String)
            {
                String s = (String) src;

                // Fix for &amp; etc
                while ( cch > 0 && isWhiteSpace( s.charAt( off ) ) && s.charAt(off - 1)!=';' )
                    { cch--; off++; }
            }
            else
            {
                int count = 0;
                
                for ( _charIter.init( src, off, cch ) ; _charIter.hasNext() ; count++ )
                    if (!isWhiteSpace( _charIter.next() ))
                        break;
                
                _charIter.release();

                off += count;
            }
        }

        if (cch == 0)
        {
            _offSrc = 0;
            _cchSrc = 0;
            
            return null;
        }

        _offSrc = off;
        _cchSrc = cch;

        return src;
    }


> Over zealous whitespace cropping after parsing entity like &amp;
> ----------------------------------------------------------------
>
>          Key: XMLBEANS-274
>          URL: http://issues.apache.org/jira/browse/XMLBEANS-274
>      Project: XMLBeans
>         Type: Bug

>     Versions: Version 2.1
>  Environment: All
>     Reporter: Peter Rodgers


-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

