Hello everybody,

I found a bug while playing with the XMLPullParser:

When the input contains an XML character reference encoded in hexadecimal, the parser throws an exception. For example, &#xA; is the hexadecimal form of &#10;, which is the line feed. If you try to parse '<tag contents="one&#xA;two" />', you get an "Expected Number" error.
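
As a quick sanity check that the two references denote the same character, here is a minimal Playground sketch. It uses only Smalltalk radix literals and Character class methods, not the parser itself, and the variable names are purely illustrative:

```smalltalk
"Illustrative only: compares the characters the two references should decode to."
| fromHex fromDecimal |
fromHex := Character value: 16rA.      "what &#xA; should decode to"
fromDecimal := Character value: 10.    "what &#10; decodes to"
fromHex = fromDecimal.                 "true"
fromHex = Character lf.                "true"
```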


I investigated the problem and found that it is caused by the method XMLPullParser>>readNumberBase:

If the base of the parsed number is not 10, it executes the following lines:
numberString := self nextUpTo: $;.
self stream skip: -1.
[...]

However, the method nextUpTo: does not exist. An MNU (MessageNotUnderstood) is therefore raised and then handled in nextCharReference: by a greedy
"on: Error do: [:ex | self errorExpected: 'Number.']."
which turns it into the "Expected Number" error.

The solution is either to implement nextUpTo: or to replace the call with one of the two existing methods, nextUpToAll: or nextTrimmedBlanksUpTo:
Example: numberString := self nextUpToAll: ';'.

Also, the following line calls the accessor "stream", which does not exist either and causes the same problem.
The solution here is to change
self stream skip: -1.
into
stream skip: -1.

The following two test cases show the problem. The first one shows the correct behavior:

XMLPullParserTest>>testASCIIEntity
    | parser |

    parser := XMLPullParser parse: '
        <tag contents="one&#10;two" />'.

    self assert: (parser next at: #contents) equals: (Character lf join: #(one two))


The second one shows the error:

XMLPullParserTest>>testUnicodeEntity
    | parser |

    parser := XMLPullParser parse: '
        <tag contents="one&#xA;two" />'.

    self assert: (parser next at: #contents) equals: (Character lf join: #(one two))


I hope this is clear enough. I can also commit the fix myself if you give me write access.

Thanks,
Tommaso
