LexicalHandler startEntity/endEntity events not paired and have incorrect
arguments
-----------------------------------------------------------------------------------
Key: XERCESC-1828
URL: https://issues.apache.org/jira/browse/XERCESC-1828
Project: Xerces-C++
Issue Type: Bug
Components: SAX/SAX2
Affects Versions: 2.8.0
Environment: OS/X, Win32
Reporter: Erik Wright
Attachments: test.xml
It appears that the LexicalHandler events startEntity and endEntity are not
paired, and carry the wrong entity names, when parsing a document whose DTD
itself references external entities.
(Note: I will attach sample XML, repro code, and the full output of the code.
The following is a summary.)
For example, I have been parsing a valid XHTML document. The strict XHTML DTD
includes 4 other files with entity declarations. I see the following events on
my LexicalHandler (ignoring elements, characters, whitespace, external entity
declarations, and comments):
startDocument
...
startDTD: html, -//W3C//DTD XHTML 1.0 Strict//EN,
http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
...
startEntity: [dtd]
...
startEntity: [dtd]
...
startEntity: [dtd]
...
startEntity: [dtd]
...
endEntity: [dtd]
...
endDTD
...
endDocument
I expected something more like the following (as generated by the standard SAX
parser in Java 6):
startDocument
startDTD: 'html', '-//W3C//DTD XHTML 1.0 Strict//EN',
'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'
startEntity: '[dtd]'
startEntity: '%HTMLlat1'
endEntity: '%HTMLlat1'
startEntity: '%HTMLsymbol'
endEntity: '%HTMLsymbol'
startEntity: '%HTMLspecial'
endEntity: '%HTMLspecial'
startEntity: '%head.misc'
endEntity: '%head.misc'
startEntity: '%head.misc'
endEntity: '%head.misc'
startEntity: '%head.misc'
endEntity: '%head.misc'
startEntity: '%head.misc'
endEntity: '%head.misc'
startEntity: '%head.misc'
endEntity: '%head.misc'
startEntity: '%block'
endEntity: '%block'
startEntity: '%inline'
endEntity: '%inline'
startEntity: '%misc'
endEntity: '%misc'
startEntity: '%block'
endEntity: '%block'
startEntity: '%misc'
endEntity: '%misc'
startEntity: '%block'
endEntity: '%block'
startEntity: '%inline'
endEntity: '%inline'
startEntity: '%misc'
endEntity: '%misc'
endEntity: '[dtd]'
endDTD
startPrefixMapping: '', 'http://www.w3.org/1999/xhtml'
endPrefixMapping: ''
endDocument
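
The pairing expectation behind the two traces above can be checked
mechanically. The following is a minimal sketch, not part of Xerces-C++:
EntityPairingChecker and replayObservedTrace are hypothetical names, and the
checker takes plain std::string names rather than the XMLCh strings a real
LexicalHandler receives. It pushes each startEntity name onto a stack and pops
it on a matching endEntity.

```cpp
#include <cassert>   // used by the usage checks described below
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not a Xerces-C++ class): records entity events and
// reports whether every startEntity has a matching, properly nested endEntity.
class EntityPairingChecker {
public:
    void startEntity(const std::string& name) { open_.push_back(name); }

    void endEntity(const std::string& name) {
        if (!open_.empty() && open_.back() == name) {
            open_.pop_back();      // matches the innermost open entity
        } else {
            ++mismatchedEnds_;     // end without a matching start
        }
    }

    std::size_t unclosed() const { return open_.size(); }
    std::size_t mismatchedEnds() const { return mismatchedEnds_; }
    bool balanced() const { return open_.empty() && mismatchedEnds_ == 0; }

private:
    std::vector<std::string> open_;
    std::size_t mismatchedEnds_ = 0;
};

// Replays only the entity events of the 2.8.0 trace summarized above:
// four startEntity("[dtd]") calls followed by one endEntity("[dtd]").
inline EntityPairingChecker replayObservedTrace() {
    EntityPairingChecker checker;
    for (int i = 0; i < 4; ++i) {
        checker.startEntity("[dtd]");
    }
    checker.endEntity("[dtd]");
    return checker;
}
```

Replaying the observed trace leaves three entities unclosed, whereas any
prefix of the expected trace that closes each entity it opens is reported as
balanced.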
At a minimum, the mismatch of startEntity/endEntity events appears to be caused
by the following code from DTDScanner::scanExtSubsetDecl (notice that the
conditions are not the same):
    if (fDocTypeHandler && !inIncludeSect)
        fDocTypeHandler->startExtSubset();
    ...
    ...
    ...
    if (fDocTypeHandler && isDTD)
        fDocTypeHandler->endExtSubset();
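
To make the effect of the two different guards concrete, here is a simplified
model of the snippet above. It is a sketch only: CountingHandler and
scanExtSubsetDeclModel are hypothetical stand-ins, the scanning between the
two guards is omitted, and it assumes (not verified here) that the function is
also entered with inIncludeSect == false and isDTD == false when an external
parameter entity is scanned.

```cpp
#include <cassert>   // used by the usage checks described below

// Hypothetical stand-in for the doc-type handler; it only counts calls.
struct CountingHandler {
    int starts = 0;
    int ends = 0;
    void startExtSubset() { ++starts; }
    void endExtSubset() { ++ends; }
};

// Simplified model of the two guards quoted from scanExtSubsetDecl.
// The start event depends on inIncludeSect, the end event on isDTD.
inline void scanExtSubsetDeclModel(CountingHandler* handler,
                                   bool inIncludeSect,
                                   bool isDTD) {
    if (handler && !inIncludeSect)
        handler->startExtSubset();

    // ... actual scanning elided in this model ...

    if (handler && isDTD)
        handler->endExtSubset();
}

// Convenience wrapper: number of start events left without an end event
// after a single call with the given flags (negative means an extra end).
inline int unpairedAfterCall(bool inIncludeSect, bool isDTD) {
    CountingHandler handler;
    scanExtSubsetDeclModel(&handler, inIncludeSect, isDTD);
    return handler.starts - handler.ends;
}
```

Under that assumption, a call with (false, true) emits a matched pair, while a
call with (false, false) emits a start with no end. One call of the first kind
and three of the second would give the four starts and single end seen in the
trace.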