[ https://issues.apache.org/jira/browse/XERCESC-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Scott Cantor updated XERCESC-2016:
----------------------------------
    Affects Version/s:     (was: 3.1.1)

> XML 1.0 5th edition support
> ---------------------------
>
>                 Key: XERCESC-2016
>                 URL: https://issues.apache.org/jira/browse/XERCESC-2016
>             Project: Xerces-C++
>          Issue Type: Improvement
>          Components: Non-Validating Parser
>         Environment: All
>            Reporter: Rob Cameron
>            Assignee: Alberto Massari
>             Fix For: 3.2.0
>
>         Attachments: diff5e
>
>
> Xerces-C currently applies XML 1.0 4th edition rules to name characters
> in XML 1.0 documents. XML 1.0 5th edition permits a broader class
> of name characters, based on those permitted in XML 1.1.
>
> Proposal: that Xerces-C 3.2.0 be updated to include support for XML 1.0
> 5th edition.
>
> Although our main work is with icXML, we've looked at making this change
> in the original Xerces-C code base so that icXML support for XML 1.0 5e is
> compatible with us.
>
> I'm not entirely sure that I've handled everything, but the following change
> works in our tests. The change plan is below and an svn diff file is
> attached.
>
> Here is the change plan.
> ----------------------------------
> (1) internal/CharTypeTables.hpp
>     Rename gFirstNameChars1_1 to gFirstNameChars
>     Rename gNameChars1_1 to gNameChars
>
> (2) util/XMLChar.cpp
> (2a)
>     Update initCharFlagTable1_1() to use gFirstNameChars and gNameChars
>     Update initCharFlagTable() to use the set-ups from initCharFlagTable1_1()
>     to define gNameCharMask, gNCNameCharMask, and gFirstNameCharMask.
>
>     //
>     //  Name characters are special. A name is made up of a number of
>     //  different tables and some special case characters.
>     //
>     initOneTable(gNameChars, gNameCharMask);
>
>     //
>     //  Name characters are special. A name is made up of a number of
>     //  different tables and some special case characters.
>     //
>     initOneTable(gNameChars, gNCNameCharMask);
>     gTmpCharTable[chColon] &= ~gNCNameCharMask;
>
>     //
>     //  Then do the first name char
>     //
>     initOneTable(gFirstNameChars, gFirstNameCharMask);
>
> (2b) #define NEED_TO_GEN_TABLE
>     Compile and do a sample run of a Xerces app to generate table.out
>
> (2c) Replace the XMLChar1_0::fgCharCharsTable1_0 definition of XMLChar.cpp
>     with that from table.out.
>
> (3) XMLChar.hpp
>     Modify XMLChar1_0::isFirstNameChar, XMLChar1_0::isFirstNCNameChar,
>     XMLChar1_0::isNameChar, XMLChar1_0::isNCNameChar
>     to each check for and allow characters in the #x10000-#xEFFFF range
>
>     else {
>         if ((toCheck >= 0xD800) && (toCheck <= 0xDB7F))
>             if ((toCheck2 >= 0xDC00) && (toCheck2 <= 0xDFFF))
>                 return true;
>     }
>
> (4) Modify XMLReader::getName and XMLReader::getNCName
>     to allow surrogate pairs in Names and NCNames
>     (i.e., use the version 1.1 logic for both 1.0 and 1.1).
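
For readers following step (3) of the change plan, here is a minimal standalone sketch of why the surrogate bounds 0xD800-0xDB7F and 0xDC00-0xDFFF correspond to the #x10000-#xEFFFF range. It is not Xerces-C code: the function names are hypothetical and chosen only for illustration, and the decoding formula is the standard UTF-16 one rather than anything taken from the attached diff.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; none of these names exist
// in the Xerces-C API.

// High surrogates 0xD800-0xDB7F are the ones that can start a code point
// in #x10000-#xEFFFF (0xDB80-0xDBFF would start #xF0000 and above).
constexpr bool isNameHighSurrogate(std::uint16_t unit) {
    return unit >= 0xD800 && unit <= 0xDB7F;
}

// Any low surrogate may follow.
constexpr bool isLowSurrogate(std::uint16_t unit) {
    return unit >= 0xDC00 && unit <= 0xDFFF;
}

// Mirrors the two nested range checks quoted in step (3).
constexpr bool isSupplementaryNameChar(std::uint16_t high, std::uint16_t low) {
    return isNameHighSurrogate(high) && isLowSurrogate(low);
}

// Standard UTF-16 decoding of a surrogate pair into a code point.
// The caller is assumed to pass a valid high/low pair.
constexpr std::uint32_t decodeSurrogatePair(std::uint16_t high, std::uint16_t low) {
    return 0x10000u
         + ((static_cast<std::uint32_t>(high) - 0xD800u) << 10)
         + (static_cast<std::uint32_t>(low) - 0xDC00u);
}

// The smallest and largest accepted pairs decode to the ends of the range.
static_assert(decodeSurrogatePair(0xD800, 0xDC00) == 0x10000u,
              "lowest accepted pair is #x10000");
static_assert(decodeSurrogatePair(0xDB7F, 0xDFFF) == 0xEFFFFu,
              "highest accepted pair is #xEFFFF");
```

The checks run at compile time, so the snippet needs no test driver. A pair such as (0xDB80, 0xDC00) is rejected by isSupplementaryNameChar because it decodes to #xF0000, just past the upper bound.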