FYI: Comparing the XML 1.0 and XML 1.1 specs. For name chars, the text from XML 1.1 2nd edition (1.1-2) has been copied verbatim into the XML 1.0 5th edition (1.0-5) specification.
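
For reference, the NameStartChar production shared by the two specs can be written as a simple range table. This is only a sketch: `Range`, `kNameStartRanges` and `isNameStartChar` are hypothetical names chosen for illustration and are not part of Xerces-C, which uses precomputed flag tables instead.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper types and names; not Xerces-C code.
struct Range { std::uint32_t lo, hi; };

// NameStartChar ranges as listed in XML 1.0 5th edition / XML 1.1.
static const Range kNameStartRanges[] = {
    {0x3A, 0x3A},       // ':'
    {0x41, 0x5A},       // 'A'-'Z'
    {0x5F, 0x5F},       // '_'
    {0x61, 0x7A},       // 'a'-'z'
    {0xC0, 0xD6},     {0xD8, 0xF6},     {0xF8, 0x2FF},
    {0x370, 0x37D},   {0x37F, 0x1FFF},  {0x200C, 0x200D},
    {0x2070, 0x218F}, {0x2C00, 0x2FEF}, {0x3001, 0xD7FF},
    {0xF900, 0xFDCF}, {0xFDF0, 0xFFFD}, {0x10000, 0xEFFFF},
};

// Returns true if the code point may start a Name.
inline bool isNameStartChar(std::uint32_t cp) {
    for (const Range& r : kNameStartRanges) {
        if (cp >= r.lo && cp <= r.hi) return true;
    }
    return false;
}

// A few spot checks on the table above.
inline void checkNameStartExamples() {
    assert(isNameStartChar(0x41));      // 'A'
    assert(!isNameStartChar(0x30));     // '0' is a NameChar, not a start char
    assert(!isNameStartChar(0xD7));     // multiplication sign is excluded
    assert(isNameStartChar(0x10000));   // first supplementary code point
}
```
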
The distinction between control characters disallowed in XML 1.0 and restricted in XML 1.1 still remains. The differences between XML 1.0 and XML 1.1 end-of-line handling also remain.

Steven J. Hathaway

> I didn't review the patch, but Rob's summary of the changes seems
> correct from my knowledge of the XML 5th edition changes. Basically the
> XML 1.1 name chars are now the XML 5th edition name chars.
>
> John
>
> On 02/05/13 06:23, Gareth Reakes wrote:
>> Hey Rob,
>>
>> That's great. I am not knowledgeable enough to review this from a
>> correctness point of view but am happy to help on the code review /
>> commit side. Anyone else out there who can check over for correctness?
>>
>> Cheers,
>>
>> Gareth
>>
>> On 2 May 2013, at 03:01, Rob Cameron <[email protected]>
>> wrote:
>>
>>> Xerces-C currently applies XML 1.0 4th edition rules to name characters
>>> in XML 1.0 documents. XML 1.0 5th edition permits a broader class
>>> of name characters, based on those permitted in XML 1.1.
>>>
>>> Proposal: that Xerces-C 3.2.0 be updated to include support for XML 1.0
>>> 5th edition.
>>>
>>> Although our main work is with icXML, we've looked at making this change
>>> in Xerces-C original code base so that icXML support for XML 1.0 5e is
>>> compatible with us.
>>>
>>> I'm not entirely sure that I've handled everything, but the following
>>> change works in our test. The change plan is below and a svn diff file
>>> is attached.
>>>
>>> Here is the change plan.
>>> ----------------------------------
>>>
>>> (1) internal/CharTypeTables.hpp
>>>
>>>     Rename gFirstNameChars1_1 to be gFirstNameChars
>>>     Rename gNameChars1_1 to be gNameChars
>>>
>>> (2) util/XMLChar.cpp
>>> (2a)
>>>     Update initCharFlagTable1_1() to use the gFirstNameChars, gNameChars
>>>     Update initCharFlagTable() to use the set-ups from
>>>     initCharFlagTable1_1() to define gNameCharMask, gNCNameCharMask,
>>>     and gFirstNameCharMask.
>>>
>>>         //
>>>         // Name characters are special. A name is made up of a number of
>>>         // different tables and some special case characters.
>>>         //
>>>         initOneTable(gNameChars, gNameCharMask);
>>>
>>>         //
>>>         // Name characters are special. A name is made up of a number of
>>>         // different tables and some special case characters.
>>>         //
>>>         initOneTable(gNameChars, gNCNameCharMask);
>>>         gTmpCharTable[chColon] &= ~gNCNameCharMask;
>>>
>>>         //
>>>         // Then do the first name char
>>>         //
>>>         initOneTable(gFirstNameChars, gFirstNameCharMask);
>>>
>>> (2b) #define NEED_TO_GEN_TABLE
>>>     compile and do a sample run of a Xerces app, generate table.out
>>>
>>> (2c) Replace the XMLChar1_0::fgCharCharsTable1_0 definition of
>>>     XMLChar.cpp with that from table.out.
>>>
>>> (3) XMLChar.hpp
>>>     Modify XMLChar1_0::isFirstNameChar, XMLChar1_0::isFirstNCNameChar,
>>>     XMLChar1_0::isNameChar, XMLChar1_0::isNCNameChar
>>>     to each check for and allow characters in the #x10000-#xEFFFF range
>>>
>>>         else {
>>>             if ((toCheck >= 0xD800) && (toCheck <= 0xDB7F))
>>>                 if ((toCheck2 >= 0xDC00) && (toCheck2 <= 0xDFFF))
>>>                     return true;
>>>         }
>>>
>>> (4) Modify XMLReader::getName and XMLReader::getNCName
>>>     to allow surrogate pairs in Names and NCNames
>>>     (i.e., use the version 1.1 logic for both 1.0 and 1.1).
>>>
>>> <diff5e>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [email protected]
>>> For additional commands, e-mail: [email protected]
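
A note on the surrogate bounds in step (3) of the quoted change plan: in UTF-16, a high surrogate in 0xD800-0xDB7F followed by a low surrogate in 0xDC00-0xDFFF decodes to exactly the #x10000-#xEFFFF range. The sketch below illustrates that arithmetic. The function names are hypothetical and this is not the code from the attached diff; it only assumes the two code units are available as 16-bit values, like `toCheck` and `toCheck2` in the plan.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration; not the Xerces-C API.

// Standard UTF-16 decoding of a (high, low) surrogate pair.
inline std::uint32_t decodeSurrogatePair(std::uint16_t high, std::uint16_t low) {
    return 0x10000u
         + ((static_cast<std::uint32_t>(high) - 0xD800u) << 10)
         + (static_cast<std::uint32_t>(low) - 0xDC00u);
}

// Same test as in the change plan: accept a pair whose code point lies
// in #x10000-#xEFFFF, which XML 1.0 5e allows in names.
inline bool isSupplementaryNameChar(std::uint16_t high, std::uint16_t low) {
    return high >= 0xD800 && high <= 0xDB7F
        && low  >= 0xDC00 && low  <= 0xDFFF;
}

// Spot checks showing the bounds line up with the code point range.
inline void checkSurrogateBounds() {
    assert(decodeSurrogatePair(0xD800, 0xDC00) == 0x10000u);  // lower bound
    assert(decodeSurrogatePair(0xDB7F, 0xDFFF) == 0xEFFFFu);  // upper bound
    assert(decodeSurrogatePair(0xDB80, 0xDC00) == 0xF0000u);  // just outside
    assert(!isSupplementaryNameChar(0xDB80, 0xDC00));
}
```

High surrogates 0xDB80-0xDBFF map to planes 15 and 16 (#xF0000-#x10FFFF), which the name productions exclude, so stopping at 0xDB7F is what keeps those private-use planes out.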
