Greetings Rob,

I can also check for correctness. I will need to download the 5th edition
and familiarize myself with the changes. I have already done significant
proof-of-correctness evaluation of Xerces-C supporting the 4th edition
of XML-1.0.

I can also do some evaluation of XML-1.1 compliance - albeit incomplete.

More eyes are better.

Sincerely,
Steven J. Hathaway

> Hey Rob,
>
> That's great. I am not knowledgeable enough to review this from a
> correctness point of view but am happy to help on the code review /
> commit side. Anyone else out there who can check over for correctness?
>
> Cheers,
>
> Gareth
>
> On 2 May 2013, at 03:01, Rob Cameron <[email protected]>
> wrote:
>
>> Xerces-C currently applies XML 1.0 4th edition rules to name characters
>> in XML 1.0 documents. XML 1.0 5th edition permits a broader class
>> of name characters, based on those permitted in XML 1.1.
>>
>> Proposal: that Xerces-C 3.2.0 be updated to include support for XML 1.0
>> 5th edition.
>>
>> Although our main work is with icXML, we've looked at making this change
>> in Xerces-C original code base so that icXML support for XML 1.0 5e is
>> compatible with us.
>>
>> I'm not entirely sure that I've handled everything, but the following
>> change works in our test. The change plan is below and an svn diff file
>> is attached.
>>
>> Here is the change plan.
>> ----------------------------------
>>
>> (1) internal/CharTypeTables.hpp
>>
>>     Rename gFirstNameChars1_1 to be gFirstNameChars
>>     Rename gNameChars1_1 to be gNameChars
>>
>> (2) util/XMLChar.cpp
>> (2a)
>>     Update initCharFlagTable1_1() to use the gFirstNameChars, gNameChars
>>     Update initCharFlagTable() to use the set-ups from
>>     initCharFlagTable1_1()
>>     to define gNameCharMask, gNCNameCharMask, and gFirstNameCharMask.
>>
>>         //
>>         // Name characters are special. A name is made up of a number of
>>         // different tables and some special case characters.
>>         //
>>         initOneTable(gNameChars, gNameCharMask);
>>
>>         //
>>         // Name characters are special. A name is made up of a number of
>>         // different tables and some special case characters.
>>         //
>>         initOneTable(gNameChars, gNCNameCharMask);
>>         gTmpCharTable[chColon] &= ~gNCNameCharMask;
>>
>>         //
>>         // Then do the first name char
>>         //
>>         initOneTable(gFirstNameChars, gFirstNameCharMask);
>>
>> (2b) #define NEED_TO_GEN_TABLE
>>     compile and do a sample run of a Xerces app, generate table.out
>>
>> (2c) Replace the XMLChar1_0::fgCharCharsTable1_0 definition of
>>     XMLChar.cpp with that from table.out.
>>
>> (3) XMLChar.hpp
>>     Modify XMLChar1_0::isFirstNameChar, XMLChar1_0::isFirstNCNameChar,
>>     XMLChar1_0::isNameChar, XMLChar1_0::isNCNameChar
>>     to each check for and allow characters in the #x10000-#xEFFFF range
>>
>>         else {
>>             if ((toCheck >= 0xD800) && (toCheck <= 0xDB7F))
>>                 if ((toCheck2 >= 0xDC00) && (toCheck2 <= 0xDFFF))
>>                     return true;
>>         }
>>
>> (4) Modify XMLReader::getName and XMLReader::getNCName
>>     to allow surrogate pairs in Names and NCNames
>>     (i.e., use the version 1.1 logic for both 1.0 and 1.1).
>>
>> <diff5e>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>
> --
> Gareth Reakes, CTO we7 - Great Free Music
> +44-20-7117-0809 http://www.we7.com
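
For anyone reviewing step (3) of the change plan, here is a minimal standalone sketch of the surrogate-pair test and of why the 0xD800-0xDB7F high-surrogate range lines up with #x10000-#xEFFFF. It is not taken from the attached diff or from the Xerces-C sources: the function names are hypothetical, and char16_t is only assumed as a stand-in for the 16-bit XMLCh code unit.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; these names are NOT Xerces-C API.
// char16_t is assumed here as a stand-in for a 16-bit UTF-16 code unit (XMLCh).

// High surrogates 0xD800-0xDB7F lead the pairs that encode U+10000..U+EFFFF.
constexpr bool isNameHighSurrogate(char16_t c) {
    return c >= 0xD800 && c <= 0xDB7F;
}

// Any low (trailing) surrogate.
constexpr bool isLowSurrogate(char16_t c) {
    return c >= 0xDC00 && c <= 0xDFFF;
}

// Same condition as the nested ifs in step (3): the pair is accepted
// when both halves fall in the expected surrogate ranges.
constexpr bool isSupplementaryNameChar(char16_t hi, char16_t lo) {
    return isNameHighSurrogate(hi) && isLowSurrogate(lo);
}

// Standard UTF-16 decoding of a surrogate pair to a code point.
// Only meaningful when hi and lo are a valid high/low surrogate pair.
constexpr std::uint32_t decodeSurrogatePair(char16_t hi, char16_t lo) {
    return 0x10000u
         + ((static_cast<std::uint32_t>(hi) - 0xD800u) << 10)
         + (static_cast<std::uint32_t>(lo) - 0xDC00u);
}
```

Decoding the two extreme pairs, (0xD800, 0xDC00) and (0xDB7F, 0xDFFF), gives U+10000 and U+EFFFF, which are the endpoints of the supplementary range that the XML 1.0 5th edition name productions allow. A high surrogate of 0xDB80 or above would decode past U+EFFFF, so it is rejected.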
