Hey Rob,
That's great. I am not knowledgeable enough to review this for correctness,
but I am happy to help on the code review / commit side. Can anyone
else out there check it over for correctness?
Cheers,
Gareth
On 2 May 2013, at 03:01, Rob Cameron <[email protected]> wrote:
> Xerces-C currently applies XML 1.0 4th edition rules to name characters
> in XML 1.0 documents. XML 1.0 5th edition permits a broader class
> of name characters, based on those permitted in XML 1.1.
>
> Proposal: that Xerces-C 3.2.0 be updated to include support for XML 1.0
> 5th edition.
>
> Although our main work is with icXML, we've looked at making this change
> in the original Xerces-C code base so that Xerces-C and icXML remain
> compatible in their support for XML 1.0 5e.
>
> I'm not entirely sure that I've handled everything, but the following change
> works in our tests. The change plan is below and an svn diff file is attached.
>
> Here is the change plan.
> ----------------------------------
>
>
> (1) internal/CharTypeTables.hpp
>
> Rename gFirstNameChars1_1 to be gFirstNameChars
> Rename gNameChars1_1 to be gNameChars
>
> (2) util/XMLChar.cpp
> (2a)
> Update initCharFlagTable1_1() to use gFirstNameChars and gNameChars.
> Update initCharFlagTable() to use the same setup as initCharFlagTable1_1()
> to define gNameCharMask, gNCNameCharMask, and gFirstNameCharMask:
> //
> // Name characters are special. A name is made up of a number of
> // different tables and some special case characters.
> //
> initOneTable(gNameChars, gNameCharMask);
>
> //
> // NCName characters are the name characters with the colon excluded.
> //
> initOneTable(gNameChars, gNCNameCharMask);
> gTmpCharTable[chColon] &= ~gNCNameCharMask;
>
> //
> // Then do the first name char
> //
> initOneTable(gFirstNameChars, gFirstNameCharMask);
>
> (2b) #define NEED_TO_GEN_TABLE
> compile and do a sample run of a Xerces app, generate table.out
>
> (2c) Replace the XMLChar1_0::fgCharCharsTable1_0 definition in XMLChar.cpp
> with that from table.out.
>
> (3) XMLChar.hpp
> Modify XMLChar1_0::isFirstNameChar, XMLChar1_0::isFirstNCNameChar,
> XMLChar1_0::isNameChar, and XMLChar1_0::isNCNameChar
> so that each also accepts surrogate pairs for characters in the
> #x10000-#xEFFFF range:
>
> else {
>     if ((toCheck >= 0xD800) && (toCheck <= 0xDB7F))
>         if ((toCheck2 >= 0xDC00) && (toCheck2 <= 0xDFFF))
>             return true;
> }
>
>
> (4) Modify XMLReader::getName and XMLReader::getNCName
> to allow surrogate pairs in Names and NCNames
> (i.e., use the version 1.1 logic for both 1.0 and 1.1).
>
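For readers following step (2a) of the plan, here is a minimal, self-contained
sketch of how one shared range list can feed three flag masks in a lookup
table. It is not the Xerces-C code: the mask values, the CharRange struct, the
simplified initOneTable() signature, buildTable(), and the tiny character
subsets are all assumptions made for illustration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical flag bits; the real values in Xerces-C may differ.
const uint8_t kNameCharMask      = 0x01;
const uint8_t kNCNameCharMask    = 0x02;
const uint8_t kFirstNameCharMask = 0x04;

// Hypothetical simplified range entry: inclusive {first, last}.
struct CharRange { uint16_t first; uint16_t last; };

// OR the mask into every table entry covered by the ranges.
void initOneTable(std::vector<uint8_t>& table,
                  const std::vector<CharRange>& ranges, uint8_t mask)
{
    for (const CharRange& r : ranges)
        for (uint32_t c = r.first; c <= r.last; ++c)
            table[c] |= mask;
}

std::vector<uint8_t> buildTable()
{
    // Small illustrative subsets only, not the full XML 1.0 5e productions.
    const std::vector<CharRange> firstNameChars = {
        {0x3A, 0x3A},      // ':'
        {0x41, 0x5A},      // 'A'-'Z'
        {0x5F, 0x5F},      // '_'
        {0x61, 0x7A},      // 'a'-'z'
        {0x00C0, 0x00D6}
    };
    std::vector<CharRange> nameChars = firstNameChars;
    nameChars.push_back({0x2D, 0x2E});   // '-' and '.'
    nameChars.push_back({0x30, 0x39});   // '0'-'9'

    std::vector<uint8_t> table(0x10000, 0);
    initOneTable(table, nameChars, kNameCharMask);
    // NCName characters: same ranges, then clear the colon.
    initOneTable(table, nameChars, kNCNameCharMask);
    table[0x3A] &= static_cast<uint8_t>(~kNCNameCharMask);
    initOneTable(table, firstNameChars, kFirstNameCharMask);
    return table;
}
```

With one table shared between XML 1.0 and 1.1, a lookup such as
(table[c] & kNCNameCharMask) != 0 would give the same answer for both
versions, which appears to be the intent of steps (1) and (2).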
>
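Steps (3) and (4) of the plan rely on the fact that high surrogates
0xD800-0xDB7F paired with any low surrogate 0xDC00-0xDFFF encode exactly the
code points #x10000-#xEFFFF. The sketch below checks that with standard UTF-16
decoding and shows a scanning loop that consumes such pairs. The function
names and isBmpNameChar() are hypothetical placeholders, not Xerces-C APIs,
and the loop ignores the stricter rule for the first character of a name.

```cpp
#include <cstddef>
#include <cstdint>

// Same range test as in step (3): a surrogate pair for #x10000-#xEFFFF.
bool isSupplementaryNameChar(char16_t high, char16_t low)
{
    return high >= 0xD800 && high <= 0xDB7F
        && low  >= 0xDC00 && low  <= 0xDFFF;
}

// Standard UTF-16 decoding of a surrogate pair into a code point.
uint32_t decodeSurrogatePair(char16_t high, char16_t low)
{
    return 0x10000u + ((static_cast<uint32_t>(high) - 0xD800u) << 10)
                    + (static_cast<uint32_t>(low) - 0xDC00u);
}

// Hypothetical stand-in for the table lookup; ASCII subset only.
bool isBmpNameChar(char16_t c)
{
    return (c >= u'a' && c <= u'z') || (c >= u'A' && c <= u'Z')
        || (c >= u'0' && c <= u'9')
        || c == u'-' || c == u'.' || c == u'_' || c == u':';
}

// Length, in UTF-16 units, of the run of name characters at the start of buf.
std::size_t scanNameChars(const char16_t* buf, std::size_t len)
{
    std::size_t i = 0;
    while (i < len) {
        if (isBmpNameChar(buf[i])) {
            ++i;                      // single-unit name character
        } else if (i + 1 < len && isSupplementaryNameChar(buf[i], buf[i + 1])) {
            i += 2;                   // surrogate pair counts as one character
        } else {
            break;                    // not a name character, or a lone surrogate
        }
    }
    return i;
}
```

Here decodeSurrogatePair(0xD800, 0xDC00) yields 0x10000 and
decodeSurrogatePair(0xDB7F, 0xDFFF) yields 0xEFFFF, so the two range checks
cover the whole supplementary name-character range without decoding.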
> <diff5e>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
--
Gareth Reakes, CTO we7 - Great Free Music
+44-20-7117-0809 http://www.we7.com
"The music business is a cruel and shallow money trench, a long plastic hallway
where thieves and pimps run free, and good men die like dogs. There's also a
negative side." - Hunter S. Thompson