I didn't review the patch, but Rob's summary of the changes seems correct from my knowledge of the XML 1.0 5th edition changes. Basically, the XML 1.0 5th edition name chars are now the same as the XML 1.1 name chars.

John

On 02/05/13 06:23, Gareth Reakes wrote:
Hey Rob,

        That's great. I am not knowledgeable enough to review this from a
correctness point of view, but I am happy to help on the code review / commit
side. Is there anyone else out there who can check it over for correctness?

Cheers,

Gareth

On 2 May 2013, at 03:01, Rob Cameron <[email protected]> wrote:

Xerces-C currently applies XML 1.0 4th edition rules to name characters
in XML 1.0 documents. XML 1.0 5th edition permits a broader class
of name characters, based on those permitted in XML 1.1.
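For reference, the broader class can be written down directly from the NameStartChar and NameChar productions of XML 1.0 5th edition. The following is only a minimal sketch working on whole code points; the names CodePointRange, isNameStartChar5e and isNameChar5e are hypothetical and are not part of Xerces-C, which uses precomputed flag tables instead.

```cpp
#include <cassert>

// Hypothetical helper type: an inclusive range of Unicode code points.
struct CodePointRange { char32_t first; char32_t last; };

// NameStartChar ranges from the XML 1.0 5th edition grammar.
constexpr CodePointRange kNameStartRanges[] = {
    {U':', U':'}, {U'A', U'Z'}, {U'_', U'_'}, {U'a', U'z'},
    {0xC0, 0xD6}, {0xD8, 0xF6}, {0xF8, 0x2FF}, {0x370, 0x37D},
    {0x37F, 0x1FFF}, {0x200C, 0x200D}, {0x2070, 0x218F},
    {0x2C00, 0x2FEF}, {0x3001, 0xD7FF}, {0xF900, 0xFDCF},
    {0xFDF0, 0xFFFD}, {0x10000, 0xEFFFF},
};

// Additional ranges allowed after the first character (NameChar).
constexpr CodePointRange kNameExtraRanges[] = {
    {U'-', U'-'}, {U'.', U'.'}, {U'0', U'9'}, {0xB7, 0xB7},
    {0x0300, 0x036F}, {0x203F, 0x2040},
};

// True if c may start a Name under the 5th edition rules.
inline bool isNameStartChar5e(char32_t c) {
    for (const CodePointRange& r : kNameStartRanges)
        if (c >= r.first && c <= r.last) return true;
    return false;
}

// True if c may appear anywhere in a Name under the 5th edition rules.
inline bool isNameChar5e(char32_t c) {
    if (isNameStartChar5e(c)) return true;
    for (const CodePointRange& r : kNameExtraRanges)
        if (c >= r.first && c <= r.last) return true;
    return false;
}
```

Note the last NameStartChar range, #x10000-#xEFFFF: in a UTF-16 buffer those code points only appear as surrogate pairs, which is why the reader and the character checks have to look at two code units.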

Proposal: that Xerces-C 3.2.0 be updated to include support for XML 1.0
5th edition.

Although our main work is with icXML, we've looked at making this change
in the original Xerces-C code base so that icXML support for XML 1.0 5e is
compatible with us.

I'm not entirely sure that I've handled everything, but the following change
works in our tests.  The change plan is below and an svn diff file is attached.

Here is the change plan.
----------------------------------


(1)  internal/CharTypeTables.hpp

Rename gFirstNameChars1_1 to be gFirstNameChars
Rename gNameChars1_1 to be gNameChars

(2) util/XMLChar.cpp
(2a)
    Update initCharFlagTable1_1() to use the gFirstNameChars, gNameChars
    Update initCharFlagTable() to use the set-ups from initCharFlagTable1_1()
      to define gNameCharMask, gNCNameCharMask, and gFirstNameCharMask.
     //
     //  Name characters are special. A name is made up of a number of
     //  different tables and some special case characters.
     //
     initOneTable(gNameChars, gNameCharMask);

     //
     //  Name characters are special. A name is made up of a number of
     //  different tables and some special case characters.
     //
     initOneTable(gNameChars, gNCNameCharMask);
     gTmpCharTable[chColon] &= ~gNCNameCharMask;

     //
     //  Then do the first name char
     //
     initOneTable(gFirstNameChars, gFirstNameCharMask);

(2b) #define NEED_TO_GEN_TABLE
compile and do a sample run of a Xerces app, generate table.out

(2c) Replace the XMLChar1_0::fgCharCharsTable1_0 definition in XMLChar.cpp
with that from table.out.

(3) XMLChar.hpp
     Modify XMLChar1_0::isFirstNameChar, XMLChar1_0::isFirstNCNameChar, 
XMLChar1_0::isNameChar, XMLChar1_0::isNCNameChar
     to each check for and allow characters in the #x10000-#xEFFFF range

     else {
         if ((toCheck >= 0xD800) && (toCheck <= 0xDB7F))
            if ((toCheck2 >= 0xDC00) && (toCheck2 <= 0xDFFF))
                return true;
     }
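To see why the high-surrogate bound is 0xDB7F rather than 0xDBFF, it helps to decode the pair. This is a small standalone sketch, not the Xerces-C code; the function names isNameSurrogatePair and decodeSurrogatePair are made up for illustration.

```cpp
#include <cassert>

// True if (hi, lo) is a UTF-16 surrogate pair encoding a code point in
// the #x10000-#xEFFFF range, i.e. the same test as the snippet above.
inline bool isNameSurrogatePair(char16_t hi, char16_t lo) {
    return hi >= 0xD800 && hi <= 0xDB7F && lo >= 0xDC00 && lo <= 0xDFFF;
}

// Standard UTF-16 decoding of a surrogate pair into a code point.
// Assumes hi is in 0xD800-0xDBFF and lo is in 0xDC00-0xDFFF.
inline char32_t decodeSurrogatePair(char16_t hi, char16_t lo) {
    return 0x10000
         + ((static_cast<char32_t>(hi) - 0xD800) << 10)
         + (static_cast<char32_t>(lo) - 0xDC00);
}
```

The smallest accepted pair (0xD800, 0xDC00) decodes to #x10000 and the largest (0xDB7F, 0xDFFF) decodes to #xEFFFF, so the two range checks on the code units cover exactly the supplementary range that the 5th edition allows in names. The next high surrogate, 0xDB80, starts at #xF0000 and is rejected.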


(4)  Modify XMLReader::getName and XMLReader::getNCName
        to allow surrogate pairs in Names and NCNames
        (i.e., use the version 1.1 logic for both 1.0 and 1.1).
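A rough sketch of what a surrogate-aware name scan looks like is given here. It is not the actual XMLReader::getName code: scanName is a hypothetical function, and the two BMP predicates are ASCII-only stand-ins for the table lookups done by the XMLChar classes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in for the BMP first-name-char table lookup (ASCII only here).
inline bool isBmpFirstNameChar(char16_t c) {
    return (c >= u'A' && c <= u'Z') || (c >= u'a' && c <= u'z')
        || c == u'_' || c == u':';
}

// Stand-in for the BMP name-char table lookup (ASCII only here).
inline bool isBmpNameChar(char16_t c) {
    return isBmpFirstNameChar(c) || (c >= u'0' && c <= u'9')
        || c == u'-' || c == u'.';
}

// Returns the length, in UTF-16 code units, of the name at the start of
// buf, or 0 if buf does not start with a name.
inline std::size_t scanName(const std::u16string& buf) {
    std::size_t i = 0;
    while (i < buf.size()) {
        const char16_t c = buf[i];
        if (c >= 0xD800 && c <= 0xDB7F) {
            // High surrogate: only valid with a following low surrogate,
            // and the pair is consumed as a single name character.
            if (i + 1 >= buf.size()) break;
            const char16_t c2 = buf[i + 1];
            if (c2 < 0xDC00 || c2 > 0xDFFF) break;
            i += 2;
        } else {
            const bool ok = (i == 0) ? isBmpFirstNameChar(c)
                                     : isBmpNameChar(c);
            if (!ok) break;
            ++i;
        }
    }
    return i;
}
```

The point of the sketch is only that a pair is accepted in both the first position and later positions, and that an unpaired high surrogate ends the name.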


<diff5e>
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

