Greetings Rob,

I can also check for correctness.  I will need to download the 5th
edition and familiarize myself with the changes.  I have already done
significant proof-of-correctness evaluation of Xerces-C supporting
the 4th edition of XML-1.0.

I can also do some evaluation of XML-1.1 compliance - albeit incomplete.

More eyes are better.

Sincerely,
Steven J. Hathaway

> Hey Rob,
>
>       That's great. I am not knowledgeable enough to review this from a
> correctness point of view, but I am happy to help on the code review /
> commit side. Can anyone else out there check it over for correctness?
>
> Cheers,
>
> Gareth
>
> On 2 May 2013, at 03:01, Rob Cameron <[email protected]>
> wrote:
>
>> Xerces-C currently applies XML 1.0 4th edition rules to name characters
>> in XML 1.0 documents.    XML 1.0 5th edition permits a broader class
>> of name characters, based on those permitted in XML 1.1.
>>
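
For anyone reviewing this, a minimal sketch of the broader name-character
classes, written directly from productions [4] and [4a] of the XML 1.0
Fifth Edition specification. The function names are made up for
illustration and are not Xerces-C API; the code works on whole code points
rather than on the UTF-16 units that Xerces-C uses internally.

```cpp
#include <cassert>
#include <cstdint>

// XML 1.0 (Fifth Edition), production [4] NameStartChar, on a full code point.
// Hypothetical helper name; not part of Xerces-C.
constexpr bool isNameStartChar5e(std::uint32_t c) {
    return c == ':' || c == '_'
        || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
        || (c >= 0xC0 && c <= 0xD6) || (c >= 0xD8 && c <= 0xF6)
        || (c >= 0xF8 && c <= 0x2FF) || (c >= 0x370 && c <= 0x37D)
        || (c >= 0x37F && c <= 0x1FFF) || (c >= 0x200C && c <= 0x200D)
        || (c >= 0x2070 && c <= 0x218F) || (c >= 0x2C00 && c <= 0x2FEF)
        || (c >= 0x3001 && c <= 0xD7FF) || (c >= 0xF900 && c <= 0xFDCF)
        || (c >= 0xFDF0 && c <= 0xFFFD) || (c >= 0x10000 && c <= 0xEFFFF);
}

// Production [4a] NameChar: NameStartChar plus a few extra characters.
constexpr bool isNameChar5e(std::uint32_t c) {
    return isNameStartChar5e(c)
        || c == '-' || c == '.' || (c >= '0' && c <= '9')
        || c == 0xB7 || (c >= 0x300 && c <= 0x36F)
        || (c >= 0x203F && c <= 0x2040);
}

// A few spot checks against the productions above.
void spotCheckNameChars() {
    assert(isNameStartChar5e(0x10000));   // first supplementary code point
    assert(!isNameStartChar5e('-'));      // allowed in a name, but not first
    assert(isNameChar5e('-'));
    assert(!isNameChar5e(0xD7));          // MULTIPLICATION SIGN is excluded
}
```

A table-driven implementation such as the one in the plan below should
agree with these predicates for every code point, which makes them usable
as a reference when checking a regenerated table.
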
>> Proposal: that Xerces-C 3.2.0 be updated to include support for XML 1.0
>> 5th edition.
>>
>> Although our main work is with icXML, we've looked at making this change
>> in the original Xerces-C code base so that Xerces-C and icXML remain
>> compatible in their XML 1.0 5e support.
>>
>> I'm not entirely sure that I've handled everything, but the following
>> change works in our testing.  The change plan is below and an svn diff
>> file is attached.
>>
>> Here is the change plan.
>> ----------------------------------
>>
>>
>> (1)  internal/CharTypeTables.hpp
>>
>> Rename gFirstNameChars1_1 to be gFirstNameChars
>> Rename gNameChars1_1 to be gNameChars
>>
>> (2) util/XMLChar.cpp
>> (2a)
>>    Update initCharFlagTable1_1() to use gFirstNameChars and gNameChars.
>>    Update initCharFlagTable() to use the same set-up as initCharFlagTable1_1()
>>      to define gNameCharMask, gNCNameCharMask, and gFirstNameCharMask:
>>     //
>>     //  Name characters are special. A name is made up of a number of
>>     //  different tables and some special case characters.
>>     //
>>     initOneTable(gNameChars, gNameCharMask);
>>
>>     //
>>     //  Name characters are special. A name is made up of a number of
>>     //  different tables and some special case characters.
>>     //
>>     initOneTable(gNameChars, gNCNameCharMask);
>>     gTmpCharTable[chColon] &= ~gNCNameCharMask;
>>
>>     //
>>     //  Then do the first name char
>>     //
>>     initOneTable(gFirstNameChars, gFirstNameCharMask);
>>
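
To make the effect of step (2a) easier to follow, here is a rough,
self-contained sketch of how a per-character flag table can be filled from
range tables. It is not the Xerces-C code: the table layout (low/high range
pairs, a 0 terminator, then single characters, then a 0 terminator), the
mask values, and the heavily abbreviated tables are all assumptions made
for illustration, and every name starting with `k` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical mask bits; the real values are defined by Xerces-C.
constexpr std::uint8_t kNameCharMask      = 0x01;
constexpr std::uint8_t kNCNameCharMask    = 0x02;
constexpr std::uint8_t kFirstNameCharMask = 0x04;

// Assumed layout: (low, high) pairs, 0, single characters, 0.
// Abbreviated on purpose; not the contents of CharTypeTables.hpp.
static const char16_t kFirstNameChars[] = {
    u'A', u'Z', u'a', u'z', 0x00C0, 0x00D6, 0,   // ranges
    u':', u'_', 0                                // singles
};
static const char16_t kNameChars[] = {
    u'0', u'9', u'A', u'Z', u'a', u'z', 0x00C0, 0x00D6, 0x0300, 0x036F, 0,
    u'-', u'.', u':', u'_', 0x00B7, 0
};

// OR `mask` into the flag byte of every character listed in `table`.
void initOneTable(const char16_t* table, std::uint8_t mask,
                  std::vector<std::uint8_t>& flags) {
    const char16_t* p = table;
    for (; *p; p += 2)                                  // range pairs
        for (std::uint32_t c = p[0]; c <= p[1]; ++c)
            flags[c] |= mask;
    ++p;                                                // skip the terminator
    for (; *p; ++p)                                     // single characters
        flags[*p] |= mask;
}

// Mirrors the three calls in step (2a), including the colon exclusion.
std::vector<std::uint8_t> buildFlags() {
    std::vector<std::uint8_t> flags(0x10000, 0);
    initOneTable(kNameChars, kNameCharMask, flags);
    initOneTable(kNameChars, kNCNameCharMask, flags);
    flags[u':'] &= static_cast<std::uint8_t>(~kNCNameCharMask);
    initOneTable(kFirstNameChars, kFirstNameCharMask, flags);
    return flags;
}
```

The point to check in review is that the NCName flag is derived from the
same table as the Name flag, with only the colon cleared afterwards.
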
>> (2b) #define NEED_TO_GEN_TABLE, compile, and do a sample run of a
>> Xerces app to generate table.out.
>>
>> (2c) Replace the XMLChar1_0::fgCharCharsTable1_0 definition in
>> XMLChar.cpp with that from table.out.
>>
>> (3) XMLChar.hpp
>>     Modify XMLChar1_0::isFirstNameChar, XMLChar1_0::isFirstNCNameChar,
>>     XMLChar1_0::isNameChar, and XMLChar1_0::isNCNameChar
>>     to each check for and allow characters in the #x10000-#xEFFFF range:
>>
>>     else {
>>         if ((toCheck >= 0xD800) && (toCheck <= 0xDB7F))
>>            if ((toCheck2 >= 0xDC00) && (toCheck2 <= 0xDFFF))
>>                return true;
>>     }
>>
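
As a check on the bounds in step (3), the sketch below decodes a UTF-16
surrogate pair with the standard formula and confirms that high surrogates
0xD800-0xDB7F combined with any low surrogate 0xDC00-0xDFFF cover exactly
#x10000-#xEFFFF. The helper names are invented for this note and are not
Xerces-C functions.

```cpp
#include <cassert>
#include <cstdint>

// Same test as the quoted snippet: is (hi, lo) a surrogate pair that encodes
// a code point in the #x10000-#xEFFFF range? Hypothetical helper name.
constexpr bool isSupplementaryNamePair(char16_t hi, char16_t lo) {
    return hi >= 0xD800 && hi <= 0xDB7F && lo >= 0xDC00 && lo <= 0xDFFF;
}

// Standard UTF-16 decoding of a high/low surrogate pair to a code point.
constexpr std::uint32_t decodeSurrogatePair(char16_t hi, char16_t lo) {
    return 0x10000u
         + ((static_cast<std::uint32_t>(hi) - 0xD800u) << 10)
         + (static_cast<std::uint32_t>(lo) - 0xDC00u);
}

// The accepted pairs start at #x10000 and end at #xEFFFF.
static_assert(decodeSurrogatePair(0xD800, 0xDC00) == 0x10000, "lower bound");
static_assert(decodeSurrogatePair(0xDB7F, 0xDFFF) == 0xEFFFF, "upper bound");
// The next high surrogate already lands in plane 15, outside the range.
static_assert(decodeSurrogatePair(0xDB80, 0xDC00) == 0xF0000, "first excluded");

void checkSurrogateBounds() {
    assert(isSupplementaryNamePair(0xD800, 0xDC00));
    assert(isSupplementaryNamePair(0xDB7F, 0xDFFF));
    assert(!isSupplementaryNamePair(0xDB80, 0xDC00));  // encodes #xF0000
    assert(!isSupplementaryNamePair(0xD800, 0x0041));  // no low surrogate
}
```

So the 0xDB7F upper limit on the high surrogate is what keeps planes 15
and 16 out of names.
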
>>
>> (4)  Modify XMLReader::getName and XMLReader::getNCName
>>        to allow surrogate pairs in Names and NCNames
>>        (i.e., use the version 1.1 logic for both 1.0 and 1.1).
>>
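
For step (4), the following is a minimal sketch of a name scan over UTF-16
code units that accepts surrogate pairs at any position. It is not the
XMLReader code. The BMP classifiers are ASCII-only stand-ins for the real
table lookups, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// ASCII-only stand-ins for the table-driven BMP checks (assumption).
bool isFirstNameCharBmp(char16_t c) {
    return (c >= u'A' && c <= u'Z') || (c >= u'a' && c <= u'z')
        || c == u'_' || c == u':';
}
bool isNameCharBmp(char16_t c) {
    return isFirstNameCharBmp(c) || (c >= u'0' && c <= u'9')
        || c == u'-' || c == u'.';
}

// Surrogate pair encoding a code point in #x10000-#xEFFFF.
bool isNameSurrogatePair(char16_t hi, char16_t lo) {
    return hi >= 0xD800 && hi <= 0xDB7F && lo >= 0xDC00 && lo <= 0xDFFF;
}

// Returns the number of UTF-16 code units that form a Name starting at
// `pos`, or 0 if no Name starts there.
std::size_t scanName(const std::u16string& s, std::size_t pos = 0) {
    std::size_t i = pos;

    // First character: a BMP name-start character or a valid pair.
    if (i < s.size() && isFirstNameCharBmp(s[i]))
        i += 1;
    else if (i + 1 < s.size() && isNameSurrogatePair(s[i], s[i + 1]))
        i += 2;
    else
        return 0;

    // Remaining characters: BMP name characters or valid pairs.
    while (i < s.size()) {
        if (isNameCharBmp(s[i]))
            i += 1;
        else if (i + 1 < s.size() && isNameSurrogatePair(s[i], s[i + 1]))
            i += 2;
        else
            break;  // includes a lone or out-of-range surrogate
    }
    return i - pos;
}
```

A pair is consumed as two code units or not at all, so a lone high
surrogate at the end of the buffer ends the name instead of being read
past.
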
>>
>> <diff5e>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>
> --
> Gareth Reakes, CTO         we7 - Great Free Music
> +44-20-7117-0809                   http://www.we7.com
>
> "The music business is a cruel and shallow money trench, a long plastic
> hallway where thieves and pimps run free, and good men die like dogs.
> There's also a negative side." - Hunter S. Thompson