FYI: Comparing XML-1.0 and XML-1.1 specs.

For name chars, the text from XML 1.1-2 (second edition) has been
copied verbatim into the XML 1.0-5 (fifth edition) specification.
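
For reference, the shared name character productions can be sketched
as a small range table. This is only an illustration of the
NameStartChar and NameChar productions as published in the
specifications, not the Xerces-C tables; the names CodePointRange,
inAnyRange, isNameStartChar, isNameChar and checkNameCharExamples are
made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Inclusive range of Unicode code points (hypothetical helper type).
struct CodePointRange { std::uint32_t lo; std::uint32_t hi; };

// NameStartChar ranges, identical in XML 1.0 5th edition and XML 1.1.
static const CodePointRange kNameStartRanges[] = {
    {0x3A, 0x3A}, {0x41, 0x5A}, {0x5F, 0x5F}, {0x61, 0x7A},
    {0xC0, 0xD6}, {0xD8, 0xF6}, {0xF8, 0x2FF}, {0x370, 0x37D},
    {0x37F, 0x1FFF}, {0x200C, 0x200D}, {0x2070, 0x218F},
    {0x2C00, 0x2FEF}, {0x3001, 0xD7FF}, {0xF900, 0xFDCF},
    {0xFDF0, 0xFFFD}, {0x10000, 0xEFFFF}
};

// Extra ranges allowed only after the first character:
// "-" and ".", digits, #xB7, combining marks, #x203F-#x2040.
static const CodePointRange kNameOnlyRanges[] = {
    {0x2D, 0x2E}, {0x30, 0x39}, {0xB7, 0xB7},
    {0x300, 0x36F}, {0x203F, 0x2040}
};

// Linear scan; true if cp falls inside any range of the table.
template <std::size_t N>
bool inAnyRange(const CodePointRange (&ranges)[N], std::uint32_t cp) {
    for (std::size_t i = 0; i < N; ++i) {
        if (cp >= ranges[i].lo && cp <= ranges[i].hi) return true;
    }
    return false;
}

bool isNameStartChar(std::uint32_t cp) {
    return inAnyRange(kNameStartRanges, cp);
}

bool isNameChar(std::uint32_t cp) {
    return isNameStartChar(cp) || inAnyRange(kNameOnlyRanges, cp);
}

// Small usage example.
void checkNameCharExamples() {
    assert(isNameStartChar('A'));
    assert(!isNameStartChar('1') && isNameChar('1'));
    assert(isNameStartChar(0x10000));  // first supplementary code point
}
```

The same two functions would serve both XML versions, which is the
point of the 5th edition change.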

The distinction between control characters, which are disallowed in
XML-1.0 but only restricted in XML-1.1, still remains.
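
To make that distinction concrete, here is a rough sketch of the Char
and RestrictedChar productions from the two specifications. The
function names (isXml10Char, isXml11Char, isXml11RestrictedChar,
allowedLiterally, checkCharExamples) are invented for this example and
are not Xerces-C APIs.

```cpp
#include <cassert>
#include <cstdint>

// XML 1.0 Char: tab, LF, CR, then everything from #x20 upward
// except surrogates, #xFFFE and #xFFFF.
bool isXml10Char(std::uint32_t cp) {
    return cp == 0x9 || cp == 0xA || cp == 0xD ||
           (cp >= 0x20 && cp <= 0xD7FF) ||
           (cp >= 0xE000 && cp <= 0xFFFD) ||
           (cp >= 0x10000 && cp <= 0x10FFFF);
}

// XML 1.1 Char: everything from #x1 upward with the same exclusions.
bool isXml11Char(std::uint32_t cp) {
    return (cp >= 0x1 && cp <= 0xD7FF) ||
           (cp >= 0xE000 && cp <= 0xFFFD) ||
           (cp >= 0x10000 && cp <= 0x10FFFF);
}

// XML 1.1 RestrictedChar: control characters that are legal only
// when written as character references.
bool isXml11RestrictedChar(std::uint32_t cp) {
    return (cp >= 0x1 && cp <= 0x8) || (cp >= 0xB && cp <= 0xC) ||
           (cp >= 0xE && cp <= 0x1F) || (cp >= 0x7F && cp <= 0x84) ||
           (cp >= 0x86 && cp <= 0x9F);
}

// May the code point appear directly (not as a reference) in text?
bool allowedLiterally(std::uint32_t cp, bool xml11) {
    return xml11 ? (isXml11Char(cp) && !isXml11RestrictedChar(cp))
                 : isXml10Char(cp);
}

// Small usage example.
void checkCharExamples() {
    assert(!isXml10Char(0x1));            // disallowed in XML 1.0
    assert(isXml11Char(0x1));             // a Char in XML 1.1 ...
    assert(!allowedLiterally(0x1, true)); // ... but only as &#x1;
}
```

Note that #x7F-#x9F (other than #x85) go the other way: they may
appear literally in XML 1.0 but are restricted in XML 1.1.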

The differences in XML-1.0 and XML-1.1 end-of-line handling remain.
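
As a sketch of what differs: XML 1.0 normalizes CR LF and a lone CR to
LF, while XML 1.1 additionally treats CR NEL (#x85), a lone NEL, and
the line separator #x2028 as line ends. The function below works on
already-decoded code points; normalizeEol, checkEolExample and the
constants are names invented for this example, not the Xerces-C reader
code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

const char32_t kLF  = 0x0A;    // line feed
const char32_t kCR  = 0x0D;    // carriage return
const char32_t kNEL = 0x85;    // next line
const char32_t kLS  = 0x2028;  // line separator

// Translate every line end recognized by the chosen XML version
// into a single LF.
std::u32string normalizeEol(const std::u32string& in, bool xml11) {
    std::u32string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const char32_t c = in[i];
        if (c == kCR) {
            out.push_back(kLF);
            // Swallow the second half of a two-character line end.
            if (i + 1 < in.size() &&
                (in[i + 1] == kLF || (xml11 && in[i + 1] == kNEL))) {
                ++i;
            }
        } else if (xml11 && (c == kNEL || c == kLS)) {
            out.push_back(kLF);
        } else {
            out.push_back(c);
        }
    }
    return out;
}

// Small usage example: CR LF collapses to LF under both versions.
void checkEolExample() {
    const std::u32string input = {U'a', kCR, kLF, U'b'};
    const std::u32string expected = {U'a', kLF, U'b'};
    assert(normalizeEol(input, false) == expected);
    assert(normalizeEol(input, true) == expected);
}
```

A document containing NEL or #x2028 therefore parses to different
character data depending on the declared version.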

Steven J. Hathaway

> I didn't review the patch, but Rob's summary of the changes seems
> correct from my knowledge of the XML 5th edition changes. Basically the
> XML 1.1 name chars are now the XML 5th edition name chars.
>
> John
>
> On 02/05/13 06:23, Gareth Reakes wrote:
>> Hey Rob,
>>
>>      That's great. I am not knowledgeable enough to review this from a
>> correctness point of view but am happy to help on the code review / commit side.
>> Anyone else out there who can check over for correctness?
>>
>> Cheers,
>>
>> Gareth
>>
>> On 2 May 2013, at 03:01, Rob Cameron <[email protected]>
>> wrote:
>>
>>> Xerces-C currently applies XML 1.0 4th edition rules to name characters
>>> in XML 1.0 documents. XML 1.0 5th edition permits a broader class
>>> of name characters, based on those permitted in XML 1.1.
>>>
>>> Proposal: that Xerces-C 3.2.0 be updated to include support for XML 1.0
>>> 5th edition.
>>>
>>> Although our main work is with icXML, we've looked at making this change
>>> in the original Xerces-C code base so that its XML 1.0 5e support stays
>>> compatible with icXML's.
>>>
>>> I'm not entirely sure that I've handled everything, but the following
>>> change works in our testing. The change plan is below and an svn diff
>>> file is attached.
>>>
>>> Here is the change plan.
>>> ----------------------------------
>>>
>>>
>>> (1)  internal/CharTypeTables.hpp
>>>
>>> Rename gFirstNameChars1_1 to be gFirstNameChars
>>> Rename gNameChars1_1 to be gNameChars
>>>
>>> (2) util/XMLChar.cpp
>>> (2a)
>>>     Update initCharFlagTable1_1() to use gFirstNameChars and gNameChars.
>>>     Update initCharFlagTable() to use the set-ups from
>>>     initCharFlagTable1_1() to define gNameCharMask, gNCNameCharMask,
>>>     and gFirstNameCharMask:
>>>      //
>>>      //  Name characters are special. A name is made up of a number of
>>>      //  different tables and some special case characters.
>>>      //
>>>      initOneTable(gNameChars, gNameCharMask);
>>>
>>>      //
>>>      //  Name characters are special. A name is made up of a number of
>>>      //  different tables and some special case characters.
>>>      //
>>>      initOneTable(gNameChars, gNCNameCharMask);
>>>      gTmpCharTable[chColon] &= ~gNCNameCharMask;
>>>
>>>      //
>>>      //  Then do the first name char
>>>      //
>>>      initOneTable(gFirstNameChars, gFirstNameCharMask);
>>>
>>> (2b) #define NEED_TO_GEN_TABLE, then compile and do a sample run of a
>>> Xerces app to generate table.out.
>>>
>>> (2c) Replace the XMLChar1_0::fgCharCharsTable1_0 definition in
>>> XMLChar.cpp with that from table.out.
>>>
>>> (3) XMLChar.hpp
>>>      Modify XMLChar1_0::isFirstNameChar, XMLChar1_0::isFirstNCNameChar,
>>>      XMLChar1_0::isNameChar, and XMLChar1_0::isNCNameChar to each check
>>>      for and allow characters in the #x10000-#xEFFFF range:
>>>
>>>      else {
>>>          if ((toCheck >= 0xD800) && (toCheck <= 0xDB7F))
>>>             if ((toCheck2 >= 0xDC00) && (toCheck2 <= 0xDFFF))
>>>                 return true;
>>>      }
>>>
>>>
>>> (4)  Modify XMLReader::getName and XMLReader::getNCName
>>>         to allow surrogate pairs in Names and NCNames
>>>         (i.e., use the version 1.1 logic for both 1.0 and 1.1).
>>>
>>>
>>> <diff5e>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [email protected]
>>> For additional commands, e-mail: [email protected]
>>
>


