I’ve just checked in a change for osis2mod.

MODTOOLS-17: osis2mod now converts hex and decimal numeric entities to 
UTF-8, with special handling of <, >, &, ', and ".

Also:
 * Fixed a bug in which the hex numeric entity form was defined as &xHHHH; 
rather than &#xHHHH;.
 * Added a sanity check limiting entities to a maximum length of 32.
 * Refactored entity handling into handleEntities and comment handling into 
handleComments.
 * Changed t_entitytype and t_commentstate into class enums EntityType and 
CommentState.
 * Added debug flag -d 1024 for entity and comment parsing.

Note: The code allows zero-padding of numeric entities (e.g. &#038; or 
&#x0026;).
Note: The five characters &, <, >, ", and ' need to be treated specially:
        &#38; or &#x26; → &amp;
        &#60; or &#x3C; → &lt;
        &#62; or &#x3E; → &gt;
        &#34; or &#x22; → &quot; or "
        &#39; or &#x27; → &apos; or '
Once converted to these forms, &quot; should be transformed into " except in 
attributes delimited by ", and likewise &apos; into ' except in attributes 
delimited by '.
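
For anyone who wants to see the idea in miniature, here is a rough sketch of 
the conversion described above. It is not the code that was checked in: the 
names toUtf8, convertNumericEntity and kMaxEntityLength are made up for 
illustration, and I am assuming the limit of 32 counts the whole entity 
including the & and the ;. The context-dependent step of turning &quot; and 
&apos; back into " and ' is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8.
std::string toUtf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Assumed limit: whole entity, including '&' and ';'.
const std::size_t kMaxEntityLength = 32;

// Hypothetical: convert one complete entity such as "&#38;" or "&#x0026;".
// On success, writes the replacement text to 'out' and returns true.
// Returns false if 'entity' is not a well-formed numeric entity.
bool convertNumericEntity(const std::string& entity, std::string& out) {
    if (entity.size() < 4 || entity.size() > kMaxEntityLength) return false;
    if (entity[0] != '&' || entity[1] != '#' || entity[entity.size() - 1] != ';')
        return false;

    std::size_t pos = 2;
    std::uint32_t base = 10;
    if (entity[pos] == 'x') { base = 16; ++pos; }   // &#xHHHH; form
    const std::size_t end = entity.size() - 1;       // index of ';'
    if (pos >= end) return false;                    // no digits at all

    std::uint32_t cp = 0;
    for (; pos < end; ++pos) {                       // leading zeros are harmless
        const char c = entity[pos];
        std::uint32_t digit;
        if (c >= '0' && c <= '9') digit = c - '0';
        else if (base == 16 && c >= 'a' && c <= 'f') digit = c - 'a' + 10;
        else if (base == 16 && c >= 'A' && c <= 'F') digit = c - 'A' + 10;
        else return false;
        cp = cp * base + digit;
        if (cp > 0x10FFFF) return false;             // beyond Unicode range
    }
    if (cp == 0 || (cp >= 0xD800 && cp <= 0xDFFF)) return false;

    // The five special characters stay escaped as named entities.
    switch (cp) {
        case 0x26: out = "&amp;";  return true;
        case 0x3C: out = "&lt;";   return true;
        case 0x3E: out = "&gt;";   return true;
        case 0x22: out = "&quot;"; return true;
        case 0x27: out = "&apos;"; return true;
    }
    out = toUtf8(cp);
    return true;
}
```

With this sketch, &#38; and &#x0026; both come out as &amp;, &#xE9; comes out 
as the two-byte UTF-8 sequence for é, and the old buggy form &xE9; is 
rejected.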

I need to update the wiki to match.

In Him,
        DM Smith

_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
