> On Aug 2, 2025, at 9:40 AM, David Haslam <dfh...@protonmail.com> wrote:
> 
> Hi DM,
> 
> Does this part of your reply contain a typo?
> Also, osis2mod will properly convert it to &amp;AMP; And it will output a 
> diagnostic for the conversion.
No typo. &AMP; may be valid in HTML, but it is not valid in XML. XML 
predefines only five entities: &lt;, &gt;, &amp;, &quot; and &apos;. &quot; 
and &apos; are only needed within attribute values, when the mark matches the 
quotation mark that delimits the attribute. &amp; is needed so that a literal 
& can still appear in the text alongside the other four named entities.

CrossWire’s module team always runs an XML validator, xmllint, to check that 
the XML is well-formed and conforms to the schema. It would report &AMP; as 
an error.
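
As a rough illustration of the well-formedness half of that check, here is a 
minimal Python sketch. It uses the standard library's expat-based parser 
rather than xmllint, it does no schema validation, and the helper name 
is_well_formed is made up for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML (no schema validation)."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        # An undefined entity such as &AMP; ends up here.
        return False


print(is_well_formed("<note>&amp;c.</note>"))  # True
print(is_well_formed("<note>&AMP;c.</note>"))  # False
```

The second call fails because the parser has no definition for AMP; entity 
names in XML are case-sensitive.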

When osis2mod encounters a named character entity, it doesn’t assume that 
xmllint or another validator was run first. Instead, it escapes every other 
named entity that is valid in HTML. So &AMP; becomes &amp;AMP;, with a 
warning. The alternative would be to quit at that point because the user 
didn’t run an XML validator first.
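
The escaping described above can be sketched roughly as follows. This is a 
hypothetical Python illustration, not osis2mod's actual code: the function 
name, the regular expression and the warning text are all assumptions, and as 
a simplification it escapes any named entity outside the XML five rather than 
checking the name against the HTML list.

```python
import re
import sys

# The five entities that XML predefines; these are left untouched.
XML_PREDEFINED = {"lt", "gt", "amp", "quot", "apos"}

# A named entity reference: '&', a name, ';'.
# Numeric references (&#39; or &#x27;) start with '#' and do not match.
NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def escape_unknown_entities(text: str) -> str:
    """Escape the '&' of every named entity that XML does not predefine."""

    def fix(match: re.Match) -> str:
        name = match.group(1)
        if name in XML_PREDEFINED:  # case-sensitive, so 'AMP' is not 'amp'
            return match.group(0)
        # Hypothetical diagnostic; the real tool's wording will differ.
        print(f"warning: &{name}; is not an XML entity, escaping it",
              file=sys.stderr)
        return f"&amp;{name};"

    return NAMED_ENTITY.sub(fix, text)


print(escape_unknown_entities("Adam &AMP; Eve, &amp;c."))
```

Running the result through the function a second time changes nothing, since 
&amp;AMP; now starts with a valid XML entity.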

In Him,
        DM

> 
> Best regards,
> 
> David
> 
> Sent with Proton Mail <https://pr.tn/ref/SWXT9A5YZ67G> secure email.
> 
> On Thursday, July 31st, 2025 at 2:35 PM, DM Smith <dmsm...@crosswire.org> 
> wrote:
>> See below.
>> 
>>> On Jul 31, 2025, at 8:53 AM, David Haslam <dfh...@protonmail.com 
>>> <mailto:dfh...@protonmail.com>> wrote:
>>> 
>>> Hi DM,
>>> 
>>> Further to the feedback relating to the recent updates of the FreGeneve1669 
>>> module...
>>> 
>>> Has anyone yet created an issue in JIRA for the failure of libsword to 
>>> output the & character wherever a module contains the XML entity &amp; ?
>> 
>> I didn’t see any when I surveyed the Jira issues. I noticed that &c. is 
>> quite common in the KJV margin notes, and front matter.
>> 
>>> 
>>> Which other XML entities have the same issue?
>>> Likely candidates include &lt; and &gt;
>>> There may be others such as &apos; and &quot;
>> 
>> Yes these are other possible problems.
>> 
>>> 
>>> Does the issue also pertain to their numerical equivalents, whether in 
>>> decimal or hexadecimal form?  e.g. &#x0027;
>> 
>> We have a Jira issue to convert these to their unicode equivalent. I’m 
>> almost done with that code, having started yesterday.
>> 
>>> Any remedy should also take account that some such entities are also valid 
>>> when the name is uppercase, e.g. &AMP;
>> 
>> The remedy is that xmllint will complain as it is not valid in xml. This 
>> should be a friend of all OSIS module makers.
>> 
>> Also, osis2mod will properly convert it to &amp;AMP; And it will output a 
>> diagnostic for the conversion.
>> 
>>> 
>>> Further reading: 
>>> https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
>>> 
>>> Best regards,
>>> 
>>> David
>>> 
>>> _______________________________________________
>>> sword-devel mailing list: sword-devel@crosswire.org 
>>> <mailto:sword-devel@crosswire.org>
>>> http://crosswire.org/mailman/listinfo/sword-devel
>>> Instructions to unsubscribe/change your settings at above page
> 
