On 09/15/2012 09:56 AM, Greg Hellings wrote:
To emphasize that we have an issue here in the SWORD filters, here is
the output from diatheke with HTML, HTMLHREF and XHTML (support for
which I just hacked in so that I could test).

greg@Gateway08:~/Source/sword/build (master)$ !diath
diatheke -b TKE -o h -f HTMLHREF -k Gen 1:2
Genesis 1:2: Elaboya kayawomele naari kayanna dhego. Yaali mahinje
ooddiiha ni owoopiha yahuruwedhiwe ni yiihi. Muneba wa Mulugu
waviravira vadhulu va mahinje, osasanyedhelaga.  <!/P><br />
(TKE)
greg@Gateway08:~/Source/sword/build (master)$ diatheke -b TKE -o h -f HTML -k Gen 1:2
<meta http-equiv="content-type" content="text/html;
charset=UTF-8">Genesis 1:2: Elaboya kayawomele naari kayanna dhego.
Yaali mahinje ooddiiha ni owoopiha yahuruwedhiwe ni yiihi. Muneba wa
Mulugu waviravira vadhulu va mahinje, osasanyedhelaga.  <div
eID="gen11" type="paragraph"/><br />
(TKE)
greg@Gateway08:~/Source/sword/build (master)$ diatheke -b TKE -o h -f XHTML -k Gen 1:2
Genesis 1:2: Elaboya kayawomele naari kayanna dhego. Yaali mahinje
ooddiiha ni owoopiha yahuruwedhiwe ni yiihi. Muneba wa Mulugu
waviravira vadhulu va mahinje, osasanyedhelaga.  <div eID="gen11"
type="paragraph"/>
(TKE)

All three are outputting the same verse from the same module. HTML and
XHTML are outputting <div eID="gen11" type="paragraph"/> which is what
the module has in its rawest form. HTMLHREF outputs <!/P> which is not
valid anything. There are other, odd, differences between the three
but none of those are germane to this discussion, it would seem to me.

HTML & XHTML are obviously problematic because eID isn't part of (X)HTML, but it's arguable that there is no problem with the HTMLHREF output. HTMLHREF is a proprietary format that was developed for GnomeSword, so it has extra stuff in it for subsequent processing within that application. It's HTML-ish, but not standard HTML, and the degree to which it violates HTML specs is really a matter to be decided by the Xiphos developers & anyone else using this format.
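
For anyone who wants to check other modules for the same thing, here is a rough Python sketch that scans rendered output for the two kinds of leftovers seen above. It is only an illustration: the function name and the regular expressions are my own assumptions, not anything in SWORD or diatheke, and the patterns cover just the sID/eID milestone attributes and the "<!/P>"-style marker from this thread.

```python
import re

# Assumption: unconverted OSIS milestones appear as tags carrying an
# sID or eID attribute, as in <div eID="gen11" type="paragraph"/>.
OSIS_MILESTONE = re.compile(r'<[^<>]*\b[se]ID\s*=\s*"[^"]*"[^<>]*>')

# Assumption: HTMLHREF-specific markers look like <!P> or <!/P>.
HTMLHREF_MARKER = re.compile(r'<!/?[A-Za-z]+>')


def find_leftovers(rendered):
    """Return the non-HTML fragments found in a rendered verse."""
    return {
        "osis_milestones": OSIS_MILESTONE.findall(rendered),
        "htmlhref_markers": HTMLHREF_MARKER.findall(rendered),
    }


# Tail ends of the outputs quoted in this thread.
html_out = 'osasanyedhelaga.  <div eID="gen11" type="paragraph"/><br />'
href_out = 'osasanyedhelaga.  <!/P><br />'

for name, text in (("HTML", html_out), ("HTMLHREF", href_out)):
    print(name, find_leftovers(text))
```

Whether a hit in the second list counts as a bug depends on the consumer, since HTMLHREF is not meant to be standard HTML.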

All of the extra stuff in the HTML output is inserted by Diatheke after the render filters have been applied to indicate the character encoding & add linebreaks. No one has bothered to add similar markup for the other HTML filters, and I would not necessarily argue that anything should be added.

--Chris


_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
