One fine day, Wed, 2009-06-03 at 19:25, DM Smith wrote:
> On Jun 3, 2009, at 1:36 PM, Mattias Põldaru wrote:
>
> > Hi everybody.
> >
> > It is nice to see you (DM, I suppose) got the osis2mod working in no
> > time at all. There is one more issue with preverse stuff. Some
> > whitespace gets counted as preverse in my file and I think this is
> > wrong, although it isn't complicated at all to remove the whitespace
> > from my source document. I paste an example here.
> >
> >
> > Here is the input osis file. Please correct me if I have something
> > wrong here.
> > <!-- start of example clip -->
> > <div type="bookGroup">
> > <title>Vana Testament</title>
> > <div type="book" osisID="Gen" canonical="true">
> > <title type="main">1. Moosese</title>
> > <div type="section" scope="Gen.1.1-Gen.2.3" >
> > <title>Maailma ja inimese loomine</title>
> > <chapter sID="Gen.1" osisID="Gen.1" />
> > <title type="chapter">1. peatükk</title>
> > <p>
> > <verse sID="Gen.1.1" osisID="Gen.1.1" />
> > Alguses lõi Jumal taevad ja maa.
> > <verse eID="Gen.1.1" />
> > </p>
> > <p>
> > <verse sID="Gen.1.2" osisID="Gen.1.2" />
> > Ja maa oli tühi ja paljas ja pimedus oli sügavuse peal ja Jumala Vaim
> > hõljus vete kohal.
> > <verse eID="Gen.1.2" />
> > </p>
> > <!-- end of example clip -->
> >
> >
> > And here is the corresponding module output. Please notice the
> > preverse that contains only a single space.
> > <!-- start of example clip -->
> > <div sID="gen1" type="bookGroup"/> <title>Vana Testament</title> <div
> > canonical="true" osisID="Gen" sID="gen2" type="book"/> <title
> > type="main">1. Moosese</title> <div sID="gen3" scope="Gen.1.1-Gen.2.3"
> > type="section"/> <title>Maailma ja inimese loomine</title>
> > <chapter osisID="Gen.1" sID="Gen.1"/> <title type="chapter">1.
> > peatükk</title> <div sID="gen4" type="paragraph"/>
> > Alguses lõi Jumal taevad ja maa.
> > <div eID="gen4" type="paragraph"/>
> > <div type="x-milestone" subType="x-preverse" sID="pv1"/><div
> > sID="gen5" type="paragraph"/> <div type="x-milestone" subType="x-preverse"
> > eID="pv1"/> Ja maa oli tühi ja paljas ja pimedus oli sügavuse peal ja
> > Jumala Vaim hõljus vete kohal. <div eID="gen5" type="paragraph"/>
> > <!-- end of example clip -->
>
> The pre-verse contains "<p> " (the paragraph start and the space).
>
> Handling of whitespace is a bit problematic. What osis2mod does is
> replace sequences of whitespace (newlines, spaces and tabs) with a
> single space. If a verse contains leading or trailing space, it is
> trimmed. (I don't think it should do this trimming.)
>
> What osis2mod does not have is knowledge of the containment model of
> the OSIS schema. If it did, it could remove whitespace between
> element tags that don't allow for text.
>
> In this case, the OSIS schema allows for whitespace after the opening
> paragraph tag and before the verse tag. One could have:
> <p>yada yada yada <verse>verse text</verse> yada yada yada</p>
> In that case, it would be inappropriate to trim the whitespace off of
> the text that precedes the verse.
>
> If we can come up with a good heuristic I'd be glad to implement it.
>

For the case I have, it would be sufficient to check whether the preverse
has any printing characters, and not to add a preverse when it is empty.
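To make the idea concrete, here is a rough sketch of the check I mean. It is in Python only for brevity (osis2mod itself is C++), and the function names are made up for this illustration, not taken from the osis2mod source. The tag-stripping regex is a simplifying assumption: it would be fooled by a ">" inside an attribute value, so treat it as an outline of the heuristic rather than a finished implementation.

```python
import re

# Naive tag matcher: an assumption for this sketch, not a real XML parser.
_TAG = re.compile(r"<[^>]*>")
# Runs of spaces, tabs and newlines, as in the normalization DM describes.
_WS = re.compile(r"[ \t\r\n]+")


def collapse_whitespace(text: str) -> str:
    """Replace each run of spaces, tabs and newlines with a single space."""
    return _WS.sub(" ", text)


def has_printing_text(fragment: str) -> bool:
    """Return True if something other than markup and whitespace remains."""
    return _TAG.sub("", fragment).strip() != ""


def wrap_preverse(fragment: str, n: int) -> str:
    """Add x-preverse milestones only when the fragment has printing text.

    A fragment made of tags and whitespace alone is returned unchanged,
    so no empty preverse is emitted for it.
    """
    if not has_printing_text(fragment):
        return fragment
    start = f'<div type="x-milestone" subType="x-preverse" sID="pv{n}"/>'
    end = f'<div type="x-milestone" subType="x-preverse" eID="pv{n}"/>'
    return start + fragment + end
```

With this, the fragment from my example, `<div sID="gen5" type="paragraph"/> `, would pass through without a preverse, while a section title in front of a verse would still be wrapped. It does not touch the `<p>yada yada yada <verse>` case at all, since that text has printing characters and its whitespace is left alone.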
Mattias