Re: [sword-devel] Versification discovery? [Was: Re: French versifications, was Re: FreJND and HebDelitzsch released]

Kahunapule Michael Johnson Sun, 23 Aug 2015 14:01:25 -0700

On 08/23/2015 08:27 AM, Matěj Cepl wrote:

On 2015-08-23, 09:29 GMT, Peter von Kaehne wrote:

It would be interesting, and quite probably useful, to know whether any 
'colonial' translations would benefit from any of these three 
v11ns. I do think Malagasy was difficult to fit. Also, 
Kahunapule might have translations from a French Polynesian 
background.

Yes, I have a variety of difficult-to-fit translations. Mostly, this has convinced me that hard-coded versification is bad. This is especially true when you consider verses intentionally omitted for textual-criticism reasons: the set of omitted verses tends to vary depending on who is analyzing the text and making the translation decisions.

There was some talk about a script that would take a given OSIS 
XML file (or something else) and try to match the text against all 
available versifications, looking for the best match. Does such an 
animal exist at all? Is it possible?

Yes, this algorithm exists as part of Haiola (next release). It picks the least bad versification of those available (i.e. hard coded into the version of the Sword Library that I'm using). Badness is minimized as follows:
1. Select the versification(s) with the minimum number of verses in books not in the versification. (This narrows things down when deuterocanon/apocrypha books are present.)
2. Of those selected in step 1, select the versification(s) with the minimum number of verses in chapters not in the versification. (Some versifications lack Malachi 4, for example.)
3. Of those selected in step 2, select the versification(s) with the minimum number of verses that extend beyond the expected number of verses in their chapter.
4. Of those selected in step 3, select a versification with the minimum number of verses defined in the versification but not present in the current text.

The actual implementation is more efficient than the steps above suggest: it makes one pass through the Bible, counting the misfits in each category, then one pass at the end to find the least bad fit.
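To illustrate the idea, here is a minimal Python sketch of that counting scheme. It is not the Haiola code. The data layout (a dict mapping each book to a list of per-chapter verse counts), the function names score_all and least_bad, and the toy versifications "plain", "plus" and "short" are all assumptions made up for this example, and the verse counts are toy data, not real Sword tables. It relies on Python comparing tuples lexicographically, which gives the same result as applying the four selection steps in order.

```python
def score_all(v11ns, verses):
    """Count misfits per versification in one pass over the text.

    v11ns:  {name: {book: [verse count of chapter 1, chapter 2, ...]}}
            (hypothetical layout, not the Sword Library's own tables)
    verses: iterable of (book, chapter, verse) with 1-based numbers
    Returns {name: (no_book, no_chapter, overflow, missing)}.
    """
    tallies = {name: [0, 0, 0] for name in v11ns}
    fitted = {name: set() for name in v11ns}
    for book, chapter, verse in verses:          # single pass over the text
        for name, v11n in v11ns.items():
            chapters = v11n.get(book)
            if chapters is None:                 # level 1: book not in v11n
                tallies[name][0] += 1
            elif chapter > len(chapters):        # level 2: chapter not in v11n
                tallies[name][1] += 1
            elif verse > chapters[chapter - 1]:  # level 3: verse past chapter end
                tallies[name][2] += 1
            else:
                fitted[name].add((book, chapter, verse))
    scores = {}
    for name, v11n in v11ns.items():
        defined = sum(sum(chapters) for chapters in v11n.values())
        # level 4: verses the versification defines but the text lacks
        scores[name] = (*tallies[name], defined - len(fitted[name]))
    return scores


def least_bad(v11ns, verses):
    """Final pass: pick the name with the lexicographically smallest tuple."""
    scores = score_all(v11ns, verses)
    return min(scores, key=scores.get)


# Toy data: a four-chapter book, an extra book, and a three-chapter variant.
TOY_V11NS = {
    "plain": {"Mal": [14, 17, 18, 6]},
    "plus": {"Mal": [14, 17, 18, 6], "Tob": [22]},
    "short": {"Mal": [14, 17, 24]},
}
TOY_TEXT = [("Mal", c, v)
            for c, n in enumerate([14, 17, 18, 6], start=1)
            for v in range(1, n + 1)]

print(least_bad(TOY_V11NS, TOY_TEXT))
```

In this toy run, "plain" and "plus" tie on the first three counts, and only the fourth count (verses defined but absent from the text) separates them.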

The real issue here is how well Sword handles each category of badness. It turns out that badness levels 1 and 2 are pretty bad. Badness level 3 just results in some verses being combined with the previous verse, possibly several verses at once. Badness level 4 isn't really that bad and is, in fact, pretty much unavoidable unless you go with per-module versification, because we have lots of partial translations, sometimes just one or two books of the Bible. The only reason I retain that test is to select KJV instead of KJVA, SynodalP instead of Synodal, and NRSV instead of NRSVA for Bibles with just the regular canon of 66 books or a subset of that.

I'm also aware of some other scripts that try to find the best fit, but they aren't as general or reliable as the above algorithm.

Your partner in electronic Bible publishing,

MICHAEL JOHNSON
PO BOX 881143
PUKALANI HI 96788-1143
USA

eBible.org
MLJohnson.org
Mobile: +1 808-333-6921
Skype: kahunapule

