On 08/23/2015 08:27 AM, Matěj Cepl wrote:
> On 2015-08-23, 09:29 GMT, Peter von Kaehne wrote:
>> Interesting and quite probably useful would be to know if any 'colonial' translations would benefit from any of these three v11ns. I do think Malagasy was difficult to fit, also Kahunapule might have translations from French Polynesian background.

Yes, I have a variety of difficult-to-fit translations. Mostly this has convinced me that hard-coded versification is bad. This is especially true when you consider verses intentionally omitted for textual criticism reasons; the set of omitted verses tends to vary depending on who is analyzing the text and making the translation decisions.

> There was some talk about a script that would take a given OSIS XML (or something else) and try to match the text against all available versifications, looking for the best match. Does such an animal exist at all? Is it possible?

Yes, this algorithm exists as part of Haiola (next release). It picks the least bad versification of those available (i.e., those hard-coded into the version of the Sword library that I'm using). Badness is minimized as follows:

1. Select the versification(s) with the minimum number of verses in books not in the versification. (This narrows things down when deuterocanon/apocrypha books are present.)
2. Of those selected in step 1, select the versification(s) with the minimum number of verses in chapters not in the versification. (Some versifications lack Malachi 4, for example.)
3. Of those selected in step 2, select the versification(s) with the minimum number of verses that extend beyond the expected number of verses in their chapter.
4. Of those selected in step 3, select a versification with the minimum number of verses defined in the versification but not present in the current text.

The actual implementation is more efficient than the above would suggest: it makes one pass through the Bible, counting the misfits in each category, then one pass at the end to find the least bad fit.
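For illustration only, the four-step selection can be sketched in Python. This is not Haiola's code: the function names, the data layout (a versification as a mapping from book name to a list of per-chapter verse counts), and the toy tables are all assumptions made for the example. Comparing the four counts as a tuple gives the same result as narrowing the candidates step by step, since Python compares tuples position by position.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical layout: book name -> number of verses in each chapter.
Versification = Dict[str, List[int]]
# A verse present in the text: (book, chapter, verse), both numbers 1-based.
VerseRef = Tuple[str, int, int]


def score_versifications(
    v11ns: Dict[str, Versification], refs: Iterable[VerseRef]
) -> Dict[str, Tuple[int, ...]]:
    """Count misfits per versification in one pass over the text.

    Returns name -> (verses in unknown books, verses in unknown chapters,
    verses past the end of their chapter, defined verses absent from text).
    """
    misfits = {name: [0, 0, 0] for name in v11ns}
    matched = {name: set() for name in v11ns}
    for book, chapter, verse in refs:
        for name, v11n in v11ns.items():
            chapters = v11n.get(book)
            if chapters is None:
                misfits[name][0] += 1          # badness level 1
            elif chapter > len(chapters):
                misfits[name][1] += 1          # badness level 2
            elif verse > chapters[chapter - 1]:
                misfits[name][2] += 1          # badness level 3
            else:
                matched[name].add((book, chapter, verse))
    scores = {}
    for name, v11n in v11ns.items():
        defined = sum(sum(counts) for counts in v11n.values())
        # Badness level 4: verses the versification defines but the text lacks.
        scores[name] = (*misfits[name], defined - len(matched[name]))
    return scores


def least_bad_versification(
    v11ns: Dict[str, Versification], refs: Iterable[VerseRef]
) -> str:
    """Pick the name whose score tuple is smallest (first one wins on ties)."""
    scores = score_versifications(v11ns, refs)
    return min(scores, key=scores.get)


# Toy tables for illustration only; they are not Sword's versification data.
TOY_V11NS: Dict[str, Versification] = {
    "toy_4ch": {"Mal": [14, 17, 18, 6]},
    "toy_4ch_apoc": {"Mal": [14, 17, 18, 6], "Tob": [22]},
    "toy_3ch": {"Mal": [14, 17, 24]},
}
```

With these toy tables, a text containing only a four-chapter Malachi fits `toy_4ch` exactly, while `toy_4ch_apoc` loses only at level 4 (its unused extra book) and `toy_3ch` loses at level 2 (no chapter 4).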
The real issue here is how well Sword handles each category of badness. It turns out that badness levels 1 and 2 are pretty bad. Badness level 3 just results in some verses being combined with the previous verse, possibly several verses combined into one. Badness level 4 isn't really that bad and is, in fact, pretty much unavoidable unless you go with per-module versification, because we have lots of partial translations, sometimes just one or two books of the Bible. The only reason I retain that test is to select KJV instead of KJVA, SynodalP instead of Synodal, and NRSV instead of NRSVA for Bibles with just the regular canon of 66 books or a subset of it.

I'm also aware of some other scripts that try to find the best fit, but they aren't as general or reliable as the above algorithm.

--
_______________________________________________ sword-devel mailing list: [email protected] http://www.crosswire.org/mailman/listinfo/sword-devel Instructions to unsubscribe/change your settings at above page
