On 10/3/2010 6:19 PM, Robert Hunt wrote:
Dear all,

For the last two weeks I've been looking into creating a small
open repository under the OpenScriptures banner for storing and
maintaining (and even documenting) XML lists of versification schemes
and international booknames, versification mappings, USFM and OSIS
booknames and abbreviations, etc. I've already received a positive
response from the author of Bibledit about starting with some of his
lists. I realise that any particular format will never please everyone,
but I'm interested in your comments on its potential usefulness. I
found that I needed such things personally for a project, so rather than
reinventing them yet again, I figured that every Bible program must need
them, so why not make them available (if they're not already).

I guess my questions are:

    1/ Are these sorts of lists already freely available in a suitable
    place (preferably independent of a particular program)? If so, no
    need for me to proceed.

There's quite a lot of data currently available. Little of it is in XML or any other human-friendly format, but you're welcome to mine our data and put it in a more presentable form. However, if you work with real data (extracting versification (v11n) data from actual Bibles), you'll quickly discover that there are nearly as many v11n systems as there are translations (excluding those that use the KJV v11n exactly, which is actually quite common).
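
To make "extracting v11n data from actual Bibles" concrete, here is a minimal Python sketch. It assumes a Bible is already available as (book, chapter, verse) tuples, records the highest verse number seen in each chapter, and lists the chapters where two texts disagree. The function names and the sample values are illustrative only; this is not the logic of v11nsystem.pl or of any existing tool.

```python
from collections import defaultdict


def extract_v11n(refs):
    """Build {book: [max verse of ch. 1, max verse of ch. 2, ...]}
    from an iterable of (book, chapter, verse) tuples."""
    maxima = defaultdict(dict)
    for book, chapter, verse in refs:
        chapters = maxima[book]
        chapters[chapter] = max(chapters.get(chapter, 0), verse)
    # Chapters that never appear in the input get a verse count of 0.
    return {
        book: [chapters.get(c, 0) for c in range(1, max(chapters) + 1)]
        for book, chapters in maxima.items()
    }


def diff_v11n(a, b):
    """Return (book, chapter, verses_in_a, verses_in_b) for every
    chapter whose verse count differs between the two systems."""
    diffs = []
    for book in sorted(set(a) | set(b)):
        ca, cb = a.get(book, []), b.get(book, [])
        for i in range(max(len(ca), len(cb))):
            va = ca[i] if i < len(ca) else 0
            vb = cb[i] if i < len(cb) else 0
            if va != vb:
                diffs.append((book, i + 1, va, vb))
    return diffs


# Illustrative input: the same one-chapter book with 14 and 15 verses.
text_a = extract_v11n(("3John", 1, v) for v in range(1, 15))
text_b = extract_v11n(("3John", 1, v) for v in range(1, 16))
print(diff_v11n(text_a, text_b))  # [('3John', 1, 14, 15)]
```

A single differing chapter is enough to make two texts count as separate v11n systems under this kind of comparison, which is why the number of systems grows so quickly. The sketch only compares verse counts; it does not attempt to map verses between systems.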

Most of our data is at https://crosswire.org/svn/sword-tools/trunk/versification/ (including the basicv11ns subdirectory). The v11nsystem.pl script will generate a v11n definition file, in the format that we use, from a variety of input formats.

An explanation of our canon definition format, found in the XML files at the above address, is at http://www.crosswire.org/wiki/Alternate_Versification/Canon_Definition_Format

CCEL has data (v11n & mapping) that they received from Wycliffe, presented in an XML (OSIS-like) format: http://www.ccel.org/refsys/refsys.html. However, their data is extremely inaccurate. (I don't know who is to blame for the inaccuracy & errors.)

There's also v11n & mapping data available as part of the STEP spec: http://www.crosswire.org/bsisg/download.htm

As for localized book names, Logos has a ton of this data, which they collected through community contributions back around 2000, when they were gearing up to release Logos Series X. They may or may not be amenable to sharing this.

    Assuming they're not:
    2/ Where might the information be gleaned from (with suitable
    permissions)?
    3/ Apart from the above (versification schemes & mappings, USFM/OSIS
    booknames/filename/abbreviation standards, international
    booknames/abbreviations), what other lists do you suggest might be
    useful?
    4/ Would your program be interested in taking advantage of such XML
    lists?
    5/ If not, would another format be helpful?

XML is fine; we can make converters. Our interest in using this kind of data would depend on its utility and its accuracy.

--Chris

_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
