On 10/3/2010 6:19 PM, Robert Hunt wrote:
Dear all, for the last two weeks I've been looking into creating a small open repository under the OpenScriptures banner for storing and maintaining (and even documenting) XML lists of versification schemes and international booknames, versification mappings, USFM and OSIS booknames and abbreviations, etc. I've already received a positive response from the author of Bibledit about starting with some of his lists. I realise that any particular format will never please everyone, but I'm interested in your comments on its potential usefulness. I found that I needed such things personally for a project, so rather than reinventing them yet again, I figured that every Bible program must need them, so why not make them available (if they're not already). I guess my questions are:
1/ Are these sorts of lists already freely available in a suitable place (preferably independent of a particular program)? If so, no need for me to proceed.
There's quite a lot of data currently available. Little of it is in XML or any other human-friendly format, but you're welcome to mine our data and put it in a more presentable form. However, if you work with real data (extracting v11n data from actual Bibles), you'll quickly discover that the number of v11n systems is nearly equal to the number of different translations (excluding those that use the KJV v11n exactly, since that system actually is quite common).
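To make the "extracting v11n data from actual Bibles" step concrete, here is a minimal Python sketch. It assumes you have already parsed a text into (book, chapter, verse) tuples; the function names `derive_v11n` and `diff_v11n` and the sample references are made up for illustration and are not part of any SWORD tool.

```python
from collections import defaultdict

def derive_v11n(refs):
    """Build {book: [verse count of ch. 1, ch. 2, ...]} from (book, chapter, verse) tuples.

    The verse count of a chapter is taken to be the highest verse number seen,
    which is an assumption: it ignores gaps and sub-verses such as "3a".
    """
    maxima = defaultdict(dict)
    for book, chapter, verse in refs:
        chapters = maxima[book]
        chapters[chapter] = max(chapters.get(chapter, 0), verse)
    return {
        book: [chapters.get(c, 0) for c in range(1, max(chapters) + 1)]
        for book, chapters in maxima.items()
    }

def diff_v11n(a, b):
    """List (book, chapter, count_in_a, count_in_b) wherever two systems disagree."""
    diffs = []
    for book in sorted(set(a) | set(b)):
        ca, cb = a.get(book, []), b.get(book, [])
        for i in range(max(len(ca), len(cb))):
            va = ca[i] if i < len(ca) else 0
            vb = cb[i] if i < len(cb) else 0
            if va != vb:
                diffs.append((book, i + 1, va, vb))
    return diffs

# Illustrative sample data only, not a complete Bible.
text_a = derive_v11n([("Gen", 1, 31), ("Gen", 2, 25), ("3John", 1, 14)])
text_b = derive_v11n([("Gen", 1, 31), ("Gen", 2, 25), ("3John", 1, 15)])
print(diff_v11n(text_a, text_b))  # [('3John', 1, 14, 15)]
```

Running something like this over several real translations is a quick way to see how many distinct systems you actually end up with.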
Most of our data is at https://crosswire.org/svn/sword-tools/trunk/versification/ (including the basicv11ns subdirectory). The v11nsystem.pl script takes input in a variety of formats and generates a v11n definition file in the format that we use.
An explanation of our canon definition format, found in the XML files at the above address, is at http://www.crosswire.org/wiki/Alternate_Versification/Canon_Definition_Format
CCEL has data (v11n & mapping) that they received from Wycliffe, presented in an XML (OSIS-like) format: http://www.ccel.org/refsys/refsys.html. However, their data is extremely inaccurate. (I don't know who is to blame for the inaccuracy & errors.)
There's also v11n & mapping data available as part of the STEP spec: http://www.crosswire.org/bsisg/download.htm
As for localized book names, Logos has a ton of this data, which they collected through community contributions back around 2000, when they were gearing up to release Logos Series X. They may or may not be amenable to sharing this.
Assuming they're not:
2/ Where might the information be gleaned from (with suitable permissions)?
3/ Apart from the above (versification schemes & mappings, USFM/OSIS booknames/filename/abbreviation standards, international booknames/abbreviations), what other lists do you suggest might be useful?
4/ Would your program be interested in taking advantage of such XML lists?
5/ If not, would another format be helpful?
XML is fine; we can make converters. Our interest in using this kind of data would be dependent on its utility and its accuracy.
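As a rough idea of what such a converter could look like, here is a Python sketch that reads a versification list from XML and turns it into a plain dictionary (and from there into JSON). The element and attribute names (`v11n`, `book`, `osisID`, `chapter`, `n`, `verses`) are hypothetical, invented for this example; they are not the canon definition format or any existing schema.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical input; the schema is an assumption made for this sketch.
HYPOTHETICAL_XML = """
<v11n name="Example">
  <book osisID="Gen">
    <chapter n="2" verses="25"/>
    <chapter n="1" verses="31"/>
  </book>
  <book osisID="3John">
    <chapter n="1" verses="14"/>
  </book>
</v11n>
"""

def v11n_xml_to_dict(xml_text):
    """Convert the hypothetical XML above into {"name": ..., "books": {id: [counts]}}."""
    root = ET.fromstring(xml_text.strip())
    books = {}
    for book in root.findall("book"):
        # Sort by chapter number so the list index matches chapter order.
        chapters = sorted(book.findall("chapter"), key=lambda ch: int(ch.get("n")))
        books[book.get("osisID")] = [int(ch.get("verses")) for ch in chapters]
    return {"name": root.get("name"), "books": books}

converted = v11n_xml_to_dict(HYPOTHETICAL_XML)
print(json.dumps(converted))
```

A converter to any program's own format would follow the same pattern, with only the output step changing.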
--Chris