Re: [sword-devel] Spelling (was Versification/Encoding issues)
http://www.wipfandstock.com/

WIPF and STOCK Publishers

Wipf and Stock publishes new works in theology, biblical studies, church history, philosophy and related disciplines. Our vision is to publish according to the merits of content rather than exclusively to the demands of the marketplace. We continually accept new manuscripts for our different imprints: Wipf and Stock, Cascade Books, Pickwick Publications and Resource Publications.

F. H. A. Scrivener's The Authorized Edition of the English Bible (1611) - Its Subsequent Reprints and Modern Representatives was republished in 2004 by WIPF and STOCK Publishers, Eugene, Oregon. ISBN 1-59244-634-5

http://wipfandstock.com/store/The_Authorized_Version_of_the_English_Bible_1611_Its_Subsequent_Reprints_and_Modern_Representatives

I have just acquired a copy.

-- David

--
View this message in context: http://www.nabble.com/Versification-Encoding-Issues-tp21341395p21401180.html
Sent from the SWORD Dev mailing list archive at Nabble.com.
___
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
Re: [sword-devel] Spelling (was Versification/Encoding issues)
Using Tesseract to help the Irish New Testament project is suggested. See
http://www.crosswire.org/wiki/Non-CrossWire_Text-Development_Projects#Individual_Works

We should try to establish personal contact with Pastor Craig Ledbetter.
http://www.biblebc.com/Projects/irish_new_testament_project.htm

I think CrossWire could provide some useful technical help.

-- David

Peter von Kaehne wrote:

Mike Hart wrote:

That's interesting, because ancle is one of the words I corrected in JSFB -- the OCR had ancle, but the PDF itself, my paper KJV copy, and my JPS complete Tanach (individual volumes) had ankle... I can't say what verse it was; at the time I was hunting for e's that had been OCR'd into c's (search for 'regular expression' [bcdfghjklmnpqrstvwxy]c[bcdfgjklmnpqrstvwx] in kwrite).

You should have a look at Troy's work with tesseract. Rather than running search and replace on a badly OCR'd text, he seems to have figured out how to educate tesseract with one or two sample pages until it does the right thing. That might be way easier, and with a better outcome in the long term, for you too.

Peter

--
View this message in context: http://www.nabble.com/Versification-Encoding-Issues-tp21341395p21368903.html
Re: [sword-devel] Spelling (was Versification/Encoding issues)
Please refer to:

Main author: Norton, David.
Title details: A textual history of the King James Bible / David Norton.
Published: Cambridge : Cambridge University Press, 2005.
Physical desc.: ix, 387 p. ; 26 cm.
Identifier: ISBN: 0521771005
Notes: Partially based on F.H.A. Scrivener's The Authorised edition of the English Bible, 1611. Includes bibliographical references and indexes.
Contents: Part I. The History: -- Making the text -- Pre-1611 evidence for the text -- The first edition -- The King's printer at work, 1612-17 -- Correcting and corrupting the text, 1629-1760 -- Setting the standard, 1762-1769 -- The current text -- Part II. The New Edition: -- Variants and orthography -- Punctuation and other matters -- Appendices 1-9.

http://www.copac.ac.uk/wzgw?field=titerms=textual%20history%20of%20the%20King%20James%20Bible

-- David

--
View this message in context: http://www.nabble.com/Versification-Encoding-Issues-tp21341395p21369013.html
Re: [sword-devel] Spelling (was Versification/Encoding issues)
Review

‘His scholarship cannot be bettered … In the face of centuries of highly coloured myths, his dogged and committed analytic detail is greatly to be welcomed.’
Professor David Daniell, Emeritus Professor of English at UCL

‘… meticulously researched and clearly written … This is a tremendous achievement and a valuable addition to biblical literature.’
Contemporary Reviews

‘In recounting the history of the textual transmission of the English Bible, Dr Norton has produced an impressive piece of work. Not only does he provide a mass of information on a much fuller scale than has ever been attempted before, but he presents it with admirable clarity, using such manuscript evidence as is available, and full lists and tables of variant readings as well as the resources of computer technology. In short, it is a milestone in its particular field, and other scholars and students will find it indispensable.’
Canon Professor J. R. Porter, Professor Emeritus of Theology, Exeter University, Church Times

‘… work of awe-inspiring diligence. … one can have nothing but praise for a beautifully written and handsomely presented piece of work.’
Epworth Review

'His book will be the definitive work on the subject for a long time to come, and is unlikely to be superseded unless and until significant new evidence comes to light …'
Churchman

'… detailed analysis … formidable attention … an essential reference.'
Religious Studies Review

Pasted from the Amazon.co.uk page

-- David

David Haslam wrote:

Please refer to:

Main author: Norton, David.
Title details: A textual history of the King James Bible / David Norton.
Published: Cambridge : Cambridge University Press, 2005.
Physical desc.: ix, 387 p. ; 26 cm.
Identifier: ISBN: 0521771005
Notes: Partially based on F.H.A. Scrivener's The Authorised edition of the English Bible, 1611. Includes bibliographical references and indexes.
Contents: Part I. The History: -- Making the text -- Pre-1611 evidence for the text -- The first edition -- The King's printer at work, 1612-17 -- Correcting and corrupting the text, 1629-1760 -- Setting the standard, 1762-1769 -- The current text -- Part II. The New Edition: -- Variants and orthography -- Punctuation and other matters -- Appendices 1-9.

http://www.copac.ac.uk/wzgw?field=titerms=textual%20history%20of%20the%20King%20James%20Bible

-- David

--
View this message in context: http://www.nabble.com/Versification-Encoding-Issues-tp21341395p21371853.html
Re: [sword-devel] Spelling (was Versification/Encoding issues)
On issue 4, spelling: I've taken everyone's advice on spelling to heart, and I will try to remain true to the original text copy.

As for spelling, and as a fascinating learning experience, pick up your printed KJV Bible and examine the spelling of the word ankle[s] in Ezekiel 47:3 and Acts 3:7. Some editions have ancle, others have ankle. Ostensibly both streams are based on the Authorised Version of 1769. So Peter's advice is spot on.

-- David

That's interesting, because ancle is one of the words I corrected in JSFB -- the OCR had ancle, but the PDF itself, my paper KJV copy, and my JPS complete Tanach (individual volumes) had ankle... I can't say what verse it was; at the time I was hunting for e's that had been OCR'd into c's (search for 'regular expression' [bcdfghjklmnpqrstvwxy]c[bcdfgjklmnpqrstvwx] in kwrite).

On the subject, but taking an opposing view: if you look at the 1611 text of the KJV, you'll note that some ~50% of the words are spelled differently from what we call the King James Version today, but it doesn't really seem to matter. Read, for example, the 23rd psalm. It is still (or was originally) the same as what we know and memorize in Sunday school at age 9, regardless of the spelling. I don't remember the spelling when I recite.

KJV1611 23rd pfalme
http://www.us.archive.org/GnuBook/?id=holybiblefacsimi00polluoft#804
(there's a zoom button in the upper left margin; it is readable at 50%) (**) - see further note below.

Since the 1769 version is still called the King James and they both read largely the same, I'd say the spelling is not as important as the word (as pronounced). And even then, a good number of words have been 'updated' from the 1611 copy in the 1769 'true' KJV.

I've taken everyone's advice on spelling to heart, and I will try to remain true to the original text copy.
That said, if you look at the quality of the Jewish School and Family Bible scans, you will see that I'm up against a mammoth task just getting a readable text, much less one that is letter-exact. About 10%-20% of the words were misinterpreted by the OCR. I've managed to reverse engineer the OCR process and repair the meaning of most words. That is, an OCR interprets the same font the same way most of the time, so what may appear to be gibberish in the OCR output can be repaired by careful examination of the OCR errors. For example, in JSFB, the italicized words are generally simple short modifier words: the, of, to, etc. The OCR did poorly at interpreting these words, but it was fairly repeatable in how it interpreted them (of turned into o/* or o/' or o/.). I've done countless search-and-replace passes for things like V/ - W, etc., to restore the characters to readable text. What I have now matches the PDF in 95+% of my random checks; the remaining mismatches are mostly missing letters and punctuation. (And no, I'm not trying to keep italicized words: plain text only.)

Additionally, in the JSFB, verses are marked in the margins only. I am restoring the verse indicators to the verse divisions. In volume 1 this is easy, because the verse divisions appear as asterisks. (Don't ask me why; I don't see any divisions in the PDF, but they are there in the 2nd copy of volume 1 on the archive: http://www.archive.org/details/schoolfamilybibl01beni ) In the other volumes, the verse division is generally the nearest punctuation mark, but not always. The "not always" part gets tricky. I'm referring to the JSFB PDF, a hardcopy KJV, and a JPS new Tanach to check.

Additionally, the JSFB has copious footnotes on each page (on average 10 notes a page). I'm unable to devise a capture technique for the notes in this revision, so these are being tossed. The footnote markers present a further problem, in that they mess with the word they're attached to.
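The repair approach described above relies on the OCR misreading a given word the same way each time, which lends itself to a small substitution table. What follows is a minimal sketch in Python, not the procedure actually used on the JSFB text. The three o/ variants come from the message; "V/ - W" is read here as "replace V/ with W", which is an assumption; the names REPLACEMENTS and repair are hypothetical.

```python
# Known OCR misreadings paired with the intended text.
# The "of" variants are taken from the message; the V/ -> W entry
# assumes that "V/ - W" means "replace V/ with W".
REPLACEMENTS = [
    ("o/*", "of"),
    ("o/'", "of"),
    ("o/.", "of"),
    ("V/", "W"),
]


def repair(text, replacements=REPLACEMENTS):
    """Apply each literal substitution in order and return the repaired text."""
    for wrong, right in replacements:
        # str.replace treats both arguments as literal strings,
        # so characters such as * and . need no escaping.
        text = text.replace(wrong, right)
    return text


if __name__ == "__main__":
    print(repair("the house o/* the Lord, V/hen he came"))
```

Blind substitution can also change text that was already correct, so comparing the output against the page image, as the random checks in the message do, remains necessary.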
After all these issues, I, by myself, will never be able to certify the correct spelling of each word from this witness, and that isn't my intention, because there is so much more to do. I'm semi-dyslexic anyway, so editing would never be my strong point. This work has a different (unique to me, anyway) approach to translation (it uses "The Eternal" for the Tetragrammaton, for example) that seems interesting enough to study, and I study in BibleTime or Bible Desktop, so I want it there.

The years 2002-2008 were explosive for online texts. Over 1 million books now reside at the Internet Archive alone, and Google was a bigger (but more recent) operation. However, the bubble is over. The rate of books going online will drop significantly due to Microsoft dropping its program, and Google settling the lawsuits brought against it by the publishing industry. It is my belief that these texts (especially Judaeo-Christian texts) may not always be readily available online, so there is a limited window, while they are being offered for free download, to snag what you can. Also, there are many areas
Re: [sword-devel] Spelling (was Versification/Encoding issues)
Mike Hart wrote:

That's interesting, because ancle is one of the words I corrected in JSFB -- the OCR had ancle, but the PDF itself, my paper KJV copy, and my JPS complete Tanach (individual volumes) had ankle... I can't say what verse it was; at the time I was hunting for e's that had been OCR'd into c's (search for 'regular expression' [bcdfghjklmnpqrstvwxy]c[bcdfgjklmnpqrstvwx] in kwrite).

You should have a look at Troy's work with tesseract. Rather than running search and replace on a badly OCR'd text, he seems to have figured out how to educate tesseract with one or two sample pages until it does the right thing. That might be way easier, and with a better outcome in the long term, for you too.

Peter
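For reference, the regular expression quoted above (a "c" flanked by consonants, used to hunt for e's misread as c's) can also be run outside an editor. The following is a rough sketch in Python, assuming plain ASCII text; the function names suspect_words and propose are hypothetical, and the pattern is copied from the message as written.

```python
import re

# Pattern quoted in the thread: a "c" with a consonant on either side.
SUSPECT = re.compile(r"[bcdfghjklmnpqrstvwxy]c[bcdfgjklmnpqrstvwx]")

# Same character classes as lookarounds, so only the "c" itself is replaced.
FIX = re.compile(r"(?<=[bcdfghjklmnpqrstvwxy])c(?=[bcdfgjklmnpqrstvwx])")


def suspect_words(text):
    """Return the distinct words containing the pattern, in order of first appearance."""
    found = []
    for word in re.findall(r"[A-Za-z]+", text):
        if SUSPECT.search(word.lower()) and word not in found:
            found.append(word)
    return found


def propose(word):
    """Suggest a reading with each flagged "c" replaced by "e"; a human must confirm it."""
    return FIX.sub("e", word)


if __name__ == "__main__":
    for w in suspect_words("His ancle bones received strcngth"):
        print(w, "->", propose(w))
```

The pattern only flags candidates. It also matches legitimate spellings such as ancle or uncle, which is how ancle came to be "corrected" in the first place, so each hit needs checking against the scanned page.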