Re: [sword-devel] Spelling (was Versification/Encoding issues)

Peter von Kaehne Thu, 08 Jan 2009 16:12:06 -0800

Mike Hart wrote:
> That's interesting, because ancle is one of the words I corrected in
> JSFB -- the OCR had ancle, but the PDF itself, my paper KJV copy, and
> my JPS complete Tanach (individual volumes) had ankle...  I can't say
> what verse it was, at the time I was hunting for e's that had been
> OCR'd into c's  (search for 'regular expression'
> [bcdfghjklmnpqrstvwxy]c[bcdfgjklmnpqrstvwx] in kwrite)


You should have a look at Troy's work with tesseract. Rather than running
search and replace over badly OCRed text, he seems to have figured out
how to "educate" tesseract with one or two sample pages until it does the
right thing. That might be much easier for you too, with a better outcome
in the long term.
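
If you stay with the search approach in the meantime, the quoted kwrite
pattern could be scripted so it lists every suspect word with its line
number. This is only a rough sketch: the function name find_suspects and
the sample text are made up for illustration, and the character classes
are copied as quoted, not tuned.

```python
import re

# Character classes copied verbatim from the quoted kwrite search:
# a consonant, then "c", then a consonant. An "e" misread as "c"
# often ends up between two consonants.
SUSPECT = re.compile(r"[bcdfghjklmnpqrstvwxy]c[bcdfgjklmnpqrstvwx]")

def find_suspects(text):
    """Return (line_number, word) pairs for words matching the pattern.

    Hypothetical helper for illustration; it only flags candidates,
    a human still has to check each one against the printed source.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for word in re.findall(r"[A-Za-z']+", line):
            if SUSPECT.search(word.lower()):
                hits.append((lineno, word))
    return hits

if __name__ == "__main__":
    # Invented sample text, not taken from any module.
    sample = "his strcngth and ancle bones\nthe church stood"
    for lineno, word in find_suspects(sample):
        print(lineno, word)
```

Because "h" is missing from the second class, ordinary words like
"church" are not flagged, while "strcngth" and "ancle" are. The output is
a list to review by hand, not an automatic correction.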

Peter

_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
