On Wed, May 23, 2012 at 11:10 PM, Falke <[email protected]> wrote:

> From what I see, there is no traineddata for the Roman latin
> alphabet. Essentially, the current eng.traineddata's shortcoming is
> its lack of the macron diacritic.
>
> Is it possible to add the macron glyphs to the already-existing
> eng.traineddata? (the Ā, ā, Ē, ē, Ō, ō, Ū, ū)

No, it is not possible (AFAIK). But you can try training only the missing glyphs and then use (in 3.02) "-l eng+missing_glyphs".
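
For illustration, here is a minimal Python sketch of how the "-l eng+missing_glyphs" invocation could be assembled. The helper names (build_tesseract_command, run_ocr) and the file names are hypothetical, and "missing_glyphs" stands for whatever you name your own traineddata file; it assumes the tesseract binary is on PATH and that both traineddata files are in the tessdata directory.

```python
import subprocess


def build_tesseract_command(image_path, output_base, languages):
    """Return the argv list for one tesseract run over several languages.

    Hypothetical helper: it only assembles the command line
    `tesseract <image> <outputbase> -l lang1+lang2`.
    """
    if not languages:
        raise ValueError("at least one language name is required")
    # Multiple traineddata files are combined by joining their names with '+'.
    lang_arg = "+".join(languages)
    return ["tesseract", image_path, output_base, "-l", lang_arg]


def run_ocr(image_path, output_base, languages):
    """Run tesseract; needs the binary and the traineddata files installed."""
    subprocess.run(
        build_tesseract_command(image_path, output_base, languages),
        check=True,
    )


if __name__ == "__main__":
    # "missing_glyphs" is an assumed name for the separately trained data.
    cmd = build_tesseract_command("page.tif", "page", ["eng", "missing_glyphs"])
    print(" ".join(cmd))
```

Calling run_ocr("page.tif", "page", ["eng", "missing_glyphs"]) would then write the recognized text to page.txt, provided the separately trained macron data exists.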
> On a tangential issue: it's almost comical how, in practice, there is
> no easy way to "google" for information on the Roman Empire (and
> Catholic Church) Latin language and alphabet and glyphs, because the
> other conventional use of the phrase "latin alphabet" (referring to
> the modern latinate derivatives and descendants) gets in the way!
>
> I think a new convention and descriptor needs to be established, that
> uniquely refers to and denotes the alphabet used by ancient Romans
> (the "real" Latin!)... (but then, again, it differs mostly with the
> macron :-)

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en

