On Wed, Aug 22, 2012 at 12:25:45PM +0100, Nick White wrote:
> To give you a concrete example, with my Ancient Greek training, I
> needed to ensure that characters with a 'breathing' mark over them
> were only recognised on the first character of a word, or the second
> in the case of some digraphs. I ended up doing this by creating a
> ton of unicharambigs rules (about 35,000 - I wrote a program to do it),
> mostly of the form:
>
> 2 αἀ 2 αά 1

For those interested, I just uploaded the program in question to my webspace; see breathingambigs.c at http://www.dur.ac.uk/nick.white/tools/
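
For anyone who wants a feel for the approach without reading the C source, here is a rough Python sketch of how rules of the quoted form could be generated. It is not a port of breathingambigs.c. The function names, the short character lists, the choice to swap a smooth breathing for an acute accent (taken from the single example rule above), and the tab/space field layout are all assumptions made for illustration. It also ignores the digraph exception mentioned in the quote.

```python
import unicodedata

SMOOTH = "\u0313"  # combining comma above (smooth breathing)
ACUTE = "\u0301"   # combining acute accent

# Assumption: lowercase Greek letters alpha..omega as the "preceding" character.
LETTERS = [chr(c) for c in range(0x03B1, 0x03CA)]

# Assumption: only the seven vowels carrying a smooth breathing and no other mark
# (precomposed code points U+1F00, U+1F10, ... U+1F60).
SMOOTH_VOWELS = "\u1f00\u1f10\u1f20\u1f30\u1f40\u1f50\u1f60"


def swap_breathing_for_acute(ch):
    """Return ch with its smooth breathing replaced by an acute accent.

    Returns None when ch carries no smooth breathing.
    """
    decomposed = unicodedata.normalize("NFD", ch)
    if SMOOTH not in decomposed:
        return None
    return unicodedata.normalize("NFC", decomposed.replace(SMOOTH, ACUTE))


def make_rules(preceding, marked):
    """Build one mandatory-replacement rule per (preceding letter, marked vowel).

    Assumed layout: tab-separated fields, space-separated characters,
    with a trailing 1 as the type field, mirroring the example rule.
    """
    rules = []
    for prev in preceding:
        for ch in marked:
            repl = swap_breathing_for_acute(ch)
            if repl is None:
                continue
            rules.append(f"2\t{prev} {ch}\t2\t{prev} {repl}\t1")
    return rules


if __name__ == "__main__":
    for rule in make_rules(LETTERS, SMOOTH_VOWELS)[:5]:
        print(rule)
```

The idea is that a breathing-marked vowel directly after another letter cannot be word-initial, so each such pair is rewritten to a visually similar pair without the breathing. A real rule set would need far more characters and combinations than this, which is presumably how the count reaches the tens of thousands.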

