Research into OCR of complex alphabets

laerm Tue, 14 Oct 2008 10:34:37 -0700

Hello Team Ocropus :)

Over the summer, I did a research project OCRing Chinook, an obscure
Native American language with a rather complex alphabet. It was
definitely slow-going. We tested four software packages - OmniPage,
ReadIRIS, the native app in the Xerox scanner we used, as well as the
plugin for Acrobat - and ReadIRIS was the best. It was the only one
that had both training and Unicode support. I tried to get Ocropus
going, but my Linux knowledge wasn't good enough to configure all of
the packages.


Now it has come time to write a paper for publication on this project,
and I want to know more about OCR. I figure you folks are experts, so
I was wondering if I could talk to some people about the way OCR works. I
am specifically interested in the training aspect, as well as how it
parses individual characters. Furthermore, if anyone has any
experience using OCR on complex alphabets, I'd love to talk to you.

A bit of background on Chinook:
- 30 characters
- several accents for vowels and consonants
- accent marks, prime marks, and glottalization marks

Thanks for any and all help.

---
micah stupak
[EMAIL PROTECTED]
