Re: Scanning alphabets which are not strictly Latin

Thomas Breuel Mon, 23 Feb 2009 10:12:34 -0800

>
>
> That's all very good news.
> There's no problem with training, or training data, we have lots to
> work with.
> The existing OCR engine (OmniPage) seems not to want to work with the
> old Irish script at all, so there's no existing output.
> Is there facility or necessity for training on multiple different
> typefaces, and keeping them in separate models?
> There exist different forms of this script, from different publishers
> etc.



If you train it on just one typeface and then apply it to a completely
different typeface, it won't work well.  But you can train it on one
typeface and then gradually retrain it on other typefaces, starting with the
most similar ones.  That way you can extend it to even very different
typefaces, even if you don't have transcriptions for the text in those other
typefaces.
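
As a rough illustration of the gradual retraining idea, here is a toy sketch
in Python.  It is not OCRopus code and does not use any OCRopus API.  The
nearest-centroid "recognizer", the one-dimensional features, and the helper
names (train, classify, adapt, typeface) are all assumptions made up for the
example.  Each "typeface" is the same two characters with their features
shifted a bit further from the original, and only the first typeface has
transcriptions.

```python
# Toy sketch of gradual adaptation with self-labelling.
# Everything here is hypothetical: a real OCR model is far more complex.

def train(samples):
    """Fit one centroid per label from (feature, label) pairs."""
    sums = {}
    for x, y in samples:
        total, count = sums.get(y, (0.0, 0))
        sums[y] = (total + x, count + 1)
    return {y: total / count for y, (total, count) in sums.items()}

def classify(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))

def adapt(model, unlabelled):
    """Label new data with the current model, then retrain on those labels."""
    pseudo = [(x, classify(model, x)) for x in unlabelled]
    return train(pseudo)

def accuracy(model, samples):
    return sum(classify(model, x) == y for x, y in samples) / len(samples)

def typeface(shift):
    """Made-up data: characters 'a' and 'b', displaced by `shift`."""
    offsets = (-0.2, 0.0, 0.2)
    return ([(shift + d, "a") for d in offsets]
            + [(shift + 4.0 + d, "b") for d in offsets])

# Train on the one typeface that has transcriptions.
base = train(typeface(0.0))
target = typeface(4.5)  # labels used only for scoring, never for training

# Applying the base model directly to the very different typeface.
direct = accuracy(base, target)

# Retraining step by step, most similar typeface first, without labels.
model = base
for shift in (1.5, 3.0, 4.5):
    model = adapt(model, [x for x, _ in typeface(shift)])
gradual = accuracy(model, target)

print("direct:", direct, "gradual:", gradual)
```

In this toy setup the directly applied model confuses the shifted 'a' with
'b', while the step-by-step version follows the drift.  The approach only
works while each step is small enough that most of the model's own labels are
right, which is why the order (most similar typefaces first) matters.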

> I'll give OCRopus a try at home tonight, see if I can get it working
> on OS X.


I recommend you wait until the new release comes out; installation and use
should get significantly simpler.


> We're working on Windows here in the office (sadly), is there a plan
> for a Windows version?


The code is fairly portable and people have ported 0.3 to Windows.  The next
release has fewer dependencies and should be easier to port.

However, it's probably easier for you to run OCRopus under coLinux or
VirtualBox.

Tom

