Re: Training tesseract: Dictionary files and DangAmbigs file not effective?

Debayan Banerjee Wed, 15 Apr 2009 01:55:00 -0700

2009/4/10 Jon <[email protected]>:
>
> I may be wrong, but I think /dict/dawg.cpp line 144 doesn't seem to
> consider UTF-8 (parameter 3 is a single byte), and thus fails on my
> Hebrew word.
> I'm still looking into it, it's the first time I'm looking at the
> code.


Well, it certainly does not work for Indic scripts either. That has made
my efforts to implement OCR for Hindi/Bengali difficult.
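
To illustrate the general problem Jon describes (a multi-byte UTF-8
character squeezed through a single-byte parameter), here is a minimal
Python sketch. It is not Tesseract code, and I am not claiming it
reproduces what dawg.cpp does at that line; the function names are
hypothetical and exist only for this illustration.

```python
# Illustration only: NOT Tesseract code. Function names are hypothetical.

def utf8_lengths(word):
    """Return (character, number of UTF-8 bytes) for each code point."""
    return [(ch, len(ch.encode("utf-8"))) for ch in word]


def first_byte_only(ch):
    """Mimic passing a character through a one-byte parameter:
    only the first byte of its UTF-8 encoding survives."""
    return ch.encode("utf-8")[:1]


hebrew = "שלום"    # Hebrew letters take 2 bytes each in UTF-8
hindi = "नमस्ते"    # Devanagari code points take 3 bytes each in UTF-8

for word in (hebrew, hindi):
    for ch, n in utf8_lengths(word):
        print(repr(ch), n, "byte(s), first byte:", first_byte_only(ch).hex())

# Distinct letters can collapse to the same value once truncated to one byte.
hebrew_first_bytes = {first_byte_only(ch) for ch in hebrew}
print("distinct Hebrew letters:", len(set(hebrew)),
      "distinct first bytes:", len(hebrew_first_bytes))
```

If a word list is stored in UTF-8 and looked up one byte at a time as
if each byte were a whole letter, any script outside ASCII would be
affected in this way, which would match what both of us are seeing.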
-- 
Be Intelligent, Use GNU/Linux

http://debayanin.googlepages.com/
http://debayan.wordpress.com
http://lug.nitdgp.ac.in

