Re: Tesseract 3.0 with unconnected Indic script

Sriranga(78yrsold) Tue, 29 Mar 2011 23:46:43 -0700

If we succeeded in Sanskrit (Devanagari script), which is the mother
language of the Indic languages, then no doubt Tesseract 3.01 should work.
I have tested with Tamil, which has dependent vowels identical to Bengali
as well as Kannada and Telugu. The only problem is output accuracy, which
can be solved, for the time being, with the help of a post-processor.
The latest version of FreeOCR, 4.1 (July10), has a post-processor developed
by Ralph. I found that the latest tesseract.exe works with FreeOCR to this
day.
Even VietOCR, developed by Quan, has a post-processor feature that works for
Indic scripts as well as Vietnamese.
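
To illustrate the idea of a post-processor, here is a minimal sketch in
Python. It is not the code used by FreeOCR or VietOCR; the function name
post_process and the CORRECTIONS table are hypothetical, and the single
entry is only an assumed example of a Tamil dependent-vowel fix. A real
table would be built from the errors actually seen in the OCR output.

```python
# Minimal sketch of a substitution-based OCR post-processor.
# CORRECTIONS is a hypothetical table, not taken from FreeOCR or VietOCR.
CORRECTIONS = {
    # Tamil: KA + vowel sign E + vowel sign AA -> KA + vowel sign O
    "\u0b95\u0bc6\u0bbe": "\u0b95\u0bca",
}


def post_process(text, corrections=CORRECTIONS):
    """Replace each known wrong sequence in text with its correction."""
    # Apply longer patterns first so a short pattern cannot break up
    # a longer one before it has been matched.
    for wrong in sorted(corrections, key=len, reverse=True):
        text = text.replace(wrong, corrections[wrong])
    return text
```

The recognised text from Tesseract would be passed through post_process
before it is saved or displayed.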


With regards,
-sriranga(78yrs)




On Wed, Mar 30, 2011 at 11:47 AM, Debayan Banerjee <[email protected]> wrote:

> Hi,
>
> I gather that Tesseract 3.0 works well for Chinese script now. The
> hallmark of Chinese script is that it is unconnected (unlike say Hindi
> which has a line connecting all its characters), and it has a large
> number of characters in the alphabet. In this light, I think it should
> also work well with unconnected Indic script such as Kannada,
> Malayalam, Punjabi etc.
>
> Anyone know if this works?
>
> --
> Debayan Banerjee
> http://hacking-tesseract.blogspot.com/
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en.
>
>

