> The best fix for the unconnected
> scripts may be to break them into sub-akshara glyphs and recognize those
> separately.

After correct recognition, is there a method to put the output in the
accepted form of the language?
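
To make the question concrete: in Malayalam, for example, the vowel signs
െ, േ and ൈ are printed to the left of the consonant, but Unicode stores them
after it. The kind of post-processing I have in mind could look roughly like
the sketch below. It is only an illustration, not anything Tesseract
provides: the function name visual_to_logical is made up, and it assumes the
recognizer emits one Unicode code point per sub-akshara glyph in left-to-right
visual order, with the virama reported as its own glyph.

```python
# Malayalam vowel signs printed to the LEFT of the consonant but stored
# AFTER it in Unicode logical order: U+0D46, U+0D47, U+0D48.
PRE_BASE_SIGNS = {"\u0D46", "\u0D47", "\u0D48"}
VIRAMA = "\u0D4D"  # joins consonants into a conjunct


def visual_to_logical(glyphs):
    """Reorder code points from visual (left-to-right) order to Unicode order.

    Hypothetical helper: `glyphs` is assumed to be a list of single code
    points, one per recognized sub-akshara glyph.
    """
    out = []
    pending = None  # pre-base sign waiting for its consonant cluster
    i = 0
    while i < len(glyphs):
        g = glyphs[i]
        i += 1
        if g in PRE_BASE_SIGNS:
            if pending is not None:
                out.append(pending)  # stray sign, emit it unchanged
            pending = g
            continue
        out.append(g)
        # Consume "virama + consonant" pairs so that the sign lands after
        # the whole conjunct rather than after its first consonant.
        while i + 1 < len(glyphs) and glyphs[i] == VIRAMA:
            out.extend(glyphs[i:i + 2])
            i += 2
        if pending is not None:
            out.append(pending)
            pending = None
    if pending is not None:
        out.append(pending)  # sign at end of input with no consonant after it
    return "".join(out)
```

Here visual_to_logical(["\u0D46", "\u0D15"]) would turn the visual sequence
"െ" + "ക" into the stored form "കെ". Two-part vowels and other reordering
rules are left out of this sketch.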

MNS Rao

On Apr 2, 8:43 pm, Ray Smith <[email protected]> wrote:
> The biggest problem with unconnected Indic scripts seems to be the aspect
> ratio and the amount of horizontal detail. Hindi seems to work quite well as
> it doesn't seem to have very big ligatures. The best fix for the unconnected
> scripts may be to break them into sub-akshara glyphs and recognize those
> separately.
>
> Ray.
> Sent from my Nexus1 Android phone.
> On Mar 29, 2011 11:18 PM, "Debayan Banerjee" <[email protected]> wrote:
>
> > Hi,
>
> > I gather that Tesseract 3.0 works well for Chinese script now. The
> > hallmark of Chinese script is that it is unconnected (unlike say Hindi
> > which has a line connecting all its characters), and it has a large
> > number of characters in the alphabet. In this light, I think it should
> > also work well with unconnected Indic script such as Kannada,
> > Malayalam, Punjabi etc.
>
> > Anyone know if this works?
>
> > --
> > Debayan Banerjee
> > http://hacking-tesseract.blogspot.com/
>
> > --
> > You received this message because you are subscribed to the Google Groups
> > "tesseract-ocr" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to
> > [email protected].
> > For more options, visit this group at
> > http://groups.google.com/group/tesseract-ocr?hl=en.

