Please refer to the threads:
https://groups.google.com/d/topic/tesseract-ocr/rcsvxsxdjNY/discussion
https://groups.google.com/d/topic/tesseract-ocr/gh-bficm_2w/discussion

In brief, you'll need to write your own, fairly intelligent
segmentation for formulas; once you have that, you can use Tesseract
as a "glyph recognizer" on the individual symbols it produces.
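
To give a rough idea of what "segment first, then recognize glyph by
glyph" could look like, here is a minimal sketch in Python. It is not
taken from the threads above. The function names (glyph_boxes,
tesseract_glyph_command) are made up for illustration, the bitmap is
assumed to be already binarized (truthy = ink), and plain connected
components are only a first step: they will split multi-part symbols
such as "i" or "=" and know nothing about fraction bars or
sub/superscripts, which is where the "intelligent" part comes in. The
command builder assumes page segmentation mode 10 (single character),
spelled "-psm 10" in the 3.x command line; newer releases use "--psm".

```python
from collections import deque


def glyph_boxes(bitmap):
    """Return bounding boxes (left, top, right, bottom) of 8-connected ink blobs.

    `bitmap` is a list of equal-length rows; truthy cells are ink.
    Coordinates are inclusive pixel indices.
    """
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not bitmap[y][x] or seen[y][x]:
                continue
            # Flood-fill one blob and track its extent as we go.
            left, top, right, bottom = x, y, x, y
            queue = deque([(x, y)])
            seen[y][x] = True
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and bitmap[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    # Sort by left edge so the glyphs come out in rough reading order.
    return sorted(boxes)


def tesseract_glyph_command(image_path, output_base):
    """Build (but do not run) a command line for one cropped glyph image.

    Assumption: single-character mode is "-psm 10" on this Tesseract version.
    """
    return ["tesseract", image_path, output_base, "-psm", "10"]
```

Each box would then be cropped out of the page image, saved to its own
file, and passed to the command above (for example via
subprocess.run). Reassembling the per-glyph results into a formula is
up to your own layout logic.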

Warm regards,
Dmitri Silaev
www.CustomOCR.com

On Wed, Jun 22, 2011 at 2:40 PM, Gökhan Sever <[email protected]> wrote:
> Hello,
>
> I get this failure when I try to recognize a page which contains both
> regular English text and formulas (e.g. Greek letters, divisions,
> subscripts and superscripts, etc.)
>
>
> [gsever@ccn ~]$ tesseract scanpage1.tif outputtext
> Tesseract Open Source OCR Engine v3.01 with Leptonica
> tesseract: intmatcher.cpp:1165: int
> IntegerMatcher::FindBestMatch(INT_CLASS_STRUCT*, const
> ScratchEvidence&, uinT16, uinT8, INT_RESULT_STRUCT*): Assertion
> `ClassTemplate->NumConfigs > 0' failed.
> Aborted (core dumped)
>
> Is there a trained dataset for covering cases like mine?
>
> Thanks.
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>
