The problem with black boxes is that we sometimes expect too much of them.
Tesseract is on the order of 10k LOC, a large amount of it devoted to
detecting and recognizing characters and text, page segmentation,
alignment, and transforms. If you dove into the code a bit more, you'd be
more awed than sad :)

Sermon given, I would write the parsing in Python, which has some pretty
nifty tools for working with text. Without knowing more (example
pictures, goals, and ideas), it is hard to comment or advise further :)

A little knowledge is dangerous.... :)

Cheers,
M

On Tue, Apr 24, 2012 at 8:13 AM, Pleiades <[email protected]> wrote:

> Thank you for your answer.
> It's a pity... :-( It worked well with v3.0...
> Could you help me by telling me in which file or function I should
> write my own parser?
> Thanks
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>



-- 

URL:
www.cse.msu.edu/~mudigon1
www.blindsight.com/team
Elegance is not a dispensable luxury but a factor that decides between
success and failure.
Edsger Dijkstra

