The problem with black boxes is sometimes that we expect too much of them. Tesseract is on the order of 10k LOC, a large share of it devoted to detecting and recognizing characters and text, page segmentation, alignment, and transforms. If you dove into the code a bit more, you'd be more awed than sad :)
Sermon given, I would write the parsing in Python, which has some pretty nifty tools for working with text. Without knowing more (example pictures, goals, and ideas), it is hard to comment or advise further :) A little knowledge is dangerous... :)

Cheers,
M

On Tue, Apr 24, 2012 at 8:13 AM, Pleiades <[email protected]> wrote:
> Thank you for your answer.
> It's a pity... :-( It worked well with v3.0...
> Could you help me telling in which file or function I have to write my
> own parser.
> Thanks
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en

--
URL: www.cse.msu.edu/~mudigon1 www.blindsight.com/team
Elegance is not a dispensable luxury but a factor that decides between success and failure.
Edsger Dijkstra

