Hmm. What page_seg_mode are you using?

On Wed, Apr 25, 2012 at 3:17 AM, Pleiades <[email protected]> wrote:
> Just to be sure I was clear about what I have and what I want.
>
> In version 3.0 my output file looked the way I want it to look in version
> 3.01 (each \t corresponds to one cell of my Excel sheet):
>
> Titre CLA ORI MET NUM D NUM F X (pm) Y (pm) SE CLR FOR RES
> (Excel picture: 12 columns for row 1)
> 1 # Title Part1: The title is here
> (Excel picture: 12 columns for row 2, where the last 11 columns are empty)
>
> With version 3.01 I get this output file:
>
> Titre
> CLA
> MET
> NUM D
> NUM F
> X (pm)
> Y (pm)
> CLR
> FOR
> RES
> 1 # Title Part1: The title is here
>
> It seems strange that the default output format differs between
> versions 3.0 and 3.01...
> Do you have any idea simpler than writing my own parser to do that?
>
> Thank you.
>
> On 25 Apr, 05:41, Mayur Mudigonda <[email protected]> wrote:
> > The problem with black boxes sometimes is that we expect too much out of
> > them. Tesseract is in the order of 10k LOC, a large amount of it to detect
> > and recognize characters/text/page segmentation/alignment/transforms. If
> > you dove into the code a bit more, you'd be more awed than sad :)
> >
> > Sermon given, I would write the parsing in Python, which has some pretty
> > nifty tools to play with text. Without knowing more (in terms of example
> > pics, goals and ideas) it is hard to comment/advise further :)
> >
> > A little knowledge is dangerous.... :)
> >
> > Cheers,
> > M
> >
> > On Tue, Apr 24, 2012 at 8:13 AM, Pleiades <[email protected]> wrote:
> >
> > > Thank you for your answer.
> > > It's a pity... :-( It worked well with v3.0...
> > > Could you help me by telling me in which file or function I have to
> > > write my own parser?
> > > Thanks
> >
> > > --
> > > You received this message because you are subscribed to the Google
> > > Groups "tesseract-ocr" group.
> > > To post to this group, send email to [email protected]
> > > To unsubscribe from this group, send email to
> > > [email protected]
> > > For more options, visit this group at
> > > http://groups.google.com/group/tesseract-ocr?hl=en
> >
> > --
> > URL: www.cse.msu.edu/~mudigon1 www.blindsight.com/team
> > Elegance is not a dispensable luxury but a factor that decides between
> > success and failure.
> > Edsger Dijkstra

--
URL: www.cse.msu.edu/~mudigon1 www.blindsight.com/team
Elegance is not a dispensable luxury but a factor that decides between
success and failure.
Edsger Dijkstra
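
For what it's worth, the "write the parsing in Python" suggestion quoted above could look roughly like the sketch below. It is only a sketch under assumptions: it assumes the column headers are known in advance (here copied from the example in this thread) and that any line which is not a known header is a full-width row whose remaining cells are empty. The names HEADERS and regroup are made up for illustration and are not part of Tesseract.

```python
# Hypothetical column headers, copied from the example in this thread.
HEADERS = ["Titre", "CLA", "ORI", "MET", "NUM D", "NUM F",
           "X (pm)", "Y (pm)", "SE", "CLR", "FOR", "RES"]


def regroup(lines, headers=HEADERS):
    """Turn one-cell-per-line output back into tab-separated rows.

    Consecutive lines that match a known header are merged into a single
    row, each header placed in its own column (missing ones stay empty).
    Any other non-blank line becomes a row padded with empty cells.
    """
    known = set(headers)
    rows = []
    found = set()  # headers seen in the current run of header lines

    def flush():
        # Emit the pending header row, if any, keeping column positions.
        if found:
            rows.append("\t".join(h if h in found else "" for h in headers))
            found.clear()

    for raw in lines:
        cell = raw.strip()
        if not cell:
            continue
        if cell in known:
            found.add(cell)
        else:
            flush()
            rows.append("\t".join([cell] + [""] * (len(headers) - 1)))
    flush()
    return rows


if __name__ == "__main__":
    sample = ["Titre", "CLA", "MET", "NUM D", "NUM F", "X (pm)",
              "Y (pm)", "CLR", "FOR", "RES",
              "1 # Title Part1: The title is here"]
    for row in regroup(sample):
        print(row)
```

Placing each header by position rather than simply joining the lines keeps the columns aligned even when some headers (ORI and SE in the 3.01 output above) are not recognized at all. Whether a different page segmentation mode makes this post-processing unnecessary is a separate question.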

