Just to make sure I was clear about what I have and what I want: in version 3.0, the output file had exactly the format I am now trying to get from version 3.01 (each \t corresponds to one cell of my Excel sheet):
Titre CLA ORI MET NUM D NUM F X (pm) Y (pm) SE CLR FOR RES
(Excel picture: 12 columns for row 1)
1 # Title Part1: The title is here
(Excel picture: 12 columns for row 2, where the last 11 columns are all empty)

With version 3.01, the output file is:

Titre CLA MET NUM D NUM F X (pm) Y (pm) CLR FOR RES
1 # Title Part1: The title is here

It seems very strange that the default output format differs between versions 3.0 and 3.01...

Do you have any idea for doing this that is easier than writing my own parser?

Thank you.

On Apr 25, 05:41, Mayur Mudigonda <[email protected]> wrote:
> The problem with blackboxes sometimes is that we expect too much out of it.
> Tesseract is in the order of 10k LOC. A large amount of it to detect and
> recognize characters/text/page segmentation/alignment/transforms. If you
> dove into the code a bit more, you'd be more awed than sad :)
>
> Sermon given, I would write the parsing in Python which has some pretty
> nifty tools to play with text. Without knowing more (in terms of example
> pics, goals and ideas) it is hard to comment/advise further :)
>
> A little knowledge is dangerous.... :)
>
> Cheers,
> M
>
> On Tue, Apr 24, 2012 at 8:13 AM, Pleiades
> <[email protected]> wrote:
>
> > Thank you for your answer.
> > It's a pity... :-( It worked well with v3.0...
> > Could you help me telling in which file or function I have to write my
> > own parser.
> > Thanks
>
> > --
> > You received this message because you are subscribed to the Google
> > Groups "tesseract-ocr" group.
> > To post to this group, send email to [email protected]
> > To unsubscribe from this group, send email to
> > [email protected]
> > For more options, visit this group at
> > http://groups.google.com/group/tesseract-ocr?hl=en
>
> --
>
> URL: www.cse.msu.edu/~mudigon1 www.blindsight.com/team
> Elegance is not a dispensable luxury but a factor that decides between
> success and failure.
> Edsger Dijkstra

