Joe, I got past my problem, though I don't remember exactly how. I think I updated to the latest SVN version, and the problem went away.
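If updating doesn't help, one possible workaround is to rejoin the columns after OCR rather than fight the layout analysis. Below is a minimal sketch, not something I ran against the receipt in this thread. It assumes you can already get per-word bounding boxes (for example by parsing hOCR output; that step is not shown). The `Word` type, the `group_into_rows` function and the `tolerance` parameter are hypothetical names made up for illustration, not part of Tesseract.

```python
from typing import List, NamedTuple


class Word(NamedTuple):
    """One recognised word and its bounding box, in pixels (hypothetical type)."""
    text: str
    left: int
    top: int
    height: int


def centre_y(word: Word) -> float:
    """Vertical centre of the word's bounding box."""
    return word.top + word.height / 2


def group_into_rows(words: List[Word], tolerance: float = 0.5) -> List[str]:
    """Merge words into text rows by vertical position, ignoring columns.

    A word joins the current row when its vertical centre is within
    `tolerance` * its own height of the row's average centre.
    """
    rows: List[List[Word]] = []
    for word in sorted(words, key=centre_y):
        if rows:
            current = rows[-1]
            row_centre = sum(centre_y(w) for w in current) / len(current)
            if abs(centre_y(word) - row_centre) <= tolerance * word.height:
                current.append(word)
                continue
        rows.append([word])
    # Within each row, order the words left to right.
    return [
        " ".join(w.text for w in sorted(row, key=lambda w: w.left))
        for row in rows
    ]


# Made-up boxes: descriptions on the left, prices on the right.
receipt = [
    Word("Coffee", 10, 100, 20),
    Word("Bagel", 10, 130, 20),
    Word("2.50", 300, 102, 20),
    Word("1.75", 300, 131, 20),
]
print(group_into_rows(receipt))  # ['Coffee 2.50', 'Bagel 1.75']
```

The tolerance is a guess and would need tuning for skewed scans or tightly spaced lines.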
On Sunday, 27 May 2012, Joe Aspara <[email protected]> wrote:
> I have the same problem reported by Brock. Does anyone have a solution to
> force tesseract to read one line at a time, ignoring the multi-column
> layout? (I guess this was the standard behavior in the 1.xx and 2.xx
> versions.)
>
> On Saturday, 24 September 2011 02:04:23 UTC+2, Brock wrote:
>>
>> Hi,
>>
>> I want to OCR a receipt scan, which has a left-aligned column of text,
>> and a right-aligned column of prices.
>>
>> Tesseract (most recent from SVN, with a commented-out dependency to let
>> it compile) is parsing it into columns. I end up with the
>> descriptions, and then below them, the prices. This makes joining the
>> data back together difficult or impossible.
>>
>> I tried all the pagesegmodes (via config file), which produced different
>> output, but they were either garbage, or still had the columns parsed
>> separately.
>>
>> Has anyone had and solved this problem? Any tips?
>>
>> Thanks, Brock

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en

